“Data science is doable,” a fellow attendee of EPIC’s 2018 conference in Honolulu would exclaim like a mantra. The conference was for business ethnographers and UX researchers interested in understanding and integrating data science and machine learning into their research. She was addressing a tendency she had noticed, and which I have seen as well: qualitative researchers and other so-called “non-math people” frequently believe that data science is far too technical for them. This belief seems ultimately rooted in cultural myths about math and math-related fields like computer science, engineering, and now data science. In a similar vein to her statement, my goal in this essay is to discuss these attitudes and show that data science, like math, is relatable and doable if you treat it as such.
The “Math Person”
In the United States, many people hold an implicit image of a “math person”: someone supposedly naturally gifted at mathematics. Many who do not see themselves as fitting that image conclude that math simply isn’t for them. The idea that some people are inherently able or unable to do math is false, however, and it prevents people from trying to become good at the discipline, even if they might enjoy or excel at it.
Most skills in life, including mathematical skills, are like muscles: you do not innately possess or lack that skill, but rather your skill develops as you practice and refine that activity. Anybody can develop a skill if they practice it enough.
Scholars in anthropology, sociology, psychology, and education have documented how math is implicitly and explicitly portrayed as something some people can do and others cannot, especially in grade-school math classes. Starting in early childhood, we learn the idea that some people are naturally gifted at math while for others math is simply not their thing. Some internalize that they are gifted at math and thus take the time to practice enough to develop and refine their mathematical skills, while others internalize that they cannot do math, and their mathematical abilities stagnate. But the premise that mathematical ability is innate is simply not true.
Anyone can learn and do math if he or she practices math and cultivates mathematical thinking. If you do not cultivate your math muscle, it will become underdeveloped and, yes, math then becomes harder to do. Thus, in a cruel irony, internalizing that one cannot do math can turn into a self-fulfilling prophecy: the person gives up on developing mathematical skills, which leads to their further underdevelopment.
Similarly, we cultivate another myth: that people skilled in mathematics (or math-related fields like computer science, engineering, and data science) generally do not possess strong social and interpersonal communication skills. The root of this stereotype lies more in how we think of mathematical and logical thinking than in the actual characteristics of mathematicians, computer scientists, or engineers. Social scientists who have studied the social skills of mathematicians, computer scientists, and engineers have found no discernible difference between their social and interpersonal communication skills and those of everyone else.
Quantitative and Qualitative Specialties
The belief that some people are just inherently good at math and that such people do not possess strong social and interpersonal communication skills contributes to the division between quantitative and qualitative social research, in both academic and professional contexts. These attitudes help cultivate the false idea that quantitative research and qualitative research are distinct skill sets for different types of people: that quantitative research can only be done by “math people” and qualitative research by “people people.” They become separate specialties, even though social research by its very nature involves both. Such a split unnecessarily stifles authentic and holistic understanding of people and society.
In professional and business research contexts, qualitative and quantitative researchers should work with each other and, through that process, gradually learn each other’s skills. If done well, this would incentivize researchers to cultivate both mathematical/quantitative and interpersonal/qualitative research skills.
It would reward professional researchers who develop both skillsets and leverage them in their research, instead of encouraging researchers to specialize in one or the other. It could also encourage universities to require in-depth training in both as preparation for students’ future work, instead of requiring that students choose among disciplines that promote one track over the other.
Working together is only the first step, however, whose success hinges on whether it ultimately leads to the integration of these supposedly separate skillsets. Frequently, when qualitative and quantitative research teams work together, they work mostly independently – qualitative researchers on the qualitative aspect of the project and quantitative researchers on the quantitative aspects of the project – thus reinforcing the supposed distinction between them. Instead, such collaboration should involve qualitative researchers developing quantitative research skills by practicing such methods and quantitative researchers similarly developing qualitative skills.
Conclusion
Anyone can develop mathematics and data science skills if they practice. The same goes for the interpersonal skills necessary for ethnographic and other qualitative research. Depicting them as separate specialties, even if the specialists come together to do their respective parts of a single research project, stifles their integration as a single set of tools for an individual and reinforces the myths we have been teaching ourselves: that data science is for math, programming, or engineering people and that ethnography is for “people people.” This separation stifles holistic and authentic social research, which inevitably involves both qualitative and quantitative approaches.
What is ethnography, and how has it been used in the professional world? This article is a quick and dirty crash course for someone who has never heard of (or knows little about) ethnography.
Anthropology at its most basic is the study of human cultures and societies. Cultural anthropologists generally seek to understand current cultures and societies by conducting ethnography.
In short, ethnography involves seeking to understand the lived experiences of a particular culture, setting, group, or other context by some combination of being with those in that context (called participant-observation), interviewing or talking with them, and analyzing what happens and what is produced in that context.
It is an umbrella term for a set of methods (including participant-observation, interviews, group interviews or focus groups, digital recording, etc.) employed with that goal, and most ethnographic projects use some subset of these methods given the needs of the specific project. In this sense, it is similar to other umbrella methodologies – like statistics – in that it encapsulates a wide array of different techniques depending on the context.
One conducts ethnographic research to understand something about the lived experiences of a context. In the professional world, for example, ethnography is frequently useful in the following contexts:
Market Research: When trying to understand customers and/or users in-depth
Product Design: When trying to design or modify a product by seeing how people use it in action
Organizational Communication and Development: When trying to understand a “people problem” within an organization.
In this article, I expound in more detail on situations where ethnographic research is useful in professional settings.
Ethnographies are best understood through examples, so the table below includes excellent example ethnographies and ethnographic researchers in various industries and fields:
These, of course, are not the only situations where ethnography might be helpful. Ethnography is a powerful tool to develop a deep understanding of others’ experiences and to develop innovative and strategic insights.
I am pleased to announce that the Annals of Anthropological Practice has accepted my article “Anthropology by Data Science” (https://anthrosource.onlinelibrary.wiley.com/doi/10.1111/napa.12169). In it, I reflect on the relationship anthropologists have cultivated with data science as a discipline and the importance of integrating machine learning techniques into ethnographic practice.
This is a quick and dirty summary of my master’s practicum research project with Indicia Consulting over the summer of 2018. For anyone interested in more detail, here is a more detailed report, and here is the final report with Indicia.
Background
My practicum was the sixth stage of a multi-year research project. The California Energy Commission commissioned this larger project to understand the potential relationship between individual energy consumption and technology usage. In stages one through five, we isolated certain clusters of behavior and attitudes around new technology adoption, which Indicia called cybersensitivity, and demonstrated that cybersensitivity tended to be associated with a willingness to adopt energy-saving technology like smart meters.
This led to a key question: How can one identify cybersensitivity among a broader population such as a community, county, or state? Answering this question was the main goal of my practicum project.
In the past stages of the research project, the team used ethnographic research to establish criteria for whether someone was a cybersensitive based on several hours of interviews and observations about their technology usage. These interviews and observations certainly helped the research team analyze behavioral and attitudinal patterns, determine what patterns were significant, and develop those into the concept of cybersensitivity, but they are too time- and resource-intensive to perform with an entire population. One generally does not have the ability to interview everyone in a community, county, or state. I sought to address this directly in my project.
| Task | Timeline | Task Name | Research Technique | Description |
| --- | --- | --- | --- | --- |
| Task 1 | June 2015 – Sept 2018 | General Project Tasks | Administrative (N/A) | Developed project scope and timeline, adjusting as the project unfolds |
| Task 2 | July 2015 – July 2016 | Documenting and analyzing emerging attitudes, emotions, experiences, habits, and practices around technology adoption | Survey | Conducted survey research to observe patterns of attitudes and behaviors among cybersensitives/awares. |
| Task 3 | Sept 2016 – Dec 2016 | Identifying the attributes and characteristics and psychological drivers of cybersensitives | Interviews and Participant-Observation | Conducted in-depth interviews and observations coding for psych factor, energy consumption attitudes and behaviors, and technological device purchasing/usage. |
| Task 4* | Sept 2016 – July 2017 | Assessing cybersensitives’ valence with technology | Statistical Analysis | Tested for statistically significant differences in demographics, behaviors, and beliefs/attitudes between cyber status groups |
| Task 5 | Aug 2017 – Dec 2018 | Developing critical insights for supporting residential engagement in energy efficient behaviors | Statistical Analysis | Analyzed utility data patterns of study participants, comparing it with the general population. |
| Task 6 | March 2018 – Aug 2018 | Recommending an alternative energy efficiency potential model | Decision Tree Modeling | Constructed decision tree models to classify an individual’s cyber status |
Project Goal
The overall goal for the project was to produce a scalable method to assess whether someone exhibits cybersensitivity based on data measurable across an entire population. In doing this, the project also helped address the following research needs:
Created a method that can scale to a larger population, making it possible to assess whether cybersensitives are more willing to adopt energy-saving technologies across a community, county, or state
Provided the infrastructure to determine how much promoting energy-saving campaigns targeting cybersensitives specifically would reduce energy consumption in California
Helped the California Energy Commission determine the best means to reach cybersensitives for specific energy-saving campaigns
The Project
I used machine learning modeling to create a decision-making flow that isolates cybersensitives in a population. Random forests and decision trees produced the best models for Indicia’s needs: random forests for accuracy and robustness, and decision trees for human decipherability. Through them, I created a programmable yet human-comprehensible framework to determine whether an individual is cybersensitive based on behaviors and other characteristics that an organization could easily assess across a whole population. Because the model was both easy for humans to read and encodable computationally, any energy organization could understand, replicate, and further develop it, and could both use and refine it for its own purposes.
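To give a rough sense of what “human-readable yet programmable” means for a decision tree, here is a minimal Python sketch. It is not the model, the features, or the data from the Indicia project: the feature names (`smart_devices_owned`, `weekly_energy_app_checks`), the ten records, and their labels are all invented for illustration, and the small tree builder is a simplified stand-in for what a machine learning library would do.

```python
from collections import Counter

# Hypothetical feature names and made-up records, for illustration only.
# None of this comes from the actual study.
FEATURES = ["smart_devices_owned", "weekly_energy_app_checks"]
DATA = [
    ((1, 0), "not_cybersensitive"),
    ((2, 1), "not_cybersensitive"),
    ((2, 3), "not_cybersensitive"),
    ((3, 0), "not_cybersensitive"),
    ((4, 1), "not_cybersensitive"),
    ((4, 5), "cybersensitive"),
    ((5, 0), "not_cybersensitive"),
    ((5, 4), "cybersensitive"),
    ((6, 6), "cybersensitive"),
    ((7, 3), "cybersensitive"),
]


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows):
    """Return the (feature_index, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini([label for _, label in rows])
    for f in range(len(FEATURES)):
        for t in sorted({x[f] for x, _ in rows}):
            left = [label for x, label in rows if x[f] <= t]
            right = [label for x, label in rows if x[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build(rows, depth=0, max_depth=2):
    """Grow a small tree; a leaf is simply the majority label of its rows."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return Counter(label for _, label in rows).most_common(1)[0][0]
    f, t = split
    return {
        "feature": f,
        "threshold": t,
        "left": build([r for r in rows if r[0][f] <= t], depth + 1, max_depth),
        "right": build([r for r in rows if r[0][f] > t], depth + 1, max_depth),
    }


def classify(tree, x):
    """Follow the tree's questions until reaching a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


def describe(tree, indent=""):
    """Render the tree as nested if/else rules a person can read."""
    if not isinstance(tree, dict):
        return [f"{indent}-> {tree}"]
    name, t = FEATURES[tree["feature"]], tree["threshold"]
    return (
        [f"{indent}if {name} <= {t}:"]
        + describe(tree["left"], indent + "  ")
        + [f"{indent}else:"]
        + describe(tree["right"], indent + "  ")
    )


tree = build(DATA)
print("\n".join(describe(tree)))
```

The same `tree` object serves both audiences: `describe` prints it as plain if/else rules that a program manager could read, while `classify` applies those rules to any new record. A random forest combines many such trees, which tends to make predictions more robust but harder to read as a single set of rules.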
Conclusion
This is a quick overview of my master’s practicum project. For more details on what modeling I did, how I did it, what results it produced, and how it fit within the wider needs of the multi-year research project, please see my full report.
I really appreciated the opportunity it offered to get my hands dirty integrating ethnography and data science to help address a real-world problem. This summary only scratches the surface of what Indicia did with the California Energy Commission to encourage sustainable energy usage across society. Hopefully, though, it will inspire you to integrate ethnography and data science to address whatever complex questions you face. It certainly inspired me.
Thank you to Susan Mazur-Stommen and Haley Gilbert for your help in organizing and completing the project. I would like to thank my professorial committee at the University of Memphis – Dr. Keri Brondo, Dr. Ted Maclin, Dr. Deepak Venugopal, and Dr. Katherine Hicks – for their academic support as well.
In the spring of 2018, I researched how anthropologists and related social scholars have analyzed data science and machine learning for my Master’s in Anthropology at the University of Memphis. For the project, I assessed the anthropological literature on data science and machine learning to date and explored potential connections between anthropology and data science, based on my perspective as a data scientist and anthropologist. Here is my final report.
Thank you, Dr. Ted Maclin, for your help overseeing and assisting this project.
This is my practicum report with Indicia Consulting. In lieu of a master’s thesis, the University of Memphis Department of Anthropology required that we master’s students conduct a practicum project. For this, we had to partner with an organization and complete a 300+ hour anthropological research project based on the organization’s needs and our skills and interests. My practicum project was Indicia’s EPIC Project with the California Energy Commission (see this link and this link for more details on the EPIC Project). In this report, I outline potential ways to integrate ethnographic/anthropological and data science research in professional settings.
In November 2019, the American Anthropological Association’s Committee for the Anthropology of Science, Technology, and Computing (CASTAC) awarded me the David Hakken Graduate Student Prize for innovative science and technology scholarship.
The Anthropology Department also required that we publicly present our practicum research to the University of Memphis campus. This PowerPoint summarizes my practicum project. If you are not keen to read the 99-page full report, this is a much shorter alternative:
The following is a presentation I gave at the Society for Applied Anthropology’s 2018 annual conference in Philadelphia, PA. In it, I describe how I think anthropologists should understand, analyze, and relate to machine learning and data science.
Below is a talk I gave at the 2019 Memphis Data conference, organized by the University of Memphis to discuss data science research in the Memphian community. In this presentation, I summarize a project I did with Indicia Consulting that integrated data science and ethnography.