I recently organized a professional group called EPIC Data Scientists + Ethnographers along with a few others who are both data scientists and ethnographers. Our goal is to form a virtual community to discuss ways to integrate ethnography and data science, just like I strive to do on this website.
If you are interested in working with others on this or simply interested in learning more, feel free to join. Whether you are both a data scientist and ethnographer, only one of them, or neither, we would love to hear your perspective.
Thank you, EPIC, for helping to develop this and giving us a platform.
In a previous article, I discussed the value of integrating data science and ethnography. On LinkedIn, people commented that they were interested and wanted to hear more detail on potential ways to do this. I replied, “I have found explaining how to conduct studies that integrate the two practically is easier to demonstrate through example than abstractly since the details of how to do it vary based on the specific needs of each project.”
In this article, I intend to do exactly that: analyze four innovative projects that in some way integrated data science and ethnography. I hope these will get your creative juices flowing and help you think through how to combine the two in whatever project you are working on.
Synopsis:
| Project | How It Integrated Data Science and Ethnography | Link to Learn More |
| --- | --- | --- |
| No Show Model | Used ethnography to design machine learning software | |
| Facebook Newsfeed Folk Theories | Used ethnography to understand how users make sense of and behave towards a machine learning system they encounter and how this, in turn, shapes the development of the machine learning algorithm(s) | |
Project 1: No Show Model
A medical clinic at a hospital system in New York City asked me to use machine learning to build a show rate predictor in order to inform and improve its scheduling practices. During the initial construction phase, I used ethnography both to understand the clinic’s scheduling problem in more depth and to determine an appropriate interface design.
Through an ethnographic inquiry, I discovered the most important question schedulers ask when scheduling appointments: “Of the people scheduled for a given doctor on a particular day, how many of them are likely to actually show up?” I then built a machine learning model to answer this exact question. My ethnographic inquiry provided the design requirements for the data science project.
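To make the schedulers’ question concrete, here is a minimal sketch of how per-appointment show probabilities could be rolled up into an answer for one doctor’s day. It is not the clinic’s actual model: the probabilities in `day_probs` are hypothetical stand-ins for whatever a trained model would predict, the function names are invented for illustration, and treating appointments as independent is a simplifying assumption.

```python
from typing import List


def expected_shows(show_probs: List[float]) -> float:
    """Expected number of patients who show up: the sum of the probabilities."""
    return sum(show_probs)


def show_count_distribution(show_probs: List[float]) -> List[float]:
    """Return P(exactly k patients show) for k = 0..n.

    Assumes each appointment is an independent event with its own
    probability, and builds the distribution one appointment at a time.
    """
    dist = [1.0]  # with zero appointments, zero shows is certain
    for p in show_probs:
        new_dist = [0.0] * (len(dist) + 1)
        for k, prob_k in enumerate(dist):
            new_dist[k] += prob_k * (1.0 - p)  # this patient does not show
            new_dist[k + 1] += prob_k * p      # this patient shows
        dist = new_dist
    return dist


# Hypothetical predicted show probabilities for one doctor on one day.
day_probs = [0.9, 0.8, 0.6, 0.5, 0.3]

dist = show_count_distribution(day_probs)
print(f"Expected shows: {expected_shows(day_probs):.1f} of {len(day_probs)}")
print(f"Chance that at least 4 show: {sum(dist[4:]):.2f}")
```

Framing the output as a count per doctor per day, rather than a yes/no flag per patient, mirrors the way the schedulers phrased their own question.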
In addition, I used my ethnographic inquiries to design the interface. I observed how schedulers interacted with their current scheduling software, which gave me a sense for what kind of visualizations would work or not work for my app.
This project exemplifies how ethnography can be helpful both in the development stage of a machine learning project, to determine what the machine learning algorithm(s) need to do, and on the frontend, when communicating the algorithm(s) to users and assessing their success with those users.
As both an ethnographer and a data scientist, I was able to translate my ethnographic insights seamlessly into machine learning modeling and API specifications and to conduct follow-up ethnographic inquiries to ensure that what I was building would meet their needs.
Project 2: Cybersensitivity Study
I conducted this project with Indicia Consulting. Its goal was to explore potential connections between individuals’ energy consumption and their relationship with new technology. This is an example of using ethnography to explore and determine potential social and cultural patterns in-depth with a few people and then using data science to analyze those patterns across a large population.
We started the project by observing and interviewing about thirty participants, but as the study progressed, we needed to develop a scalable method to analyze the patterns across whole communities, counties, and even states.
Ethnography is a great tool for exploring a phenomenon in-depth and for developing initial patterns, but it is resource-intensive and thus difficult to conduct on a large group of people. It is not practical for, say, analyzing thousands of people. Data science, on the other hand, can easily test whether patterns noticed in smaller ethnographic studies hold across an entire population, yet because it often lacks the granularity of ethnography, it would often miss intricate patterns.
Ethnography is also great on the back end for determining whether the implemented machine learning models and their resulting insights make sense on the ground. This forms a type of iterative feedback loop, where data science scales up ethnographic insights and ethnography contextualizes data science models.
Thus, ethnography and data science cover each other’s weaknesses well, forming a great methodological duo for projects centered around trying to understand customers, users, colleagues, or other people in-depth.
Project 3: Facebook Newsfeed Folk Theories
In their study, Motahhare Eslami and her team of researchers conducted an ethnographic inquiry into how various Facebook users conceived of how the Facebook Newsfeed selects which posts/stories rise to the top of their feeds. They analyzed several different “folk theories,” or working theories held by everyday people, about the criteria this machine learning system uses to select top stories.
How users think the overall system works influences how they respond to the newsfeed. Users who believe, for example, that the algorithm will prioritize the posts of friends whose posts they have liked in the past will often intentionally like the posts of their closest friends and family so that they can see more of their posts.
Users’ perspectives on how the Newsfeed algorithm works influence how they respond to it, which, in turn, affects the very data the algorithm learns from and thus how the algorithm develops. This creates a cyclic feedback loop that influences the development of the machine learning algorithmic systems over time.
Their research exemplifies the importance of understanding how people think about, respond to, and more broadly relate with machine learning-based software systems. Ethnographies of people’s interactions with such systems are a crucial way to develop this understanding.
In a way, many machine learning algorithms are very social in nature: they – or at least the overall software system in which they exist – often succeed or fail based on how humans interact with them. In such cases, no matter how technically robust a machine learning algorithm is, if potential users cannot positively and productively relate to it, then it will fail.
Ethnographies of the “social life” of machine learning software systems (by which I mean how they become a part of, or in some cases fail to become a part of, individuals’ lives) help us understand how the algorithms are developing or learning and determine whether they are successful in what we intended them to do. Such ethnographies require not only in-depth expertise in ethnographic methodology but also an in-depth understanding of how machine learning algorithms work in order to understand how social behavior might be influencing their internal development.
Project 4: Thing Ethnography
Elisa Giaccardi and her research team have been pioneering the use of data science and machine learning to understand and incorporate the perspective of things into ethnographies. With the development of the Internet of Things (IoT), she suggests that the data from object sensors could provide fresh insights in ethnographies of how humans relate to their environment by helping to describe how these objects relate to each other. She calls this thing ethnography.
This experimental approach exemplifies one way to use machine learning algorithms within ethnographies as social processes/interactions in and of themselves. This could be an innovative way to analyze the social role of these IoT objects in daily life within ethnographic studies. If Eslami’s work exemplifies a way to graft ethnographic analysis into the design cycle of machine learning algorithms, Giaccardi’s research illustrates one way to incorporate data science and machine learning analysis into ethnographies.
Conclusion
Here are four examples of innovative projects that integrated data science and ethnography to meet their respective goals. I do not intend these to be a complete or exhaustive account of how to integrate these methodologies but rather food for thought to spur further creative thinking about how to connect them.
For those who, when they hear the idea of integrating data science and ethnography, ask the reasonable question, “Interesting but what would that look like practically?”, here are four examples of how it could look. Hopefully, they are helpful in developing your own ideas for how to combine them in whatever project you are working on, even if its details are completely different.
A friend and fellow professor, Dr. Eve Pinkser, asked me to give a guest lecture on quantitative text analysis techniques within data science for her Public Health Policy Research Methods class at the University of Illinois at Chicago on April 13th, 2020. Multiple people have asked me similar questions about how to use data science to analyze texts quantitatively, so I figured I would post my presentation for anyone interested in learning more.
It provides a basic introduction to the different approaches so that you can determine which to explore in more detail. I have found that many people who are new to data science feel paralyzed when trying to navigate the vast array of data science techniques out there and are unsure where to start.
Many of her students needed to conduct quantitative textual analysis as part of their doctoral work but struggled in determining what type of quantitative research to employ. She asked me to come in and explain the various data science and machine learning-based textual analysis techniques, since this was out of her area of expertise. The goal of the presentation was to help the PhD students in the class think through the types of data science quantitative text analysis techniques that would be helpful for their doctoral research projects.
Hopefully, it will likewise allow you to determine the type or types of text analysis you might need so that you can then look those up in more detail. Textual analysis, as well as the wider field of natural language processing of which it is a part, is a quickly up-and-coming subfield within data science doing important and groundbreaking work.
This is a follow-up to my previous article, “What Is Ethnography,” outlining ways ethnography is useful in professional settings.
To recap, I defined ethnography as a research approach that seeks “to understand the lived experiences of a particular culture, setting, group, or other context by some combination of being with those in that context (also called participant-observation), interviewing or talking with them, and analyzing what is produced in that context.”
Ethnography is a powerful tool, developed by anthropologists and other social scientists over the course of several decades. Here are three types of situations in professional settings in which I have found ethnography to be especially powerful:
1. To see the given product and/or people in action
2. When brainstorming about a design
3. To understand how people navigate complex, patchwork processes
Situation #1: To See the Given Product and/or People in Action
Ethnography allows you to witness people in action: using your product or service, engaging in the type of activity you are interested in, or in whatever other situation you want to study.
Many other social science research methods involve creating an artificial environment in which to observe how participants act or think. Focus groups, for example, involve assembling potential customers or users into a room, forming a synthetic space to discuss the product or service in question, and in many experimental settings, researchers create a simulated environment to control for and analyze the variables or factors they are interested in.
Ethnography, on the other hand, centers around observing and understanding how people navigate real-world settings. Through it, you can get a sense for how people conduct the activity for which you are designing a product or service and/or how people actually use your product or service.
For example, if you want to understand how people use GPS apps to get around, you can see how people use the app “in the wild”: when rushing through heavy traffic to get to a meeting or while lost in the middle of who knows where. Instead of hearing their processed thoughts in a focus group setting or trying to simulate the environment, you can witness the tumult yourself and develop a sense for how to build a product that helps people in those exact situations.
Situation #2: When Brainstorming about a New Product Design
Ethnography is especially useful during the early stages of designing a product or service, or during a major redesign. Ethnography helps you scope out the needs of your potential customers and how they approach meeting said needs. Thus, it helps you determine how to build a product or service that addresses those needs in a way that would make sense for your users.
During such initial stages of product design, ethnography helps determine the questions you should be asking. Many have a tendency during these initial stages to construct designs based on their own perception of people’s needs and desires and miss what the customers or users do in fact need and desire. Through ethnography, you ground your strategy in the customers’ mindsets and experiences themselves.
The brainstorming stages of product development also require a lot of flexibility and adaptability: as you determine what the product or service should become, you must be open to multiple potential avenues. Ethnography is a powerful tool for navigating such ambiguity. It centers you on the users, their experiences and mindsets, and the context in which they might use the product or service, providing tools to ask open-ended questions and to generate new and helpful ideas for what to build.
Situation #3: To Understand How People Navigate Complex, Patchwork Processes
At a past company, I analyzed how customer service representatives regularly used the various software systems when talking with customers. Over the years, the company had designed and bought various software programs, each to perform a set of functions and with unique abilities, limitations, and quirks. Over time, this created a complex web of interlocking apps, databases, and interfaces, which customer service representatives had to navigate when performing their job of monitoring customers’ accounts. Other employees described the whole scene as the “Wild West”: each customer service representative had to create their own way to use these software systems while on the phone with a (in many cases disgruntled) customer.
Many companies end up building such patchwork systems, whether of software, of departments or teams, of physical infrastructure, or something else entirely, by stacking several iterations of development over time until they become a hydra of complexity that employees must figure out how to navigate to get their work done.
Ethnography is a powerful tool for making sense of such processes. Instead of relying on official policies for how to conduct various actions and procedures, ethnography helps you understand and make sense of the unofficial and informal strategies people use to do what they need. Through this, you can get a sense for how the patchwork system really works. This is necessary for developing ways to improve or build upon such patchwork processes.
In the customer service research project, my task was to develop strategies to improve the technology customer service representatives used as they talked with customers. Seeing how representatives used the software through ethnographic research helped me understand and focus the analysis on their day-to-day needs and struggles.
Conclusion
Ethnography is a powerful tool, and the business world and other professional settings have been increasingly realizing this (cf. this and this). I have provided three circumstances where I have personally found ethnography to be invaluable. Ethnography allows you to experience what is happening on the ground and, through that, to shape and inform the research questions you ask and the recommendations or products you build for people in those contexts.
“Data science is doable,” a fellow attendee of EPIC’s 2018 conference in Honolulu would exclaim like a mantra. The conference was for business ethnographers and UX researchers interested in understanding and integrating data science and machine learning into their research. She was specifically trying to address a tendency she had noticed, which I have seen as well: qualitative researchers and other so-called “non-math people” frequently believe that data science is far too technical for them. This seems ultimately rooted in cultural myths about math and math-related fields like computer science, engineering, and now data science. In a similar vein to her statement, my goal in this essay is to discuss these attitudes and show that data science, like math, is relatable and doable if you treat it as such.
The “Math Person”
In the United States, many possess an implied image of a “math person”: a person supposedly naturally gifted at mathematics. Many who do not see themselves as fitting that image simply declare that math isn’t for them. The idea that some people are inherently able or unable to do math is false, however, and prevents people from trying to become good at the discipline, even if they might enjoy and/or excel at it.
Most skills in life, including mathematical skills, are like muscles: you do not innately possess or lack that skill, but rather your skill develops as you practice and refine that activity. Anybody can develop a skill if they practice it enough.
Scholars in anthropology, sociology, psychology, and education have documented how math is implicitly and explicitly portrayed as something some people can do and some cannot do, especially in math classes in grade school. Starting in early childhood, we implicitly and sometimes explicitly learn the idea that some people are naturally gifted at math but for others, math is simply not their thing. Some internalize that they are gifted at math and thus take the time to practice enough to develop and refine their mathematical skills, while others internalize that they cannot do math and thus their mathematical abilities become stagnant. But this is simply not true.
Anyone can learn and do math if he or she practices math and cultivates mathematical thinking. If you do not cultivate your math muscle, it will become underdeveloped, and then, yes, math becomes harder to do. Thus, in a cruel irony, internalizing that he or she cannot do math can become a self-fulfilling prophecy for someone: the person gives up on developing mathematical skills, which leads to their further underdevelopment.
Similarly, we cultivate another myth: that people skilled in mathematics (or math-related fields like computer science, engineering, and data science) in general do not possess strong social and interpersonal communication skills. The root of this stereotype lies more in how we think of mathematical and logical thinking than in actual characteristics of mathematicians, computer scientists, or engineers. Social scientists who have studied the social skills of mathematicians, computer scientists, and engineers have found no discernible difference in social and interpersonal communication skills compared with the rest of the world.
Quantitative and Qualitative Specialties
The belief that some people are just inherently good at math and that such people do not possess strong social and interpersonal communication skills contributes to the division between quantitative and qualitative social research, in both academic and professional contexts. These attitudes help cultivate the false idea that quantitative research and qualitative research are distinct skill sets for different types of people: that supposedly quantitative research can only be done by “math people” and qualitative research by “people people.” They suddenly become separate specialties, even though social research by its very nature involves both. Such a split unnecessarily stifles authentic and holistic understanding of people and society.
In professional and business research contexts, both qualitative and quantitative researchers should work with each other and, through that process, slowly learn each other’s skills. If done well, this would incentivize researchers to cultivate both mathematical/quantitative and interpersonal/qualitative research skills.
It would reward professional researchers who develop both skillsets and leverage them in their research, instead of encouraging researchers to specialize in one or the other. It could also encourage universities to require in-depth training in both as they prepare their students to become future workers, instead of requiring that students choose among disciplines that promote one track over the other.
Working together is only the first step, however, whose success hinges on whether it ultimately leads to the integration of these supposedly separate skillsets. Frequently, when qualitative and quantitative research teams work together, they work mostly independently (qualitative researchers on the qualitative aspects of the project and quantitative researchers on the quantitative aspects of the project), thus reinforcing the supposed distinction between them. Instead, such collaboration should involve qualitative researchers developing quantitative research skills by practicing such methods and quantitative researchers similarly developing qualitative skills.
Conclusion
Anyone can develop mathematics and data science skills if they practice them. The same goes for the interpersonal skills necessary for ethnographic and other qualitative research. Depicting them as separate specialties, even if the specialists come together to do each of their specialized parts in a single research project, stifles their integration as a single set of tools for an individual and reinforces the myths we have been teaching ourselves: that data science is for math, programming, or engineering people and that ethnography is for “people people.” This separation stifles holistic and authentic social research, which inevitably involves qualitative and quantitative approaches.
What is ethnography, and how has it been used in the professional world? This article is a quick and dirty crash course for someone who has never heard of (or knows little about) ethnography.
Anthropology at its most basic is the study of human cultures and societies. Cultural anthropologists generally seek to understand current cultures and societies by conducting ethnography.
In short, ethnography involves seeking to understand the lived experiences of a particular culture, setting, group, or other context by some combination of being with those in that context (called participant-observation), interviewing or talking with them, and analyzing what happens and what is produced in that context.
It is an umbrella term for a set of methods (including participant-observation, interviews, group interviews or focus groups, digital recording, etc.) employed with that goal, and most ethnographic projects use some subset of these methods given the needs of the specific project. In this sense, it is similar to other umbrella methodologies – like statistics – in that it encapsulates a wide array of different techniques depending on the context.
One conducts ethnographic research to understand something about the lived experiences of a context. In the professional world, for example, ethnography is frequently useful in the following contexts:
Market Research: When trying to understand customers and/or users in-depth
Product Design: When trying to design or modify a product by seeing how people use it in action
Organizational Communication and Development: When trying to understand a “people problem” within an organization.
In this article, I expound in more detail on situations where ethnographic research is useful in professional settings.
Ethnographies are best understood through examples, so the table below includes excellent example ethnographies and ethnographic researchers in various industries/fields:
These, of course, are not the only situations where ethnography might be helpful. Ethnography is a powerful tool to develop a deep understanding of others’ experiences and to develop innovative and strategic insights.
I am pleased to announce that the Annals of Anthropological Practice has accepted my article “Anthropology by Data Science” (https://anthrosource.onlinelibrary.wiley.com/doi/10.1111/napa.12169). In it, I reflect on the relationship anthropologists have cultivated with data science as a discipline and the importance of integrating machine learning techniques into ethnographic practice.
This is a quick and dirty summary of my master’s practicum research project with Indicia Consulting over the summer of 2018. For anyone interested in more detail, here is a more detailed report, and here is the final report with Indicia.
Background
My practicum was the sixth stage of a multi-year research project. The California Energy Commission commissioned this larger project to understand the potential relationship between individual energy consumption and technology usage. In stages one through five, we isolated certain clusters of behavior and attitudes around new technology adoption, which Indicia called cybersensitivity, and demonstrated that cybersensitivity tended to associate with a willingness to adopt energy-saving technology like smart meters.
This led to a key question: How can one identify cybersensitivity among a broader population such as a community, county, or state? Answering this question was the main goal of my practicum project.
In the past stages of the research project, the team used ethnographic research to establish criteria for whether someone was a cybersensitive based on several hours of interviews and observations about their technology usage. These interviews and observations certainly helped the research team analyze behavioral and attitudinal patterns, determine what patterns were significant, and develop those into the concept of cybersensitivity, but they are too time- and resource-intensive to perform with an entire population. One generally does not have the ability to interview everyone in a community, county, or state. I sought to address this directly in my project.
| Task | Timeline | Task Name | Research Technique | Description |
| --- | --- | --- | --- | --- |
| Task 1 | June 2015 – Sept 2018 | General Project Tasks | Administrative (N/A) | Developed project scope and timeline, adjusting as the project unfolds |
| Task 2 | July 2015 – July 2016 | Documenting and analyzing emerging attitudes, emotions, experiences, habits, and practices around technology adoption | Survey | Conducted survey research to observe patterns of attitudes and behaviors among cybersensitives/awares. |
| Task 3 | Sept 2016 – Dec 2016 | Identifying the attributes and characteristics and psychological drivers of cybersensitives | Interviews and Participant-Observation | Conducted in-depth interviews and observations coding for psych factor, energy consumption attitudes and behaviors, and technological device purchasing/usage. |
| Task 4* | Sept 2016 – July 2017 | Assessing cybersensitives’ valence with technology | Statistical Analysis | Tested for statistically significant differences in demographics, behaviors, and beliefs/attitudes between cyber status groups |
| Task 5 | Aug 2017 – Dec 2018 | Developing critical insights for supporting residential engagement in energy efficient behaviors | Statistical Analysis | Analyzed utility data patterns of study participants, comparing it with the general population. |
| Task 6 | March 2018 – Aug 2018 | Recommending an alternative energy efficiency potential model | Decision Tree Modeling | Constructed decision tree models to classify an individual’s cyber status |
Project Goal
The overall goal for the project was to produce a scalable method to assess whether someone exhibits cybersensitivity based on data measurable across an entire population. In doing this, the project also helped address the following research needs:
Created a method to scale the analysis further across a larger population, assessing whether cybersensitives were more willing to adopt energy-saving technologies across a community, county, or state
Provided the infrastructure to determine how much promoting energy-saving campaigns targeting cybersensitives specifically would reduce energy consumption in California
Helped the California Energy Commission determine the best means to reach cybersensitives for specific energy-saving campaigns
The Project
I used machine learning modeling to create a decision-making flow to isolate cybersensitives in a population. Random forests and decision trees produced the best models for Indicia’s needs: random forests in accuracy and robustness and decision trees in human decipherability. Through them, I created a programmable yet human-comprehensible framework to determine whether an individual is cybersensitive based on behaviors and other characteristics that an organization could easily assess within a whole population. Thus, any energy organization could easily understand, replicate, and further develop the model since it was both easy for humans to read and encodable computationally. This way organizations could both use and refine it for their purposes.
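To illustrate what “easy for humans to read and encodable computationally” can mean for a decision tree, here is a minimal sketch. It is not the model from the project: the tree is written by hand rather than learned from data, and the features (`smart_devices_owned`, `hours_online_per_day`), thresholds, and function names are all hypothetical, invented only for this example.

```python
from typing import Any, Dict, List, Optional, Union

# A node is either a label (leaf) or a dict describing one yes/no split.
Node = Union[str, Dict[str, Any]]

# Hypothetical tree: features and thresholds are invented for illustration.
TREE: Node = {
    "feature": "smart_devices_owned",
    "threshold": 3,
    "below": "not cybersensitive",
    "at_or_above": {
        "feature": "hours_online_per_day",
        "threshold": 4.0,
        "below": "not cybersensitive",
        "at_or_above": "cybersensitive",
    },
}


def classify(node: Node, person: Dict[str, float]) -> str:
    """Walk from the root to a leaf, following the branch each split selects."""
    while isinstance(node, dict):
        value = person[node["feature"]]
        node = node["at_or_above"] if value >= node["threshold"] else node["below"]
    return node


def to_rules(node: Node, conditions: Optional[List[str]] = None) -> List[str]:
    """Flatten the tree into one plain-language rule per leaf."""
    conditions = conditions or []
    if isinstance(node, str):
        premise = " and ".join(conditions) or "always"
        return [f"if {premise}: {node}"]
    feature, threshold = node["feature"], node["threshold"]
    return (
        to_rules(node["below"], conditions + [f"{feature} < {threshold}"])
        + to_rules(node["at_or_above"], conditions + [f"{feature} >= {threshold}"])
    )


person = {"smart_devices_owned": 5, "hours_online_per_day": 6.0}
print(classify(TREE, person))
for rule in to_rules(TREE):
    print(rule)
```

The same structure serves both audiences: `classify` is what a program would run across a whole population, while `to_rules` prints the logic in a form a person at an energy organization could read, question, and revise.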
Conclusion
This is a quick overview of my master’s practicum project. For more details on what modeling I did, how I did it, what results it produced, and how it fit within the wider needs of the multi-year research project, please see my full report.
I really appreciated the opportunity it posed to get my hands dirty integrating ethnography and data science to help address a real-world problem. This summary only scratches the surface of what Indicia did with the California Energy Commission to encourage sustainable energy usage societally. Hopefully, though, it will inspire you to integrate ethnography and data science to address whatever complex questions you face. It certainly did for me.
Thank you to Susan Mazur-Stommen and Haley Gilbert for your help in organizing and completing the project. I would like to thank my professorial committee at the University of Memphis – Dr. Keri Brondo, Dr. Ted Maclin, Dr. Deepak Venugopal, and Dr. Katherine Hicks – for their academic support as well.
In the spring of 2018, I researched how anthropologists and related social scholars have analyzed data science and machine learning for my Master’s in Anthropology at the University of Memphis. For the project, I assessed the anthropological literature on data science and machine learning to date and explored potential connections between anthropology and data science, based on my perspective as a data scientist and anthropologist. Here is my final report.
Thank you, Dr. Ted Maclin, for your help overseeing and assisting this project.