Hello, my name is Stephen Paff. I am a data scientist and an ethnographer. The goal of this blog is to explore the integration of data science and ethnography as an exciting and innovative way to understand people, whether consumers, users, fellow employees, or anyone else.
I want to think publicly. Ideas worth having develop in
conversation, and through this blog, I hope to present my integrative vision so
that others can potentially use it to develop their own visions and in turn
help shape mine.
Please Note: Because my blog straddles two technical areas, I will split my posts based on how in-depth they go into each technical expertise. Many posts I will write for a general audience. I will write some posts, though, for data scientists discussing technical matters within that field, and other posts will focus on technical topics within ethnography for anthropologists and other ethnographers. At the top of each post, I will provide the following disclaimers:
In a previous article, I have discussed the value of integrating data science and ethnography. On LinkedIn, people commented that they were interested and wanted to hear more detail on potential ways to do this. I replied, “I have found explaining how to conduct studies that integrate the two practically is easier to demonstrate through example than abstractly since the details of how to do it vary based on the specific needs of each project.”
In this article, I intend to do exactly that: analyze four innovative projects that in some way integrated data science and ethnography. I hope these will get your creative juices flowing and help you think through how to combine them in whatever project you are working on.
Synopsis:
Project:
How It Integrated Data Science and Ethnography:
Link to Learn More:
No Show Model
Used ethnography to design machine learning software
Used ethnography to understand how users make sense of and behave towards a machine learning system they encounter and how this, in turn, shapes the development of the machine learning algorithm(s)
A medical clinic at a hospital system in New York City asked me to use machine learning to build a show rate predictor in order to inform and improve its scheduling practices. During the initial construction phase, I used ethnography both to understand the clinic’s scheduling problem in more depth and to determine an appropriate interface design.
Through an ethnographic inquiry, I discovered the most important question(s) schedulers ask when scheduling their appointments. This was, “Of the people scheduled for a given doctor on a particular day, how many of them are likely to actually show up?” I then built a machine learning model to answer this exact question. My ethnographic inquiry provided me the design requirements for the data science project.
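To make the schedulers’ question concrete, here is a minimal sketch of how per-appointment predictions could be rolled up into an answer. It is an illustration under stated assumptions, not the clinic’s actual model: the field names (doctor, date, p_show) and the probabilities are hypothetical, and it assumes some upstream classifier has already produced a show probability for each appointment. Under those assumptions, the expected number of patients who show up for a doctor on a day is simply the sum of their show probabilities.

```python
from collections import defaultdict


def expected_shows(appointments):
    """Sum predicted show probabilities for each (doctor, date) pair.

    Each appointment is a dict with hypothetical keys "doctor", "date",
    and "p_show" (a probability assumed to come from an upstream model).
    """
    totals = defaultdict(float)
    for appt in appointments:
        totals[(appt["doctor"], appt["date"])] += appt["p_show"]
    return dict(totals)


# Made-up example data, for illustration only.
appointments = [
    {"doctor": "Dr. A", "date": "2024-03-01", "p_show": 0.9},
    {"doctor": "Dr. A", "date": "2024-03-01", "p_show": 0.5},
    {"doctor": "Dr. A", "date": "2024-03-01", "p_show": 0.75},
    {"doctor": "Dr. B", "date": "2024-03-01", "p_show": 0.25},
]

summary = expected_shows(appointments)
for (doctor, date), expected in sorted(summary.items()):
    print(f"{doctor} on {date}: about {expected:.1f} expected shows")
```

Framing the output per doctor and per day, rather than per patient, mirrors the question the schedulers were actually asking.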
In addition, I used my ethnographic inquiries to design the interface. I observed how schedulers interacted with their current scheduling software, which gave me a sense for what kind of visualizations would work or not work for my app.
This project exemplifies how ethnography can help both in the development stage of a machine learning project, to determine what the algorithm(s) need to do, and on the front end, when communicating the algorithm(s) to users and assessing how well they serve those users.
As both an ethnographer and a data scientist, I was able to translate my ethnographic insights seamlessly into machine learning modeling and API specifications and to conduct follow-up ethnographic inquiries to ensure that what I was building would meet the clinic’s needs.
Project 2: Cybersensitivity Study
I conducted this project with Indicia Consulting. Its goal was to explore potential connections between individuals’ energy consumption and their relationship with new technology. This is an example of using ethnography to explore and determine potential social and cultural patterns in-depth with a few people and then using data science to analyze those patterns across a large population.
We started the project by observing and interviewing about thirty participants, but as the study progressed, we needed to develop a scalable method to analyze the patterns across whole communities, counties, and even states.
Ethnography is a great tool for exploring a phenomenon in-depth and for developing initial patterns, but it is resource-intensive and thus difficult to conduct on a large group of people. It is not practical for, say, analyzing thousands of people. Data science, on the other hand, can easily test whether patterns noticed in smaller ethnographic studies hold across an entire population, yet because it often lacks the granularity of ethnography, it would often miss intricate patterns.
Ethnography is also great on the back end for determining whether the implemented machine learning models and their resulting insights make sense on the ground. This forms a type of iterative feedback loop, where data science scales up ethnographic insights and ethnography contextualizes data science models.
Thus, ethnography and data science cover each other’s weaknesses well, forming a great methodological duo for projects centered around trying to understand customers, users, colleagues, or other people in-depth.
Project 3: Facebook Newsfeed Folk Theories
In their study, Motahhare Eslami and her team of researchers conducted an ethnographic inquiry into how various Facebook users conceived of how the Facebook Newsfeed selects which posts/stories rise to the top of their feeds. They analyzed several different “folk theories,” or working theories held by everyday people, about the criteria this machine learning system uses to select top stories.
How users think the overall system works influences how they respond to the newsfeed. Users who believe, for example, that the algorithm will prioritize the posts of friends whose posts they have liked in the past will often intentionally like the posts of their closest friends and family so that they can see more of their posts.
Users’ perspectives on how the Newsfeed algorithm works influence how they respond to it, which, in turn, affects the very data the algorithm learns from and thus how the algorithm develops. This creates a cyclic feedback loop that influences the development of the machine learning algorithmic systems over time.
Their research exemplifies the importance of understanding how people think about, respond to, and more broadly relate with machine learning-based software systems. Ethnographies into people’s interactions with such systems are a crucial way to develop this understanding.
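The feedback loop described here can be illustrated with a toy simulation. This is a deliberately simplified sketch and not a description of how Facebook’s Newsfeed actually works: the two friend categories, the starting weights, and the boost parameter are all hypothetical. It assumes a ranking system that raises a friend’s weight each time the user likes that friend’s posts, and a user who, acting on the folk theory, deliberately likes only a close friend’s posts.

```python
def simulate_feedback(rounds=5, boost=0.2):
    """Toy model: strategic likes shift a hypothetical ranking over time.

    Returns the close friend's share of total ranking weight after
    each round of deliberate liking.
    """
    # Hypothetical ranking weights; both friends start out equal.
    weights = {"close_friend": 1.0, "acquaintance": 1.0}
    history = []
    for _ in range(rounds):
        # The user acts on the folk theory and likes the close friend's post.
        # The assumed ranking system learns from that like.
        weights["close_friend"] += boost
        share = weights["close_friend"] / sum(weights.values())
        history.append(share)
    return history


shares = simulate_feedback()
for round_number, share in enumerate(shares, start=1):
    print(f"Round {round_number}: close friend holds {share:.0%} of ranking weight")
```

Even in this stripped-down form, the user’s belief about the system changes the data the system learns from, which is the loop an ethnographic inquiry can surface.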
In a way, many machine learning algorithms are very social in nature: they – or at least the overall software system in which they exist – often succeed or fail based on how humans interact with them. In such cases, no matter how technically robust a machine learning algorithm is, if potential users cannot positively and productively relate to it, then it will fail.
Ethnographies into the “social life” of machine learning software systems (by which I mean how they become, or in some cases fail to become, a part of individuals’ lives) help us understand how the algorithm is developing or learning and determine whether such systems are succeeding at what we intended them to do. Such ethnographies require not only in-depth expertise in ethnographic methodology but also an in-depth understanding of how machine learning algorithms work, in order to understand how social behavior might be influencing their internal development.
Project 4: Thing Ethnography
Elisa Giaccardi and her research team have been pioneering the use of data science and machine learning to understand and incorporate the perspective of things into ethnographies. With the development of the internet of things (IoT), she suggests that the data from object sensors could provide fresh insights in ethnographies of how humans relate to their environment by helping to describe how these objects relate to each other. She calls this thing ethnography.
This experimental approach exemplifies one way to use machine learning algorithms within ethnographies as social processes/interactions in and of themselves. This could be an innovative way to analyze the social role of these IoT objects in daily life within ethnographic studies. If Eslami’s work exemplifies a way to graft ethnographic analysis into the design cycle of machine learning algorithms, Giaccardi’s research illustrates one way to incorporate data science and machine learning analysis into ethnographies.
Conclusion
Here are four examples of innovative projects that involve integrating data science and ethnography to meet their respective goals. I do not intend these to be a complete or exhaustive account of how to integrate these methodologies but rather food for thought to spur further creative thinking about how to connect them.
For those who, when they hear the idea of integrating data science and ethnography, ask the reasonable question, “Interesting but what would that look like practically?”, here are four examples of how it could look. Hopefully, they are helpful in developing your own ideas for how to combine them in whatever project you are working on, even if its details are completely different.
What is ethnography, and how has it been used in the professional world? This article is a quick and dirty crash course for someone who has never heard of (or knows little about) ethnography.
Anthropology at its most basic is the study of human cultures and societies. Cultural anthropologists generally seek to understand current cultures and societies by conducting ethnography.
In short, ethnography involves seeking to understand the lived experiences of a particular culture, setting, group, or other context by some combination of being with those in that context (called participant-observation), interviewing or talking with them, and analyzing what happens and what is produced in that context.
It is an umbrella term for a set of methods (including participant-observation, interviews, group interviews or focus groups, digital recording, etc.) employed with that goal, and most ethnographic projects use some subset of these methods given the needs of the specific project. In this sense, it is similar to other umbrella methodologies – like statistics – in that it encapsulates a wide array of different techniques depending on the context.
One conducts ethnographic research to understand something about the lived experiences of a context. In the professional world, for example, ethnography is frequently useful in the following contexts:
Market Research: When trying to understand customers and/or users in-depth
Product Design: When trying to design or modify a product by seeing how people use it in action
Organizational Communication and Development: When trying to understand a “people problem” within an organization.
In this article, I expound in more detail on situations where ethnographic research is useful in professional settings.
Ethnographies are best understood through examples, so the table below includes excellent example ethnographies and ethnographic researchers in various industries/fields:
These, of course, are not the only situations where ethnography might be helpful. Ethnography is a powerful tool to develop a deep understanding of others’ experiences and to develop innovative and strategic insights.