Anthropology by Data Science Archives

Designing Machine Learning Products Anthropologically: Building Relatable Machine Learning

How do we build relatable machine learning models that regular people can understand? This is a presentation about how design principles apply to the development of machine learning systems. Too often in data science, machine learning software is built without the regular people who will interact with it in mind.

I argue that in order to make machine learning software relatable, we need to use design thinking to intentionally build in mechanisms for users to form their own mental models of how the machine learning software works. Failing to include these mechanisms helps cultivate the common sense that machine learning is a black box for users.

I gave three different versions of this talk at Quant UX Con on June 8th, 2022, the Royal Institute of Anthropology’s annual conference on June 10th, 2022, and Google’s AI + Design Tooling Research Symposium on August 5th, 2022.

I hope you find it interesting and feel free to share any thoughts you might have.

Thank you to the conference and talk organizers for making this happen. I appreciate all the insightful conversations I had about the role of design thinking in building relatable machine learning.

Trash Data Science: Garbology, Anthropology, and Spatial Data Science – Conversation with Gideon Singer (Part Four)

Here is the fourth and final part of my interview with Gideon Singer, Director of Spatial Data Science at Litterati, for my Interview Series. He describes the strategies he uses to collect data as a garbologist and data scientist.

Here are Part 1, Part 2, and Part 3 of our interview.

Gideon Singer is an applied anthropologist in the business of exploring societies through the waste, litter, rubbish, and other detritus they leave behind. As a self-proclaimed digital garbologist, his work juxtaposes digital ethnography with archaeology and spatial data science.

Resources:

Trash Data Science: Garbology, Anthropology, and Spatial Data Science – Conversation with Gideon Singer (Part Three)

Here is the third part of my interview with Gideon Singer, Director of Spatial Data Science at Litterati, for my Interview Series. He discusses the interconnections he has found between data science and garbology.

Here are Part 1, Part 2, and Part 4 of our interview.

Resources:

Trash Data Science: Garbology, Anthropology, and Spatial Data Science – Conversation with Gideon Singer (Part Two)

Here is the second part of my interview with Gideon Singer, Director of Spatial Data Science at Litterati, for my Interview Series. He describes what garbology is and what kind of work he does as a data scientist and garbologist.

Here are Part 1, Part 3, and Part 4 of our interview.

Resources:

Trash Data Science: Garbology, Anthropology, and Spatial Data Science – Conversation with Gideon Singer (Part One)

I interviewed Gideon Singer, Director of Spatial Data Science at Litterati, for my Interview Series. He discusses his mission to combine garbology, anthropology, and data science to better understand humanity and the trash we leave behind. In this first part, he describes the connections he has found between these various fields.

Here are Part 2, Part 3, and Part 4 of our interview.

Resources:

Data Scientist, Anthropologist, and Entrepreneur: Interview with Schaun Wheeler (Interview #2 in the Interview Series)

For my second interview in the Interview Series, I interviewed Schaun Wheeler. Schaun is co-founder of Aampe, a startup that embeds an active learning system into mobile apps to turn push notifications into part of the app’s user interface. Before he co-founded Aampe, Schaun was the data science lead for the award-winning Consumer Graph intelligence product at Valassis, a U.S. ad-tech firm. And before that he founded and directed the data science team at Success Academy Charter Schools in New York City. Then before that, Schaun was one of the first people to champion the use of statistical inference to understand massive unstructured data at the United States Department of the Army. Schaun has a Ph.D. in Cultural Anthropology from the University of Connecticut.

If the audio does not play on your computer, you can download it here:

Schaun-Interview-Audio Download

Over our conversation, we discussed the following:

Schaun’s experiences as both a data scientist and anthropologist
His utilization of anthropology within data science to decipher the right problem before launching into data science solutions
Recommendations for how anthropologists can develop data science and programming skills
His experiences starting a new data science consumer and market-research based company

To learn more about Schaun Wheeler and Aampe, check these out:

LinkedIn (the best way to contact him): https://www.linkedin.com/in/schaunwheeler/

Medium: https://medium.com/@schaun.wheeler

Twitter: https://twitter.com/schaunw

Aampe website: https://www.aampe.com/

Aampe blog: https://www.aampe.com/blog

A User Story, The Data Science Children’s Book: https://www.aampe.com/blog/a-user-story

More Detailed Walkthrough: Clip #1: https://www.youtube.com/playlist?list=PL03WDMCL2PHjRd8Y8USzvVkcIyQM57FMU and Clip #2: https://youtu.be/kwk_Ot8orPY

Previous Interview in the Interview Series: https://ethno-data.com/astrid-interview-1/

EPIC Data Scientists + Ethnographers Group

I recently organized a professional group called EPIC Data Scientists + Ethnographers along with a few others who are both data scientists and ethnographers. Our goal is to form a virtual community to discuss ways to incorporate ethnography and data science, just like I strive to do on this website.

If you are interested in working with others on this or simply interested in learning more, feel free to join. Whether you are both a data scientist and ethnographer, only one of them, or neither, we would love to hear your perspective.

Thank you, EPIC, for helping to develop this and giving us a platform.

Photo credit: deepak pal at https://www.flickr.com/photos/158301585@N08/46085930481/

Resources on Integrating Data Science and Ethnography

Here is a list of resources about integrating data science and ethnography. Even though it is an up-and-coming field without a consistent list of publications, several fascinating and insightful resources do exist.

If there are any resources about integrating data science and ethnography that you have found useful, feel free to share them as well.

General Overviews:

Curran, John. “Big Data or ‘Big Ethnographic Data’? Positioning Big Data within the Ethnographic Space.” EPIC (2013). (Found here: https://www.epicpeople.org/big-data-or-big-ethnographic-data-positioning-big-data-within-the-ethnographic-space/)
Patel, Neal. “For a Ruthless Criticism of Everything Existing: Rebellion Against the Quantitative-Qualitative Divide.” EPIC (2013): 43-60.
Seaver, Nick. “Bastard Algebra.” Boellstorff, Tom and Bill Maurer. Data, Now Bigger and Better! Chicago: Prickly Paradigm Press, 2015. 27-46.
Slobin, Adrian and Todd Cherkasky. “Ethnography in the Age of Analytics.” EPIC (2010).
Nafus, Dawn and Tye Rattenbury. Data Science and Ethnography: What’s Our Common Ground, and Why Does It Matter? 7 3 2018. <https://www.epicpeople.org/data-science-and-ethnography/>.
Seaver, Nick. “The nice thing about context is that everyone has it.” Media, Culture & Society (2015).

Books:

Nafus, Dawn and Hannah Knox. Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 2018.
Boellstorff, Tom and Bill Maurer. Data, Now Bigger and Better! Chicago: Prickly Paradigm Press, 2015.
Mackenzie, Adrian. Machine Learners: Archaeology of a Data Practice. Cambridge: The MIT Press, 2017.

Examples and Case Studies:

“Autonomous Drive: Teaching Cars Human Behaviour” by Melissa Cefkin on the Youtube Channel DrivingTheNation: https://www.youtube.com/watch?v=6koKuDegHAM
Eslami, Motahhare, et al. “First I “like” it, then I hide it: Folk Theories of Social Feeds.” Curation and Algorithms (2016).
Giaccardi, Elisa, Chris Speed and Neil Rubens. “Things Making Things: An Ethnography of the Impossible.” (2014).

Elish, M. “The Stakes of Uncertainty: Developing and Integrating Machine Learning in Clinical Care.” EPIC (2018).
Madsen, Mette My, Anders Blok and Morten Axel Pedersen. “Transversal collaboration: an ethnography in/of computational social science.” Nafus, Dawn. Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 2018.
Thomas, Suzanne, Dawn Nafus and Jamie Sherman. “Algorithms as fetish: Faith and possibility in algorithmic work.” Big Data & Society (2018): 1-11.

Articles and Blog Posts:

“An Engineering Anthropologist: Why tech companies need to hire software developers with ethnographic skills” by Astrid Countee: http://ethnographymatters.net/blog/2016/06/22/an-engineering-anthropologist-why-tech-companies-need-to-hire-software-developers-with-ethnographic-skills/
“Cross-disciplinary Insights Teams: Integrating Data Scientists and User Researchers at Spotify” by Sara Belt and Peter Gilks: https://www.epicpeople.org/cross-disciplinary-insights-teams-integrating-data-scientists-and-user-researchers-at-spotify/
“Data is a stakeholder” by Schaun Wheeler: https://towardsdatascience.com/data-is-a-stakeholder-31bfdb650af0
“Why Big Data Needs Thick Data” by Tricia Wang: https://medium.com/ethnography-matters/why-big-data-needs-thick-data-b4b3e75e3d7

My Own Articles on This Website:

Podcasts and Lectures:

“Computational Anthropology: Quali-quantitative Analyses of Attention Economies during the Covid-19 Lockdown” by Morten Axel Pedersen: https://www.material.city/recordings/mortenaxelpedersen
“Human-Driven Machine Learning with Saleema Amershi”: https://datastori.es/115-human-driven-machine-learning-with-saleema-amershi/#t=29:00.204
“Welcome to Dataworld, by Alexander Taylor”: https://player.fm/series/camthropod/episode-13-welcome-to-dataworld-by-alex-taylor
“Machine Learning for Artists with Gene Kogan”: https://datastori.es/114-machine-learning-for-artists-with-gene-kogan/#t=34:28.738

Ethical Considerations:

“Caroline Sinders on Ethical Product Design for Machine Learning”: https://design.blog/2017/03/23/caroline-sinders-on-ethical-product-design-for-machine-learning/
“The Trouble with Bias” by Kate Crawford: https://www.youtube.com/watch?v=fMym_BKWQzk
“Justice for ‘Data Janitors’” by Lilly Irani: http://www.publicbooks.org/justice-for-data-janitors/
Elish, Madeleine. “Moral Crumple Zones: Cautionary Tales in Human-Robot Interaction.” Engaging Science, Technology, and Society (2019).
boyd, danah and Kate Crawford. “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon.” Information, Communication, & Society (2012): 662-679.

Four Innovative Projects that Integrated Data Science and Ethnography

In a previous article, I discussed the value of integrating data science and ethnography. On LinkedIn, people commented that they were interested and wanted to hear more detail on potential ways to do this. I replied, “I have found explaining how to conduct studies that integrate the two practically is easier to demonstrate through example than abstractly since the details of how to do it vary based on the specific needs of each project.”

In this article, I intend to do exactly that: analyze four innovative projects that in some way integrated data science and ethnography. I hope these will get your creative juices flowing and help you think through how to combine the two in whatever project you are working on.

Synopsis:

Project:	How It Integrated Data Science and Ethnography:	Link to Learn More:
No Show Model	Used ethnography to design machine learning software	https://ethno-data.com/show-rate-predictor/
Cybersensitivity Study	Used machine learning to scale up the scope of an ethnographic inquiry to a larger population	https://ethno-data.com/masters-practicum-summary/
Facebook Newsfeed Folk Theories	Used ethnography to understand how users make sense of and behave towards a machine learning system they encounter and how this, in turn, shapes the development of the machine learning algorithm(s)	https://dl.acm.org/doi/10.1145/2858036.2858494
Thing Ethnography	Used machine learning to incorporate objects’ interactions into ethnographic research	https://dl.acm.org/doi/10.1145/2901790.2901905 and https://www.semanticscholar.org/paper/Things-Making-Things%3A-An-Ethnography-of-the-Giaccardi-Speed/2db5feac9cc743767fd23aeded3aa555ec8683a4?p2df

Project 1: No Show Model

A medical clinic at a hospital system in New York City asked me to use machine learning to build a show rate predictor in order to inform and improve its scheduling practices. During the initial construction phase, I used ethnography both to understand the scheduling problem the clinic faced in more depth and to determine an appropriate interface design.

Through an ethnographic inquiry, I discovered the most important question(s) schedulers ask when scheduling their appointments. This was, “Of the people scheduled for a given doctor on a particular day, how many of them are likely to actually show up?” I then built a machine learning model to answer this exact question. My ethnographic inquiry provided me the design requirements for the data science project.
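One way to make the schedulers' question concrete is to treat it as an expected count: if a model assigns each appointment a probability of attendance, summing those probabilities for a doctor and day estimates how many patients are likely to show up. The sketch below is only an illustration of that idea, not a description of the clinic's actual model. The `Appointment` record, its field names, and the probabilities are hypothetical and made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Appointment:
    # Hypothetical fields, for illustration only.
    doctor: str
    day: str                  # ISO date string, e.g. "2020-03-02"
    show_probability: float   # assumed output of some upstream classifier


def expected_shows(appointments):
    """Sum per-appointment show probabilities for each (doctor, day)."""
    totals = defaultdict(float)
    for appt in appointments:
        totals[(appt.doctor, appt.day)] += appt.show_probability
    return dict(totals)


# Made-up example data, not real clinic records.
schedule = [
    Appointment("Dr. A", "2020-03-02", 0.9),
    Appointment("Dr. A", "2020-03-02", 0.6),
    Appointment("Dr. A", "2020-03-02", 0.5),
    Appointment("Dr. B", "2020-03-02", 0.8),
]

for (doctor, day), total in sorted(expected_shows(schedule).items()):
    print(f"{doctor} on {day}: about {total:.1f} expected shows")
```

Framing the output as a count per doctor and day, rather than as a yes/no prediction per patient, keeps the result in the same terms as the question the schedulers were already asking.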

In addition, I used my ethnographic inquiries to design the interface. I observed how schedulers interacted with their current scheduling software, which gave me a sense for what kind of visualizations would work or not work for my app.

This project exemplifies how ethnography can be helpful both in the development stage of a machine learning project, to determine what the machine learning algorithm(s) need to do, and on the frontend, when communicating the algorithm(s) to users and assessing their success with those users.

As both an ethnographer and a data scientist, I was able to translate my ethnographic insights seamlessly into machine learning modeling and API specifications and to conduct follow-up ethnographic inquiries to ensure that what I was building would meet their needs.

Project 2: Cybersensitivity Study

I conducted this project with Indicia Consulting. Its goal was to explore potential connections between individuals’ energy consumption and their relationship with new technology. This is an example of using ethnography to explore and determine potential social and cultural patterns in-depth with a few people and then using data science to analyze those patterns across a large population.

We started the project by observing and interviewing about thirty participants, but as the study progressed, we needed to develop a scalable method to analyze the patterns across whole communities, counties, and even states.

Ethnography is a great tool for exploring a phenomenon in-depth and for developing initial patterns, but it is resource-intensive and thus difficult to conduct on a large group of people. It is not practical for, say, analyzing thousands of people. Data science, on the other hand, can easily test whether patterns noticed in smaller ethnographic studies hold across an entire population, yet because it often lacks the granularity of ethnography, it would often miss intricate patterns.

Ethnography is also great on the back end for determining whether the implemented machine learning models and their resulting insights make sense on the ground. This forms a type of iterative feedback loop, where data science scales up ethnographic insights and ethnography contextualizes data science models.

Thus, ethnography and data science cover each other’s weaknesses well, forming a great methodological duo for projects centered around trying to understand customers, users, colleagues, or other people in-depth.

Project 3: Facebook Newsfeed Folk Theories

In their study, Motahhare Eslami and her team of researchers conducted an ethnographic inquiry into how various Facebook users conceived of how the Facebook Newsfeed selects which posts/stories rise to the top of their feeds. They analyze several different “folk theories” or working theories by everyday people for the criteria this machine learning system uses to select top stories.

How users think the overall system works influences how they respond to the newsfeed. Users who believe, for example, that the algorithm will prioritize the posts of friends whose posts they have liked in the past will often intentionally like the posts of their closest friends and family so that they can see more of their posts.

Users’ perspectives on how the Newsfeed algorithm works influence how they respond to it, which, in turn, affects the very data the algorithm learns from and thus how the algorithm develops. This creates a cyclic feedback loop that influences the development of the machine learning algorithmic systems over time.
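To illustrate the shape of this loop, here is a deliberately simplified toy simulation. It assumes a hypothetical feed that ranks friends purely by how many of their posts the user has liked, and a user who acts on the folk theory by liking their close friends' posts every round. The function name, parameters, and friend labels are all invented for the example, and the real Newsfeed ranking is certainly far more complex than this.

```python
def simulate_feed_loop(friends, close_friends, rounds=3, feed_size=2):
    """Toy loop: the user's likes become the data that ranks the next feed."""
    likes = {friend: 0 for friend in friends}
    history = []
    for _ in range(rounds):
        # Hypothetical "algorithm": show the friends with the most likes so far.
        # sorted() is stable, so ties keep the original list order.
        feed = sorted(friends, key=lambda f: -likes[f])[:feed_size]
        history.append(feed)
        # Folk-theory behavior: the user deliberately likes close friends'
        # posts, hoping to see more of them in future feeds.
        for friend in friends:
            if friend in close_friends:
                likes[friend] += 1
    return likes, history


likes, history = simulate_feed_loop(
    friends=["coworker", "neighbor", "sister", "best_friend"],
    close_friends={"sister", "best_friend"},
)
for round_number, feed in enumerate(history, start=1):
    print(f"Round {round_number}: {feed}")
```

Even in this stripped-down version, the user's belief about the ranking changes their behavior, their behavior changes the like counts, and the like counts change what the feed shows in the next round.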

Their research exemplifies the importance of understanding how people think about, respond to, and more broadly relate with machine learning-based software systems. Ethnographies into people’s interactions with such systems are a crucial way to develop this understanding.

In a way, many machine learning algorithms are very social in nature: they – or at least the overall software system in which they exist – often succeed or fail based on how humans interact with them. In such cases, no matter how technically robust a machine learning algorithm is, if potential users cannot positively and productively relate to it, then it will fail.

Ethnographies into the “social life” of machine learning software systems (by which I mean how they become a part of, or in some cases fail to become a part of, individuals’ lives) help us understand how the algorithm is developing or learning and determine whether it is successful in what we intended it to do. Such ethnographies require not only in-depth expertise in ethnographic methodology but also an in-depth understanding of how machine learning algorithms work, in order to understand how social behavior might be influencing their internal development.

Project 4: Thing Ethnography

Elisa Giaccardi and her research team have been pioneering the utilization of data science and machine learning to understand and incorporate the perspective of things into ethnographies. With the development of the internet of things (IoT), she suggests that the data from object sensors could provide fresh insights in ethnographies of how humans relate to their environment by helping to describe how these objects relate to each other. She calls this thing ethnography.
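As a rough sketch of what describing how objects relate to each other could mean computationally, the example below counts how often two different objects are used within a short time window of one another. This is my own simplified illustration, not Giaccardi's method: the event format, the object names, the timestamps, and the 60-second window are all assumptions, and simple counting stands in for whatever analysis a real study would use.

```python
from collections import Counter
from itertools import combinations


def co_use_counts(events, window_seconds=60):
    """Count pairs of different objects used within the window of each other.

    `events` is a list of (object_name, timestamp_in_seconds) tuples,
    a hypothetical stand-in for real sensor logs.
    """
    ordered = sorted(events, key=lambda event: event[1])
    counts = Counter()
    for (obj_a, time_a), (obj_b, time_b) in combinations(ordered, 2):
        if obj_a != obj_b and time_b - time_a <= window_seconds:
            # Sort the pair so (cup, kettle) and (kettle, cup) count together.
            counts[tuple(sorted((obj_a, obj_b)))] += 1
    return counts


# Made-up sensor events, for illustration only.
events = [
    ("kettle", 0),
    ("cup", 20),
    ("fridge", 45),
    ("kettle", 600),
    ("cup", 630),
]

for pair, count in co_use_counts(events).most_common():
    print(pair, count)
```

Counts like these would only be a starting point. The ethnographic work lies in interpreting why certain objects keep appearing together in people's daily routines.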

This experimental approach exemplifies one way to use machine learning algorithms within ethnographies as social processes/interactions in and of themselves. This could be an innovative way to analyze the social role of these IoT objects in daily life within ethnographic studies. If Eslami’s work exemplifies a way to graft ethnographic analysis into the design cycle of machine learning algorithms, Giaccardi’s research illustrates one way to incorporate data science and machine learning analysis into ethnographies.

Conclusion

Here are four examples of innovative projects that integrate data science and ethnography to meet their respective goals. I do not intend these as a complete or exhaustive account of how to integrate these methodologies, but as food for thought to spur further creative thinking about how to connect them.

For those who, when they hear the idea of integrating data science and ethnography, ask the reasonable question, “Interesting but what would that look like practically?”, here are four examples of how it could look. Hopefully, they are helpful in developing your own ideas for how to combine them in whatever project you are working on, even if its details are completely different.

Photo credit #1: StartupStockPhotos at https://pixabay.com/photos/startup-meeting-brainstorming-594090/

Photo credit #2: DarkoStojanovic at https://pixabay.com/photos/medical-appointment-doctor-563427/

Photo credit #3: NASA at https://unsplash.com/photos/Q1p7bh3SHj8

Photo credit #4: Kon Karampelas at https://unsplash.com/photos/HUBofEFQ6CA

Photo credit #5: Pixabay at https://www.pexels.com/photo/app-business-connection-device-221185/

Recently Published Article: “Anthropology by Data Science”

Photo by Ekrulila on Pexels.com

I am pleased to announce that the Annals of Anthropological Practice has accepted my article “Anthropology by Data Science” (https://anthrosource.onlinelibrary.wiley.com/doi/10.1111/napa.12169). In it, I reflect on the relationship anthropologists have cultivated with data science as a discipline and the importance of integrating machine learning techniques into ethnographic practice.

Annals of Anthropological Practice is overseen by the National Association for the Practice of Anthropology (NAPA) within the American Anthropological Association. Thank you, NAPA, for publishing my article and thank you to all the unnamed editors and reviewers in the process.
