Ethno-Data: Introduction to My Blog

            Hello, my name is Stephen Paff. I am a data scientist and an ethnographer. The goal of this blog is to explore the integration of data science and ethnography as an exciting and innovative way to understand people, whether consumers, users, fellow employees, or anyone else.

            I want to think publicly. Ideas worth having develop in conversation, and through this blog, I hope to present my integrative vision so that others can potentially use it to develop their own visions and in turn help shape mine.

Please Note: Because my blog straddles two technical areas, I will split my posts based on how in-depth they go into each technical expertise. Many posts I will write for a general audience. I will write some posts, though, for data scientists discussing technical matters within that field, and other posts will focus on technical topics within ethnography for anthropologists and other ethnographers. At the top of each post, I will provide the following disclaimers:

Data Science Technical Level: None, Moderate, or Advanced
Ethnography Technical Level: None, Moderate, or Advanced

Integrating Ethnography and Data Science

As a data scientist and ethnographer, I have worked on many types of research projects. In professional and business settings, I am excited by the enormous growth in both data science and ethnography but have been frustrated by how, despite recent developments that make them more similar, their respective teams seem to be growing apart and competing against each other.

Within academia, quantitative and qualitative research methods have developed historically as distinct and competing approaches, as if one has to choose which direction to take when doing research: departments or individual researchers specialize in one or the other and fight over scarce research funding. One major justification for this division has been the perception that quantitative approaches tend to be prescriptive and top-down compared with qualitative approaches, which tend to be descriptive and bottom-up. That many professional research contexts have inherited this division is unfortunate.

Recent developments in data science draw parallels with qualitative research and, if anything, could be a starting point for collaborative intermingling. What has developed as “traditional” statistics taught in introductory statistics courses is generally top-down, assuming that data follows a prescribed, ideal model and asking regimented questions based on that ideal model. Within the development of machine learning, there has been a shift towards models uniquely tailored to the data and context in question, developed and refined iteratively.[i] These trends may show signs of breaking down the top-down nature of traditional statistics work.

If there was ever a time to integrate quantitative data science and qualitative ethnographic research, it is now. In the increasingly important “data economy,” understanding users/consumers is vital to developing strategic business practices. In the business world, both socially-oriented data scientists and ethnographers are experts in understanding users/consumers, but separating them into competing groups only prevents true synthesis of their insights. Integrating the two should not just include combining the respective research teams and their projects but also encouraging researchers to develop expertise in both instead of simply specializing in one or the other. New creative energy could burst forth when we no longer treat these as distinct methodologies or specialties.


[i] Nafus, D., & Knox, H. (2018). Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 11-12.

Photo credit #1: Frank V at  https://unsplash.com/photos/IFLgWYlT2fI

Photo credit #2: Arif Wahid at https://unsplash.com/photos/y3FkHW1cyBE

Rethinking Ethnography in Anthropology

This is a follow-up on my previous article about the difference between anthropology and ethnography. In this article, I discuss recent trends within anthropology to revitalize ethnography and/or rethink its status as the primary research methodology within the discipline.

We anthropologists should consider expanding beyond the ethnographic toolkit. That could involve redefining what it means to conduct ethnography in such a way that includes other types of practices outside of the traditional ethnographic toolkit and/or rethinking the role of ethnography as our primary methodology.

For context, ethnography has been the primary tool within the discipline for the last several decades. I would define ethnography as a methodological approach that seeks to holistically understand and express the lived experiences of those in a particular sociocultural context(s) (see this article and this paper). Ethnography conventionally entails a specific set of qualitative methodologies that help to understand and analyze these lived experiences, including participant observation, interviews, qualitative coding, and so on. Anthropologists and other ethnographers have built this set of practices because they are excellent at capturing people’s lived experiences, and I agree that they are powerful for that.

I do not, however, believe that these are the only potential ways to do that. For me, ethnography is an orientation, an approach that seeks to make sense of the social world by focusing on the lived experiences of others, not necessarily some collection of qualitative methods. Seeing ethnography as an orientation, for example, would enable ethnographers to use data science and machine learning tools within ethnographies (see this and this).

My perspective here exemplifies the first way some anthropologists have sought to expand beyond the traditional ethnographic toolkit: by redefining ethnography. For us, viewing ethnography as a specific set of qualitative research techniques pigeonholes what ethnography can be. Although these techniques are powerful and useful, their exclusive deployment within anthropology stifles what ethnography can become.

Other anthropologists will seek to expand beyond this toolkit by advocating for non-ethnographic anthropological research. For them, anthropologists should cultivate other research practices in addition to or sometimes instead of ethnography. I am passionate about applying this specifically to data science and machine learning, and Morten Axel Pedersen is a counterpart to me in this specific area. He thinks anthropologists should move beyond ethnographic research, which could include incorporating data science and machine learning research (see his talk as an example). Similar to me, he wants to see more utilization of data science and machine learning within anthropology, but he presents this as an alternative to doing ethnographic research, not as a potential part of ethnographic research like I do.

The difference between the two approaches is subtle: the first advocates for reimagining ethnography and the second for reimagining anthropology and anthropological research while potentially keeping ethnography the same. On a practical level, though, they are not that different. Not only are they not mutually exclusive (one can seek to redefine both ethnography and ethnography’s hold within anthropology), but they each also have their place when encouraging the expansion of the anthropological toolkit. In some situations, the promotion of redefining ethnography beyond its traditional qualitative practices is most beneficial, and other times, advocating for non-ethnographic forms of research would be.

Photo credit #1: StockSnap at https://pixabay.com/photos/people-girls-women-students-2557396/

Photo credit #2: hosny_salah at https://pixabay.com/photos/woman-hijab-worker-factory-worker-5893942/

Photo credit #3: Jack Douglass at https://unsplash.com/photos/ouZAz-3vh7I

What Is the Difference between Anthropology and Ethnography?

(Feel free to check out my follow-up article to this one about rethinking the role of ethnography in anthropology as well.)

A friend recently asked me, “What’s the difference between anthropology and ethnography?” When I tell them I am an anthropologist, people have asked me this question – phrased in slightly different ways – enough times that I am writing this article to answer it for anyone who might be wondering what the difference is.

To situate his question, he explained how other anthropologists he had worked with would often contrast anthropological work with mere ethnography, but that he never understood the difference. That has generally been the experience of people I have talked to who have asked me this question: they have recently encountered anthropologists contrasting their work with other ethnographers, something which left them puzzled given how connected anthropology and ethnography have been in their experience.

Ethnography is anthropology’s “methodological baby,” and in my experience, the anthropology vs ethnography conversation is typically a way for anthropologists to process others’ increasing utilization of ethnography.  Thus, to those looking in from the outside like my friend, this discussion within anthropology about the differences can seem perplexing.

The Short-Answer

The short answer is that anthropology is a discipline while ethnography is a methodology. Anthropology refers to the study of human cultures and humanity in general. Ethnography is a methodological approach to learning about a culture, setting, group, or other context by observing it yourself and/or piecing together the experiences of those there (this article provides an in-depth definition of ethnography).

The field of anthropology has many subdisciplines, ranging from archaeology to linguistics, but in this article, I will focus my discussion on cultural anthropology (the subdiscipline I am a part of). Of all its subdisciplines, cultural anthropology most directly relates to ethnography.

Cultural anthropologists seek to understand contemporary living cultures and societies. They have been instrumental in developing and employing ethnography to understand cultures and other social phenomena. Ethnography has become the most common (but not only) way cultural anthropologists have sought to conduct research.

Thus, the relationship between cultural anthropology and ethnography is that between a discipline and its primary tool that has defined what it means to practice that discipline, like proofs define the field of mathematics or experimentation for the hard sciences.

This sentence sums it up:

In general, cultural anthropologists use ethnography to understand cultures.

It illustrates cultural anthropology’s who, what, and how as a discipline and how each of these key components relates to others. 

There are exceptions to this. Cultural anthropologists do not only use ethnography nor does the word culture describe everything they analyze, but this describes the general relationship between cultural anthropology and ethnography.

This is the short explanation of the difference between anthropology and ethnography. Like textbook explanations, it is accurate but abstract and simplistic. It does not get to the heart of what an anthropologist might really be getting at when they juxtapose the two. In my experience, when people compare the two, they are reflecting on what they consider anthropological ways of thinking and ethnographic ways of thinking. Hence, here is my long answer, which gets to the bottom of what people are really trying to say.

The Long Answer

There are two angles to consider for the long answer: obstinacy towards others outside anthropology using ethnography and the potential for anthropologists to move beyond traditional ethnography. The former is something we anthropologists must overcome and the latter a set of interesting and innovative prospects for both anthropology and ethnography.

Cultural anthropologists have had a unique relationship with ethnography. The discipline has been instrumental in designing, employing, and promoting the methodology, and with the help of anthropologists, the approach has become a valuable way to understand humans, cultures, and societies. At the same time, ethnography has become increasingly popular in other fields, both academic fields like sociology and political science, and in professional fields like UX research and design, marketing, and organizational management. I think this increasing use of the anthropological tool of ethnography has been marvelous, but multiple disciplines suddenly doing “our thing” has catalyzed identity conflict among some anthropologists.

In my experience, when anthropologists make a sharp distinction between anthropology and ethnography, they are primarily processing this identity conflict. For example, in the ensuing conversation with the person I mentioned in the introduction, I learned that he had recently heard some anthropologists condemn several ethnographies in the field of design where he works as “non-anthropological,” making him wonder what on earth the difference was between being “ethnographic” and “anthropological.” Hence, when I told him I was an anthropologist, he figured he would ask me.

Even if it is at best a historical oversimplification, here is a common narrative I will hear within anthropology: several decades ago, ethnography was the primary domain of anthropologists, but now it seems to be taking on a life of its own, with many others from other fields using it. Others deploying ethnography can have fantastic or horrifying results – and everything in between, but often the implicit and/or explicit assumption in the narrative is that people from other disciplines would generally fail to be able to do as good of a job as a trained anthropologist.

Discussions within anthropology of the similarities and differences between anthropology and ethnography – or between so-called anthropological ways of thinking vs ethnographic ways of thinking, anthropological approaches vs ethnographic approaches, or anthropologists vs ethnographers – have become a major staging ground for processing this seeming recent increase in the popularity of ethnography outside of anthropology.

A few notable perspectives have emerged from these discussions. Some cultural anthropologists promote other methodologies within the discipline either in addition to or instead of ethnographic inquiries (e.g. Arturo Escobar). Others emphasize what anthropologists specifically bring to ethnographic research that others who conduct ethnographic research supposedly cannot (e.g. Tim Ingold). Among the anthropologists I have talked to, in both academic and professional settings, I have found the latter to be the most common response: arguing that training in anthropology brings a superior way of thinking about society, cultures, and various social phenomena, which allows trained anthropologists to conduct ethnography better.

Exploring how ethnography might be changing as a wider variety of people use it and anthropologists reflecting on how their discipline has shaped ethnography and ethnography shaped their discipline are commendable. But, this particular way of trying to do both seems like a defensive, “us vs them” response.

In addition to the fact that humans seem to very frequently tell themselves “us vs them” narratives, material resources are also at play here. By portraying anthropologists as the only people able to perform “authentic” or “quality” ethnographies, anthropologists can demand competitive resources from potential funders, clients, colleagues, organizations and/or students. This could range from funding for their academic department to being the ones who win the job or contract to conduct qualitative user research at a company.

Whatever factors reinforce this type of defensive response, I believe we anthropologists should instead celebrate the increasing flowering of ethnography and embrace how others might reformulate the methodology to meet their needs. It is an opportunity to crosspollinate and enliven what it means to do ethnography.

A final response by cultural anthropologists has been to rethink traditional ethnography and/or anthropological research itself. For example, Morten Axel Pedersen has argued for a reimagining of what ethnography is in a way that could incorporate data science and machine learning techniques into the ethnographic toolkit and anthropological research (something I have argued for here, here, and here as well). I believe this reassessment of traditional ethnography has a lot of potential for innovative, outside-the-box anthropological research.

Unfortunately, the former chest-thumping explanations of why non-anthropological ethnographies are inferior to our work have been more common than (what I, at least, would consider) this more fruitful conversation. Their bombastic thunder can drown out the other perspectives.

Conclusion

I can certainly see how non-anthropologists seeking to understand (and maybe employ) ethnography could become confused when they encounter these debates among anthropologists.

To anyone who has been so confused, I hope this article provides – what I see as at least – the wider context for why anthropologists often juxtapose their discipline with ethnography. As anthropologists process how ethnography is increasingly flowering outside of their discipline, I also hope the negative aspects of our response will not turn you away from what is a powerful methodology to understand people, cultures, and societies.

Photo credit #1: Raquel Martínez at https://unsplash.com/photos/SQM0sS0htzw

Photo credit #2: Skitterphoto at https://www.pexels.com/photo/book-page-1005324/

Photo credit #3: klimkin at https://pixabay.com/photos/hand-gift-bouquet-congratulation-1549399/

Photo credit #4: PublicDomainPictures at https://pixabay.com/photos/garden-flowers-butterfly-monarch-17057/

Why Business Anthropologists Should Reconsider Machine Learning

Photo by Alex Knight on Pexels.com

This article is a follow-up to my previous article – “Integrating Ethnography and Data Science” – written specifically for anthropologists and other ethnographers.

As an anthropologist and data scientist, I often feel caught in the middle of two distinct warring factions. Anthropologists and data scientists inherited a historic debate between quantitative and qualitative methodologies in social research within modern Western societies. At its core, this debate has centered on the difference between objective, prescriptive, top-down techniques and subjective, situational, flexible, descriptive, bottom-up approaches.[i] In this ensuing conflict, quantitative research has been demarcated into the top-down faction and qualitative research into the bottom-up faction, to the detriment of understanding both properly.

In my experience on both “sides,” I have seen a tendency among anthropologists to lump all quantitative social research together as prescriptive and top-down and thus miss the important subtleties within data science and other quantitative techniques. Machine learning techniques within the field are a partial shift towards bottom-up, situational, and iterative quantitative analysis, and business anthropologists should explore what data scientists do as a chance to redevelop their relationship with quantitative analysis.

Shifts in Machine Learning

Shifts within machine learning algorithm development give impetus for incorporating quantitative techniques that are local and interpretive. The debate between top-down vs. bottom-up knowledge production does not need – or at least may no longer need – to divide quantitative and qualitative techniques. Machine learning algorithms “leave open the possibility of situated knowledge production, entangled with narrative,” a clear parallel to qualitative ethnographic techniques.[ii]

At the same time, this shift towards iterative and flexible machine learning techniques is not total within data science: aspects of top-down frameworks remain, in terms of personnel, objectives, habits, strategies, and evaluation criteria. But, seeds of bottom-up thinking definitely exist prominently within data science, with the potential to significantly reshape data science and possibly quantitative analysis in general.

As a discipline, data science is in a uniquely formative and adolescent period, developing into its “standard” practices. This leads to significant fluctuations as the data science community defines its methodology. The set of standard practices that we now typically call “traditional” or “standard” statistics, generally taught in introductory statistics courses, developed over a several-decade period in the late nineteenth and early twentieth centuries, especially in Britain.[iii] Connected with recent computer technology, data science is in a similarly formative period right now – developing its standard techniques and ways of thinking. This formative period is a strategic time for anthropologists to encourage bottom-up quantitative techniques.

Conclusion

Business anthropologists could and should be instrumental in helping to develop and innovatively utilize these situational and iterative machine learning techniques. This is a strategic time for business anthropologists to do the following:

  1. Immerse themselves in data science and encourage and cultivate bottom-up quantitative machine learning techniques within data science
  2. Cultivate and incorporate (when applicable) situational and iterative machine learning approaches in their ethnographies

For both, anthropologists should use the strengths of ethnographic and anthropological thinking to help develop bottom-up machine learning that is grounded in and flexible to specific local contexts. Each requires business anthropologists to reexplore their relationship with data science and machine learning instead of treating it as part of an opposing “methodological clan.” [iv]


[i] Nafus, D., & Knox, H. (2018). Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 11-12.

[ii] Ibid, 15-17.

[iii] Mackenzie, D. (1981). Statistics in Britain 1865–1930: The Social Construction of Scientific Knowledge. Edinburgh: Edinburgh University Press.

[iv] Seaver, N. (2015). Bastard Algebra. In T. Boellstorff, & B. Maurer, Data, Now Bigger and Better (pp. 27-46). Chicago: Prickly Paradigm Press, 39.

Writing Ethnographic Findings as Software Specs

When working as an ethnographer with software engineers, I have found formatting my write-up for any ethnographic inquiry I conducted as software specs incredibly valuable. In general, I prefer to incarnate any ethnographic report I make into the cultural context I am conducting the research for, and this is one example of how to do that.

Many find the academic essay prose style stifling and unintelligible, so why limit yourself to that format, as most ethnographic write-ups do, when conducting work for and with other parties? As Schaun Wheeler said in this interview, in the professional world, PDF reports are often where thoughts go to die.

Most often when I am conducting ethnographic research with software engineers, I am doing some kind of user research on a potential or actual software product: trying to understand how users engage with a piece of software or a set of software products to help engineers improve the design to better meet users’ needs. When doing this, I most often bullet my findings by topic and suggested change, ordering them based on importance and complexity. This allows software engineers to easily transfer the insights into actionable ideas for how to improve the software design.

For example, a software company asked me to conduct ethnography to understand how users engaged with a beta version of an app. For this project, I broke down ethnographic insights into advantages of the app and common pitfalls encountered. I illustrated each item on the list with stories and quotes from users. I ordered the points based on importance and difficulty to address (that is, as either important and easy to fix, not important but easy to fix, important but not easy to fix, or not important and not easy to fix). On each list, I focused on the item itself, but sometimes I might also mention potential solutions, particularly when users proposed specific ideas for how to resolve something they encountered. Only occasionally did I give my own suggestions. This allowed software engineers to think through the ethnographic findings and translate them into software specs. They liked the report formatting so much that the CTO of the company came to me personally to tell me I had the most profound and useful documentation he had seen.
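The four-way ordering described above can be sketched in a few lines of Python. This is only an illustration: the `Finding` class, its field names, and the example titles in the usage note are hypothetical, and the quadrant order simply mirrors the order listed in the paragraph.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One ethnographic finding (hypothetical structure, for illustration)."""
    title: str
    important: bool    # does it matter a lot to users?
    easy_to_fix: bool  # rough effort estimate, judged together with engineers


# Quadrant order taken from the paragraph above:
# important + easy, not important + easy, important + hard, not important + hard.
QUADRANT_RANK = {
    (True, True): 0,
    (False, True): 1,
    (True, False): 2,
    (False, False): 3,
}


def sort_findings(findings):
    """Return findings ordered by quadrant.

    sorted() is stable, so findings in the same quadrant keep the order
    in which the ethnographer listed them.
    """
    return sorted(findings, key=lambda f: QUADRANT_RANK[(f.important, f.easy_to_fix)])
```

For instance, `sort_findings([Finding("Confusing onboarding flow", True, False), Finding("Mislabeled button", True, True)])` would put the mislabeled button first, since it is both important and easy to fix.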

I have found that describing ethnographic findings as design specs has been incredibly helpful in the tech world. It allows the immersion of ethnographic insights into engineering contexts and facilitates the development of actionable insights and designs. Instead of defaulting to a long essay or manuscript, ethnographers should think carefully about the best way to format their findings to make sure they are approachable, relatable, and useful for the audience(s) that will look at and use them.

Breaking into Tech: A Career Workshop

Photo by ThisIsEngineering on Pexels.com

Earlier this week, Matt Artz, Astrid Countee, and I ran a workshop at the American Anthropological Association’s 2020 annual conference entitled “Breaking into Tech.” We discussed strategies for anthropologists interested in working in the tech world.

Here is the presentation for anyone who might find it useful but could not attend:

Thank you, Astrid and Matt, for your help in developing and running this workshop.  

Resources on Integrating Data Science and Ethnography

Here is a list of resources about integrating data science and ethnography. Even though it is an up-and-coming field without a consistent list of publications, several fascinating and insightful resources do exist.

If there are any resources about integrating data science and ethnography that you have found useful, feel free to share them as well.

General Overviews:

  • Curran, John. “Big Data or ‘Big Ethnographic Data’? Positioning Big Data within the Ethnographic Space.” EPIC (2013). (Found here: https://www.epicpeople.org/big-data-or-big-ethnographic-data-positioning-big-data-within-the-ethnographic-space/)
  • Patel, Neal. “For a Ruthless Criticism of Everything Existing: Rebellion Against the Quantitative-Qualitative Divide.” EPIC (2013): 43-60.
  • Seaver, Nick. “Bastard Algebra.” Boellstorff, Tom and Bill Maurer. Data, Now Bigger and Better! Chicago: Prickly Paradigm Press, 2015. 27-46.
  • Slobin, Adrian and Todd Cherkasky. “Ethnography in the Age of Analytics.” EPIC (2010).
  • Nafus, Dawn and Tye Rattenbury. Data Science and Ethnography: What’s Our Common Ground, and Why Does It Matter? 7 3 2018. <https://www.epicpeople.org/data-science-and-ethnography/>.
  • Seaver, Nick. “The nice thing about context is that everyone has it.” Media, Culture & Society (2015).

Books:

  • Nafus, Dawn and Hannah Knox. Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 2018.
  • Boellstorff, Tom and Bill Maurer. Data, Now Bigger and Better! Chicago: Prickly Paradigm Press, 2015.
  • Mackenzie, Adrian. Machine Learners: Archaeology of a Data Practice. Cambridge: The MIT Press, 2017.

Examples and Case Studies:

  • “Autonomous Drive: Teaching Cars Human Behaviour” by Melissa Cefkin on the Youtube Channel DrivingTheNation: https://www.youtube.com/watch?v=6koKuDegHAM
  • Eslami, Motahhare, et al. “First I “like” it, then I hide it: Folk Theories of Social Feeds.” Curation and Algorithms (2016).
  • Giaccardi, Elisa, Chris Speed and Neil Rubens. “Things Making Things: An Ethnography of the Impossible.” (2014).
  • Elish, M. “The Stakes of Uncertainty: Developing and Integrating Machine Learning in Clinical Care.” EPIC (2018).
  • Madsen, Mette My, Anders Blok and Morten Axel Pedersen. “Transversal collaboration: an ethnography in/of computational social science.” Nafus, Dawn and Hannah Knox. Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 2018.
  • Thomas, Suzanne, Dawn Nafus and Jamie Sherman. “Algorithms as fetish: Faith and possibility in algorithmic work.” Big Data & Society (2018): 1-11.

Articles and Blog Posts:

My Own Articles on This Website:

Podcasts and Lectures:

Ethical Considerations:

UX Research and Business Anthropology Are Central within Applied Anthropology

photo of woman wearing turtleneck top
Photo by Ali Pazani on Pexels.com

This is a research paper I wrote for a master’s course on Applied Anthropology at the University of Memphis. The overall master’s program sought to train students in applied anthropology, and the goal of this course was to teach the foundations of what applied anthropology is, in contrast to other types of anthropology.

Even though I found the course interesting, its curriculum lacked the readings and perspectives of applied anthropologists in the business world. As I discuss in the paper, statistically speaking, a significant number of applied anthropologists (and of the University of Memphis’s applied anthropology program alumni) work in the business sector, so excluding them leaves out what might be the largest group of applied anthropologists from their own field. I wrote this essay as a subtle nudge to encourage the course designers to add the works of business anthropologists, particularly UX researchers, into their curriculum.

Due to the lack of resources by applied business anthropologists in the curriculum, I had to assemble my own resources entirely by myself. Other applied anthropologists have told me they have encountered this as well. So, hopefully, in addition to the essay potentially providing helpful analysis of applied business anthropology, its bibliography might also provide a starting collection of business anthropology resources for you to explore.

Using Data Science and Ethnography to Build a Show Rate Predictor

I recently integrated ethnography and data science to develop a Show Rate Predictor for an (anonymous) hospital system. Many readers have asked for real-world examples of this integration, and this project demonstrates how ethnography and data science can join to build machine learning-based software that makes sense to users and meets their needs.

Part 1: Scoping out the Project

A particular clinic in the hospital system was experiencing a large number of appointment no-shows, which produced wasted time, frustration, and confusion for both its patients and employees. I was asked to use data science and machine learning to better understand and improve their scheduling.

I started the project by conducting ethnographic research into the clinic to learn more about how scheduling occurs normally, what effect it was having on the clinic, and what driving problems employees saw. In particular, I observed and interviewed scheduling assistants to understand their day-to-day work and their perspectives on no-shows.

One major lesson I learned through all this was that when scheduling an appointment, schedulers are constantly trying to determine how many people to schedule on a given doctor’s shift to ensure the right number of people show up. For example, say 12-14 patients is a good number of patients for Dr. Rodriguez’s (made-up name) Wednesday morning shift. When deciding whether to schedule an appointment for a given patient with Dr. Rodriguez on an upcoming Wednesday, the scheduling assistants try to determine, given the appointments currently scheduled then, whether they can expect 12-14 patients to show up. This was often an inexact science. They would often have to schedule 20-25 patients on a particular doctor’s shift to ensure their ideal window of 12-14 patients would actually come that day. This could create the potential for chaos, however, with too many patients arriving on some days and too few on others.
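To make the schedulers' mental arithmetic concrete, here is a rough Python sketch of the overbooking calculation. The flat show rate of 0.58 is an assumption chosen only because it roughly reproduces the 20-25 range above; the clinic's actual rate is not stated, and the function name is hypothetical.

```python
import math


def appointments_to_schedule(target_arrivals: int, show_rate: float) -> int:
    """Smallest number of bookings whose expected arrivals reach the target.

    Assumes every appointment has the same chance of occurring, which is
    exactly the simplification that makes this an inexact science.
    """
    if not 0 < show_rate <= 1:
        raise ValueError("show_rate must be in (0, 1]")
    return math.ceil(target_arrivals / show_rate)


ASSUMED_SHOW_RATE = 0.58  # hypothetical value, not taken from the clinic's data

low = appointments_to_schedule(12, ASSUMED_SHOW_RATE)   # bookings needed to expect 12
high = appointments_to_schedule(14, ASSUMED_SHOW_RATE)  # bookings needed to expect 14
```

A single flat rate hides the variation between individual appointments, which is one way to see why some days ended up with too many patients and others with too few.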

This question – how many appointments can we expect or predict to occur on a given doctor’s shift – became my driving question to answer with machine learning. After checking in with the various stakeholders at the clinic to make sure this was in fact an important and useful question to answer with machine learning, I started building.

Part 2: Building the Model

Now that I had a driving, answerable question, I decided to break it down into two sequential machine learning models:

  1. The first model learned to predict the probability that a given appointment would occur, learning from the history of occurring or no-show appointments.
  2. The second model, using the appointment probabilities from the first model, estimated how many appointments might occur for every doctor’s shift.

The first model combined three streams of data to assess the no-show probability: appointment data (such as how long ago it was scheduled, type of appointment, etc.); patient information, especially past appointment history; and doctor information. I performed extensive feature selection to determine the best subset of variables to use and tested several types of machine learning models before settling on gradient boosting.
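As a hedged illustration of how three data streams like these might be joined into one row of model input, here is a minimal Python sketch. Every name in it (`Appointment`, `build_features`, `lead_time_days`, `prior_show_rate`, `doctor_specialty`, and so on) is hypothetical, chosen for illustration rather than taken from the hospital system’s data, and the gradient boosting step itself is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Appointment:
    """One scheduled appointment (all field names are hypothetical)."""
    patient_id: str
    doctor_id: str
    scheduled_on: date       # the day the booking was made
    appointment_date: date   # the day the visit is supposed to happen
    appointment_type: str


def build_features(appt, patient_history, doctor_info):
    """Join appointment, patient, and doctor data into one feature row.

    patient_history maps patient_id -> list of booleans (True = showed up).
    doctor_info maps doctor_id -> dict of doctor attributes.
    """
    past = patient_history.get(appt.patient_id, [])
    doctor = doctor_info.get(appt.doctor_id, {})
    return {
        # Appointment stream
        "lead_time_days": (appt.appointment_date - appt.scheduled_on).days,
        "appointment_type": appt.appointment_type,
        # Patient stream: past appointment history
        "prior_appointments": len(past),
        "prior_show_rate": sum(past) / len(past) if past else None,
        # Doctor stream (the attribute used here is an assumption)
        "doctor_specialty": doctor.get("specialty"),
    }


example = Appointment("p1", "d1", date(2024, 5, 1), date(2024, 5, 15), "follow-up")
row = build_features(
    example,
    patient_history={"p1": [True, True, False, True]},
    doctor_info={"d1": {"specialty": "cardiology"}},
)
```

Rows like `row` would then be the input to whichever classifier is chosen. A new patient with no history gets `None` for `prior_show_rate` here, which is one of several reasonable ways to flag missing history.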

The second model used the probabilities from the first model as input data to predict how many patients to expect to come on each doctor’s shift. I settled on a neural network for the model.
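To give a feel for what this second step is doing, here is a minimal Python sketch of a much simpler baseline than a neural network. It is not the production model. It assumes, purely for illustration, that appointments show up independently of one another, in which case the expected number of patients is the sum of the per-appointment probabilities, and the full distribution of the count can be built up one appointment at a time. The function names are hypothetical.

```python
def expected_shows(probs):
    """Expected number of appointments that occur on a shift."""
    return sum(probs)


def show_count_distribution(probs):
    """Return dist where dist[k] is the probability that exactly k
    appointments occur, assuming appointments are independent."""
    dist = [1.0]  # with zero appointments, zero shows is certain
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this appointment is a no-show
            nxt[k + 1] += mass * p     # this appointment occurs
        dist = nxt
    return dist


def prob_in_window(probs, low, high):
    """Probability that the number of shows lands in [low, high]."""
    dist = show_count_distribution(probs)
    return sum(dist[low:high + 1])


# Hypothetical shift: 22 booked appointments, each with a 0.6 show probability.
shift_probs = [0.6] * 22
expected = expected_shows(shift_probs)
chance_ideal = prob_in_window(shift_probs, 12, 14)
```

A baseline like this ignores anything that makes appointments on the same shift rise and fall together (weather, holidays, a doctor’s reputation), which is one reason a learned second model can do better than simply adding up probabilities.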

Part 3: Building an App

Next, I worked with the software engineers on my team to develop an app to employ these models in real time and communicate the information to schedulers as they scheduled appointments. My ethnographic research was invaluable for developing how to construct the app.

On the back end, the app calculated the probability that each future appointment would occur, recalculating whenever an appointment was newly scheduled or edited. Once a week, it would add that week’s new appointment data and shift attendance to each model’s training data and update the models accordingly.

Through my ethnographic research, I observed how schedulers approached scheduling appointments, including what software they used in the process and how they used each. I used that to determine the best ways to communicate that information, periodically showing my ideas to the schedulers to make sure my strategy would be helpful.

I constructed an interface to communicate the information that would complement the software they already used. In addition to displaying the number of patients expected to arrive, the calendar interface color-coded each shift: green if the machine learning algorithm predicted the shift was underbooked, yellow if the shift was projected to have the ideal number of patients, and red if it was already expected to have too many patients. The color-coding allowed easy visualization of the information in the moment: when trying to find an appointment time for a patient, schedulers could easily look for the green shifts, or yellow if they had to, but steer clear of the red. When zooming in on a specific shift, each appointment would be color-coded as well (likely, unlikely, or in the middle) based on the probability that it would occur.
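The color-coding rule can be sketched in a few lines of Python. This is an illustration of the logic rather than the app’s actual code: the function names are hypothetical, the 12-14 window is the made-up Dr. Rodriguez example from earlier, and the probability cutoffs for individual appointments (0.4 and 0.7) are assumptions picked only to make the example concrete.

```python
def shift_color(expected_patients, ideal_low, ideal_high):
    """Color for a shift on the calendar, given its predicted attendance."""
    if expected_patients < ideal_low:
        return "green"    # underbooked: room for more appointments
    if expected_patients <= ideal_high:
        return "yellow"   # within the ideal window
    return "red"          # already expected to have too many patients


def appointment_label(show_probability, unlikely_below=0.4, likely_from=0.7):
    """Label for one appointment; both cutoffs are illustrative assumptions."""
    if show_probability >= likely_from:
        return "likely"
    if show_probability < unlikely_below:
        return "unlikely"
    return "in the middle"


# Three hypothetical Wednesday shifts with an ideal window of 12-14 patients.
colors = [shift_color(x, 12, 14) for x in (9.5, 13.0, 16.2)]
```

In practice the ideal window would differ by doctor and shift, so it is passed in as a parameter rather than hard-coded.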

Conclusion

This is one example of a project that integrates data science and ethnography to build a machine learning app. I used ethnography to construct the app’s parameters and framework. It tethered the app to the needs of the schedulers, ensuring that the machine learning modeling I developed was useful to those who would use it. Frequent check-ins before each step in the app’s development also helped confirm that my proposed concept would in fact help meet their needs.

My data science and machine learning expertise helped guide me in the ethnographic process as well. Being an expert in how machine learning worked and what sorts of questions it could answer allowed me to easily synthesize the insights from my ethnographic inquiries into buildable machine learning models. I understood what machine learning was capable (and not capable) of doing, and I could intuitively develop strategic ways to employ machine learning to address issues they were having.

Hence, my dual role as an ethnographer and data scientist benefitted the project greatly. My listening skills from ethnography enabled me to uncover the underlying questions and issues schedulers faced, and my data science expertise gave me the technical skills to develop a viable machine learning solution. Without listening patiently through extensive ethnography, I would not have understood the problem sufficiently, but without my data science expertise, I would have been unable to decipher which question(s) or issue(s) machine learning could realistically address and how.

This exemplifies why joint expertise in data science and ethnography is invaluable in developing machine learning software. Two different individuals or teams could complete each part separately – ethnographers analyzing the users’ needs and data scientists then determining whether machine learning modeling could help. But this seems unnecessarily disjointed, potentially producing misunderstanding, confusion, and chaos. Adding an additional layer of people can easily lead to either the ethnographer(s) uncovering needs far too broad or complex for a machine learning-based solution to help or the data scientist(s) trying to impose their machine learning “solution” on a problem the users do not have.

Developing expertise in both makes it much easier to simultaneously understand the problems or questions in a particular context and build a doable data science solution.

Photo credit #1: DarkoStojanovic at https://pixabay.com/photos/medical-appointment-doctor-563427/  

Photo credit #2: geralt at https://pixabay.com/illustrations/time-doctor-doctor-s-appointment-481445/

Photo credit #3: Pixabay at https://www.pexels.com/photo/light-road-red-yellow-46287/