Using Data Science and Ethnography to Build a Show Rate Predictor

I recently integrated ethnography and data science to develop a Show Rate Predictor for an (anonymous) hospital system. Many readers have asked for real-world examples of this integration, and this project demonstrates how ethnography and data science can join to build machine learning-based software that makes sense to users and meets their needs.

Part 1: Scoping out the Project

A particular clinic in the hospital system was experiencing a large number of appointment no-shows, which produced wasted time, frustration, and confusion for both its patients and employees. I was asked to use data science and machine learning to better understand and improve their scheduling.

I started the project by conducting ethnographic research into the clinic to learn more about how scheduling occurs normally, what effect it was having on the clinic, and what driving problems employees saw. In particular, I observed and interviewed scheduling assistants to understand their day-to-day work and their perspectives on no-shows.

One major lesson I learned through all this was that when scheduling an appointment, schedulers are constantly trying to determine how many people to schedule on a given doctor’s shift to ensure the right number of people show up. For example, say 12-14 patients is a good number for Dr. Rodriguez’s (made-up name) Wednesday morning shift. When deciding whether to schedule an appointment for a given patient with Dr. Rodriguez on an upcoming Wednesday, the scheduling assistants try to determine, given the appointments currently scheduled then, whether they can expect 12-14 patients to show up. This was often an inexact science. They would often have to schedule 20-25 patients on a particular doctor’s shift to ensure their ideal window of 12-14 patients would actually come that day. This could create the potential for chaos, however, with too many patients arriving on some days and too few on others.

This question – how many appointments can we expect or predict to occur on a given doctor’s shift – became my driving question to answer with machine learning. After checking in with the various stakeholders at the clinic to make sure this was in fact an important and useful question to answer with machine learning, I started building.

Part 2: Building the Model

Now that I had a driving, answerable question, I decided to break it down into two sequential machine learning models:

  1. The first model learned to predict the probability that a given appointment would occur, learning from the history of occurring or no-show appointments.
  2. The second model, using the appointment probabilities from the first model, estimated how many appointments might occur on each doctor’s shift.

The first model combined three streams of data to assess the no-show probability: appointment data (such as how long ago it was scheduled, type of appointment, etc.); patient information, especially past appointment history; and doctor information. I performed extensive feature selection to determine the best subset of variables to use and tested several types of machine learning models before settling on gradient boosting.

The second model used the probabilities from the first model as input data to predict how many patients to expect on each doctor’s shift. I settled on a neural network for this model.
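To make the hand-off between the two models concrete, here is a minimal sketch in Python. It is not the neural network described above. As a simpler stand-in, it assumes each appointment's outcome is independent of the others and combines the per-appointment probabilities directly (a Poisson binomial distribution). The function names and the example probabilities are hypothetical, chosen only for illustration.

```python
from typing import List


def expected_shows(probs: List[float]) -> float:
    """Expected number of patients who show up: the sum of the probabilities."""
    return sum(probs)


def show_count_distribution(probs: List[float]) -> List[float]:
    """Return dist where dist[k] is the probability that exactly k patients show.

    Assumes appointments are independent (an illustrative simplification,
    not something the original project states).
    """
    dist = [1.0]  # with no appointments, zero patients show with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this patient is a no-show
            nxt[k + 1] += mass * p      # this patient shows up
        dist = nxt
    return dist


def prob_in_window(probs: List[float], low: int, high: int) -> float:
    """Probability that the number of shows lands in [low, high] inclusive."""
    dist = show_count_distribution(probs)
    return sum(dist[low:high + 1])


if __name__ == "__main__":
    # Hypothetical shift: 22 booked appointments, each with a 0.6 show probability.
    shift_probs = [0.6] * 22
    print(round(expected_shows(shift_probs), 1))
    print(round(prob_in_window(shift_probs, 12, 14), 2))
```

A learned second model, such as the neural network used in the project, can pick up patterns this independence assumption misses, such as whole shifts with unusually low turnout. The sketch only shows what kind of input the second stage receives and what kind of answer it returns.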

Part 3: Building an App

Next, I worked with the software engineers on my team to develop an app to employ these models in real time and communicate the information to schedulers as they scheduled appointments. My ethnographic research was invaluable for developing how to construct the app.

On the back end, the app calculated the probability that all future appointments would occur, updating with new calculations for newly scheduled or edited appointments. Once a week, it would add that week’s new appointment data and shift attendance to each model’s training data and update the models accordingly.

Through my ethnographic research, I observed how schedulers approached scheduling appointments, including what software they used in the process and how they used each. I used that to determine the best ways to communicate that information, periodically showing my ideas to the schedulers to make sure my strategy would be helpful.

I constructed an interface to communicate the information that would complement the current software they used. In addition to displaying the number of patients expected to arrive, if the machine learning algorithm was predicting that a particular shift was underbooked, it would mark the shift in green on the calendar interface, yellow if the shift was projected to have the ideal number of patients, and red if it was already expected to have too many patients. The color-coding allowed easy visualization of the information in the moment: when trying to find an appointment time for a patient, schedulers could easily look for the green shifts, or yellow if they had to, but steer clear of the red. When zooming in on a specific shift, each appointment would be color-coded as well (likely, unlikely, and in the middle) based on the probability that it would occur.
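The color-coding rule can be sketched in a few lines of Python. This is an illustrative sketch, not the app's actual code: the default 12-14 window comes from the made-up Dr. Rodriguez example earlier, and the probability cutoffs for individual appointments (0.4 and 0.7) are hypothetical values chosen only to show the idea.

```python
def shift_color(expected_patients: float,
                ideal_low: int = 12,
                ideal_high: int = 14) -> str:
    """Map a shift's predicted attendance to a calendar color."""
    if expected_patients < ideal_low:
        return "green"   # underbooked: room for more appointments
    if expected_patients <= ideal_high:
        return "yellow"  # within the ideal window
    return "red"         # already expected to have too many patients


def appointment_label(show_probability: float,
                      unlikely_below: float = 0.4,
                      likely_above: float = 0.7) -> str:
    """Bucket one appointment by its predicted probability of occurring.

    The two cutoffs are assumptions for illustration only.
    """
    if show_probability < unlikely_below:
        return "unlikely"
    if show_probability > likely_above:
        return "likely"
    return "in the middle"
```

Keeping the window and cutoffs as parameters would let each doctor's shift use its own ideal range rather than one clinic-wide number.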

Conclusion

This is one example of a project that integrates data science and ethnography to build a machine learning app. I used ethnography to construct the app’s parameters and framework. It tethered the app to the needs of the schedulers, ensuring that the machine learning modeling I developed was useful to those who would use it. Frequent check-ins before each step in its development also helped confirm that my proposed concept would in fact help meet their needs.

My data science and machine learning expertise helped guide me in the ethnographic process as well. Being an expert in how machine learning worked and what sorts of questions it could answer allowed me to easily synthesize the insights from my ethnographic inquiries into buildable machine learning models. I understood what machine learning was capable (and not capable) of doing, and I could intuitively develop strategic ways to employ machine learning to address issues they were having.

Hence, my dual role as an ethnographer and data scientist benefitted the project greatly. My listening skills from ethnography enabled me to uncover the underlying questions/issues schedulers faced, and my data science expertise gave me the technical skills to develop a viable machine learning solution. Without listening patiently through extensive ethnography, I would not have understood the problem sufficiently, but without my data science expertise, I would have been unable to decipher which question(s) or issue(s) machine learning could realistically address and how.

This exemplifies why joint expertise in data science and ethnography is invaluable in developing machine learning software. Two different individuals or teams could complete each separately: an ethnographer(s) analyzes the users’ needs, and a data scientist(s) then determines whether machine learning modeling could help. But this seems unnecessarily disjointed, potentially producing misunderstanding, confusion, and chaos. Adding an additional layer of people can easily lead to either the ethnographer(s) uncovering needs way too broad or complex for a machine learning-based solution to help, or the data scientist(s) trying to impose their machine learning “solution” on a problem the users do not have.

Developing expertise in both makes it much easier to simultaneously understand the problems or questions in a particular context and build a doable data science solution.

Photo credit #1: DarkoStojanovic at https://pixabay.com/photos/medical-appointment-doctor-563427/  

Photo credit #2: geralt at https://pixabay.com/illustrations/time-doctor-doctor-s-appointment-481445/

Photo credit #3: Pixabay at https://www.pexels.com/photo/light-road-red-yellow-46287/  

You Know You’re a Business Anthropologist If… (Funny)

You know you’re a business anthropologist if…

  1. You ask at least 500 follow-up questions when your supervisor gives you a project to really understand the full context. 
  2. You have a prepared spiel about how what you studied was different than digging up Mayan artifacts (unless that happened to be what you did).
  3. You constantly ask people how they feel when completing a task or what they think of the process.
  4. You try to reimagine and redesign any object or process that your organization will let you get your hands on.  
  5. You have critiqued every organization that has hired you.
  6. You have the strangest knick-knacks on your desk from around the world.
  7. You take triple the notes anyone else does in a meeting, recording in detail everyone’s statements and body postures.
  8. In regular conversation, you interrogate your colleagues like you’re leading an interview.
  9. You frequent your company’s “watercooler spots” – informal places to gather to hang out. This is where the real work happens.
  10. You rage against top-down procedures and formal hierarchy every time you encounter them.
  11. You have resolved to never use PowerPoint for your presentations.
  12. Any time you hear a French word, your mind immediately goes to the French theorist with the most similar sounding name.

I intend this as a fun little exercise thinking about the quirks and idiosyncrasies of working as an anthropologist in the business world. 

Photo Credit: Toa Heftiba at https://unsplash.com/photos/FV3GConVSss

Three Situations When Ethnography Is Useful in a Professional Setting

This is a follow-up to my previous article, “What Is Ethnography,” outlining ways ethnography is useful in professional settings.

To recap, I defined ethnography as a research approach that seeks “to understand the lived experiences of a particular culture, setting, group, or other context by some combination of being with those in that context (also called participant-observation), interviewing or talking with them, and analyzing what is produced in that context.”

Ethnography is a powerful tool, developed by anthropologists and other social scientists over the course of several decades. Here are three types of situations in professional settings in which I have found ethnography to be especially powerful:

1. To see the given product and/or people in action
2. When brainstorming about a design
3. To understand how people navigate complex, patchwork processes

Situation #1: To See the Given Product and/or People in Action

Ethnography allows you to witness people in action: using your product or service, engaging in the type of activity you are interested in, or in whatever other situation you want to study.

Many other social science research methods involve creating an artificial environment in which to observe how participants act or think. Focus groups, for example, involve assembling potential customers or users into a room, forming a synthetic space to discuss the product or service in question, and in many experimental settings, researchers create a simulated environment to control for and analyze the variables or factors they are interested in.

Ethnography, on the other hand, centers around observing and understanding how people navigate real-world settings. Through it, you can get a sense for how people conduct the activity for which you are designing a product or service and/or how people actually use your product or service.

For example, if you want to understand how people use GPS apps to get around, you can see how people use the app “in the wild”: when rushing through heavy traffic to get to a meeting or while lost in the middle of who knows where. Instead of hearing their processed thoughts in a focus group setting or trying to simulate the environment, you can witness the tumult yourself and develop a sense for how to build a product that helps people in those exact situations.

Situation #2: When Brainstorming about a New Product Design

Ethnography is especially useful during the early stages of designing a product or service, or during a major redesign. Ethnography helps you scope out the needs of your potential customers and how they approach meeting said needs. Thus, it helps you determine how to build a product or service that addresses those needs in a way that would make sense for your users.

During such initial stages of product design, ethnography helps determine the questions you should be asking. Many have a tendency during these initial stages to construct designs based on their own perception of people’s needs and desires and miss what the customers or users do in fact need and desire. Through ethnography, you ground your strategy in the customers’ mindsets and experiences themselves.

The brainstorming stages of product development also require a lot of flexibility and adaptability: As one determines what the product or service should become, one must be open to multiple potential avenues. Ethnography is a powerful tool for navigating such ambiguity. It centers you on the users, their experiences and mindsets, and the context in which they might use the product or service, providing tools to ask open-ended questions and to generate new and helpful ideas for what to build.

Situation #3: To Understand How People Navigate Complex, Patchwork Processes

At a past company, I analyzed how customer service representatives regularly used the various software systems when talking with customers. Over the years, the company had designed and bought various software programs, each to perform a set of functions and with unique abilities, limitations, and quirks. Over time, this created a complex web of interlocking apps, databases, and interfaces, which customer service representatives had to navigate when performing their job of monitoring customers’ accounts. Other employees described the whole scene as the “Wild West”: each customer service representative had to create their own way to use these software systems while on the phone with a (in many cases disgruntled) customer.

Many companies end up building such patchwork systems – whether of software, of departments or teams, of physical infrastructure, or something else entirely – by stacking several iterations of development over time until they become a hydra of complexity that employees must figure out how to navigate to get their work done.

Ethnography is a powerful tool for making sense of such processes. Instead of relying on official policies for how to conduct various actions and procedures, ethnography helps you understand and make sense of the unofficial and informal strategies people use to do what they need. Through this, you can get a sense for how the patchwork system really works. This is necessary for developing ways to improve or build upon such patchwork processes.

In the customer service research project, my task was to develop strategies to improve the technology customer service representatives used as they talked with customers. Seeing how representatives used the software through ethnographic research helped me understand and focus the analysis on their day-to-day needs and struggles.

Conclusion

Ethnography is a powerful tool, and the business world and other professional settings have been increasingly realizing this (cf. this and this). I have provided three circumstances where I have personally found ethnography to be invaluable. Ethnography allows you to experience what is happening on the ground and, through that, to shape and inform the research questions you ask and the recommendations or products you build for people in those contexts.

Photo credit #1: DariusSankowski at https://pixabay.com/photos/navigation-car-drive-road-gps-1048294/

Photo credit #2: AbsolutVision at https://unsplash.com/photos/82TpEld0_e4

Photo credit #3: Tony Wan at https://unsplash.com/photos/NSXmh14ccRU

Anthropologist in I.T. (Comic, Funny)

Here’s a fun little comic about some of my experiences working as an anthropologist in I.T. It’s actually a blast.

I wrote this comic for the University of Memphis Anthropology Department, which featured it in its Fall 2018 newsletter.

Thank you, Rusty Haner, for illustrating the panels.

Methodological Complementarianism: Being the Mix in Mixed Methods

Photo by RF._.studio on Pexels.com

I wrote this essay for my midterm for a course I took on conducting program evaluation as an anthropologist taught by Dr. Michael Duke at the University of Memphis Anthropology Master’s program. In it, I synthesize Donna Mertens’s discussion of employing mixed methods research for program evaluation work in her book, Mixed Methods Design in Evaluation, as a way to present the need for what I call methodological complementarianism.

Methodological complementarianism involves complementing those on the team one is working with by advocating for the complementary perspectives that the team needs. When conducting transdisciplinary work as applied anthropologists, instead of explicitly or implicitly seeking to maintain a “pure” anthropological approach, I think we should have a greater willingness to produce something new in that environment, even if it no longer fits the “pure” boundaries of proper anthropology or ethnography but is rather some kind of hybrid emerging out of the needs of the situation. Methodological complementarianism is one practical way of doing that, which I have been exploring.

Recently Published Article: “Anthropology by Data Science”

Photo by Ekrulila on Pexels.com

I am pleased to announce that the Annals of Anthropological Practice has accepted my article “Anthropology by Data Science” (https://anthrosource.onlinelibrary.wiley.com/doi/10.1111/napa.12169). In it, I reflect on the relationship anthropologists have cultivated with data science as a discipline and the importance of integrating machine learning techniques into ethnographic practice.

Annals of Anthropological Practice is overseen by the National Association for the Practice of Anthropology (NAPA) within the American Anthropological Association. Thank you, NAPA, for publishing my article and thank you to all the unnamed editors and reviewers in the process.

Interdisciplinary Anthropology and Data Science Master’s Thesis: A Quick and Dirty Project Summary

This is a quick and dirty summary of my master’s practicum research project with Indicia Consulting over the summer of 2018. For anyone interested in more detail, here is a more detailed report, and here is the final report with Indicia. 

Background

My practicum was the sixth stage of a several year-long research project. The California Energy Commission commissioned this larger project to understand the potential relationship between individual energy consumption and technology usage. In stages one through five, we isolated certain clusters of behavior and attitudes around new technology adoption – which Indicia called cybersensitivity – and demonstrated that cybersensitivity tended to associate with a willingness to adopt energy-saving technology like smart meters.

This led to a key question: How can one identify cybersensitivity among a broader population such as a community, county, or state? Answering this question was the main goal of my practicum project.

In the past stages of the research project, the team used ethnographic research to establish criteria for whether someone was a cybersensitive based on several hours of interviews and observations about their technology usage. These interviews and observations certainly helped the research team analyze behavioral and attitudinal patterns, determine what patterns were significant, and develop those into the concept of cybersensitivity, but they are too time- and resource-intensive to perform with an entire population. One generally does not have the ability to interview everyone in a community, county, or state. I sought to address this directly in my project.

Task | Timeline | Task Name | Research Technique | Description
--- | --- | --- | --- | ---
Task 1 | June 2015 – Sept 2018 | General Project Tasks | Administrative (N/A) | Developed project scope and timeline, adjusting as the project unfolds
Task 2 | July 2015 – July 2016 | Documenting and analyzing emerging attitudes, emotions, experiences, habits, and practices around technology adoption | Survey | Conducted survey research to observe patterns of attitudes and behaviors among cybersensitives/awares.
Task 3 | Sept 2016 – Dec 2016 | Identifying the attributes and characteristics and psychological drivers of cybersensitives | Interviews and Participant-Observation | Conducted in-depth interviews and observations coding for psych factor, energy consumption attitudes and behaviors, and technological device purchasing/usage.
Task 4* | Sept 2016 – July 2017 | Assessing cybersensitives’ valence with technology | Statistical Analysis | Tested for statistically significant differences in demographics, behaviors, and beliefs/attitudes between cyber status groups
Task 5 | Aug 2017 – Dec 2018 | Developing critical insights for supporting residential engagement in energy efficient behaviors | Statistical Analysis | Analyzed utility data patterns of study participants, comparing it with the general population.
Task 6 | March 2018 – Aug 2018 | Recommending an alternative energy efficiency potential model | Decision Tree Modeling | Constructed decision tree models to classify an individual’s cyber status

Project Goal

The overall goal for the project was to produce a scalable method to assess whether someone exhibits cybersensitivity based on data measurable across an entire population. In doing this, the project also helped address the following research needs:

  1. Created a method that could scale further across a larger population, assessing whether cybersensitives were more willing to adopt energy-saving technologies across a community, county, or state
  2. Provided the infrastructure to determine how much promoting energy-saving campaigns targeting cybersensitives specifically would reduce energy consumption in California
  3. Helped the California Energy Commission determine the best means to reach cybersensitives for specific energy-saving campaigns

The Project

I used machine learning modeling to create a decision-making flow to isolate cybersensitives in a population. Random forests and decision trees produced the best models for Indicia’s needs: random forests in accuracy and robustness and decision trees in human decipherability. Through them, I created a programmable yet human-comprehensible framework to determine whether an individual is cybersensitive based on behaviors and other characteristics that an organization could easily assess within a whole population. Thus, any energy organization could easily understand, replicate, and further develop the model since it was both easy for humans to read and encodable computationally. This way organizations could both use and refine it for their purposes.
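As a rough illustration of what “easy for humans to read and encodable computationally” can mean for a decision tree, here is a minimal Python sketch. It is not the model from the practicum: the two features (new_devices_per_year, hours_online_per_day), the thresholds, and the tree shape are all invented for illustration and do not reflect Indicia’s actual criteria for cybersensitivity. It covers only the decision-tree side, not random forests.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    label: str  # final classification at the end of a branch


@dataclass
class Split:
    feature: str      # name of the measured attribute (hypothetical here)
    threshold: float  # go left if value <= threshold, otherwise go right
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def classify(node: Node, person: Dict[str, float]) -> str:
    """Walk the tree from the root until reaching a leaf (the machine-usable form)."""
    while isinstance(node, Split):
        if person[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


def rules(node: Node, path: str = "") -> List[str]:
    """List every root-to-leaf path as a plain-language rule (the human-readable form)."""
    if isinstance(node, Leaf):
        return [f"{path or 'always'} -> {node.label}"]
    prefix = path + " and " if path else ""
    return (
        rules(node.left, f"{prefix}{node.feature} <= {node.threshold}")
        + rules(node.right, f"{prefix}{node.feature} > {node.threshold}")
    )


# Invented toy tree; features and cutoffs are placeholders, not research findings.
TOY_TREE = Split(
    "new_devices_per_year", 2,
    Split("hours_online_per_day", 6,
          Leaf("not cybersensitive"),
          Leaf("cybersensitive")),
    Leaf("cybersensitive"),
)

if __name__ == "__main__":
    for rule in rules(TOY_TREE):
        print(rule)
```

The same structure serves both audiences: classify runs it over a whole population, while rules prints the branches so a person can inspect and question each one.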

Conclusion

This is a quick overview of my master’s practicum project. For more details on what modeling I did, how I did it, what results it produced, and how it fit within the wider needs of the multi-year research project, please see my full report.

I really appreciated the opportunity it posed to get my hands dirty integrating ethnography and data science to help address a real-world problem. This summary only scratches the surface of what Indicia did with the California Energy Commission to encourage sustainable energy usage societally. Hopefully, though, it will inspire you to integrate ethnography and data science to address whatever complex questions you face. It certainly did for me.

Thank you to Susan Mazur-Stommen and Haley Gilbert for your help in organizing and completing the project. I would like to thank my professorial committee at the University of Memphis – Dr. Keri Brondo, Dr. Ted Maclin, Dr. Deepak Venugopal, and Dr. Katherine Hicks – for their academic support as well.

The Anthropology of Machine Learning

In the spring of 2018, I researched how anthropologists and related social scholars have analyzed data science and machine learning for my Master’s in Anthropology at the University of Memphis. For the project, I assessed the anthropological literature on data science and machine learning to date and explored potential connections between anthropology and data science, based on my perspective as a data scientist and anthropologist. Here is my final report.

Thank you, Dr. Ted Maclin, for your help overseeing and assisting this project.

Response-ability Conference Talk

On May 21st, Astrid Countee and I presented at the 2021 Response-ability Conference. We discussed strategies for leveraging data science and anthropology in the tech sector to help address societal issues. The conference’s overall goal was to explore how anthropologists and software specialists in the tech sector can understand and tackle social issues.

Here is an abstract for Astrid’s and my talk:

In the coming months, Response-ability plans to publish our presentation, so if you are interested in watching it, please stay tuned until then. When they make the videos accessible, they should post them here: https://response-ability.tech/2021-summit-videos/.

I appreciated the whole experience. Thank you to everyone who helped make the conference happen, and Astrid for doing this talk with me.

Anthropology by Data Science: The EPIC Project with Indicia Consulting as an Exploratory Case Study

This is my practicum report with Indicia Consulting. In lieu of a master’s thesis, the University of Memphis Department of Anthropology required that we master’s students conduct a practicum project. For this, we had to partner with an organization and complete a 300+ hour anthropological research project based on the organization’s needs and our skills and interests. My practicum project was Indicia’s EPIC Project with the California Energy Commission (see this link and this link for more details on the EPIC Project). In this report, I outline potential ways to integrate ethnographic/anthropological and data science research in professional settings.

In November 2019, the American Anthropological Association’s Committee for the Anthropology of Science, Technology, and Computing (CASTAC) awarded me the David Hakken Graduate Student Prize for innovative science and technology scholarship.

Full Report:


The Anthropology Department also required that we publicly present our practicum research to the University of Memphis campus. This PowerPoint summarizes my practicum project. If you are not keen to read the 99-page full report, this is a much shorter alternative:

If you are interested in learning more about the project, please check out the following:

  1. Indicia Consulting’s Final Research Report with the California Energy Commission
  2. My Presentation at the 2019 Memphis Data Conference for Data Scientists Specifically