Applied Research Archives - Page 5 of 6

Using Data Science and Ethnography to Build a Show Rate Predictor

I recently integrated ethnography and data science to develop a Show Rate Predictor for an (anonymous) hospital system. Many readers have asked for real-world examples of this integration, and this project demonstrates how ethnography and data science can join to build machine learning-based software that makes sense to users and meets their needs.

Part 1: Scoping out the Project

A particular clinic in the hospital system was experiencing a large number of appointment no-shows, which produced wasted time, frustration, and confusion for both its patients and employees. I was asked to use data science and machine learning to better understand and improve their scheduling.

I started the project by conducting ethnographic research into the clinic to learn more about how scheduling occurs normally, what effect it was having on the clinic, and what driving problems employees saw. In particular, I observed and interviewed scheduling assistants to understand their day-to-day work and their perspectives on no-shows.

One major lesson I learned through all this was that when scheduling an appointment, schedulers are constantly trying to determine how many people to schedule on a given doctor’s shift to ensure the right number of people show up. For example, say 12-14 patients is a good number of patients for Dr. Rodriguez’s (made up name) Wednesday morning shift. When deciding whether to schedule an appointment for the given patient with Dr. Rodriguez on an upcoming Wednesday, the scheduling assistants try to determine, given the appointments currently scheduled then, whether they can expect 12-14 patients to show up. This was often an inexact science. They would often have to schedule 20-25 patients on a particular doctor’s shift to ensure their ideal window of 12-14 patients would actually come that day. This could create the potential for chaos, however, where too many patients arriving on some days and too few on others.

This question – how many appointments can we expect or predict to occur on a given doctor’s shift – became my driving question to answer with machine learning. After checking in with the various stakeholders at the clinic to make sure this was in fact an important and useful question to answer with machine learning, I started building.

Part 2: Building the Model

Now that I had a driving, answerable question, I decided to break it down into two sequential machine learning models:

The first model learned to predict the probability that a given appointment would occur, learning from the history of occurring or no-show appointments.
The second model, using the appointment probabilities from the first model, estimated how many appointments might occur for every doctors’ shift.

The first model combined three streams of data to assess the no-show probability: appointment data (such as how long ago it was scheduled, type of appointment, etc.); patient information, especially past appointment history; and doctor information. I performed extensive feature selection to determine the best subset of variables to use and tested several types of machine learning models before settling on gradient boosting.

The second model used the probabilities in the first model as input data to predict how many patients to expect to come on each doctors’ shift. I settled on a neural network for the model.

Part 3: Building an App

Next, I worked with the software engineers on my team to develop an app to employ these models in real time and communicate the information to schedulers as they scheduled appointments. My ethnographic research was invaluable for developing how to construct the app.

On the back end, the app calculated the probability that all future appointments would occur, updating with new calculations for newly scheduled or edited appointments. Once a week, it would incorporate that week’s new appointment data and shift attendance to each model’s training data and update those models accordingly.

Through my ethnographic research, I observed how schedulers approached scheduling appointments, including what software they used in the process and how they used each. I used that to determine the best ways to communicate that information, periodically showing my ideas to the schedulers to make sure my strategy would be helpful.

I constructed an interface to communicate the information that would complement the current software they used. In addition to displaying the number of patients expected to arrive, if the machine learning algorithm was predicting that a particular shift was underbooked, it would mark the shift in green on the calendar interface; yellow if the shift was projected to have the ideal number of patients, and red if already expected have too many patients. The color-coding allowed easy visualization of the information in the moment: when trying to find an appointment time for a patient, they could easily look for the green shifts or yellow if they had to, but steer clear of the red. When zooming in on a specific shift, each appointment would be color-coded (likely, unlikely, and in the middle) as well based on the probability that it would occur.

Conclusion

This is one example of a projects that integrates data science and ethnography to build a machine learning app. I used ethnography to construct the app’s parameters and framework. It tethered the app in the needs of the schedulers, ensuring that the machine learning modeling I developed was useful to those who would use it. Frequent check-ins before each step in their development also helped confirm that my proposed concept would in fact help meet their needs.

My data science and machine learning expertise helped guide me in the ethnographic process as well. Being an expert in how machine learning worked and what sorts of questions it could answer allowed me to easily synthesize the insights from my ethnographic inquiries into buildable machine learning models. I understood what machine learning was capable (and not capable) of doing, and I could intuitively develop strategic ways to employ machine learning to address issues they were having.

Hence, my dual role as an ethnography and data scientist benefitted the project greatly. My listening skills from ethnography enabled me to uncover the underlying questions/issues schedulers faced, and my data science expertise gave me the technical skills to develop a viable machine learning solution. Without listening patiently through extensive ethnography, I would not have understood the problem sufficiently, but without my data science expertise, I would have been unable to decipher which questions(s) or issue(s) machine learning could realistically address and how.

This exemplifies why a joint expertise in data science and ethnography is invaluable in developing machine learning software. Two different individuals or teams could complete each separately – an ethnographer(s) analyze the users’ needs and a data scientist(s) then determine whether machine learning modeling could help. But this seems unnecessarily disjointed, potentially producing misunderstanding, confusion, and chaos. By adding an additional layer of people, it can easily lead to either the ethnographer(s) uncovering needs way too broad or complex for a machine learning-based solution to help or the data scientist(s) trying to impose their machine learning “solution” to a problem the users do not have.

Developing expertise in both makes it much easier to simultaneously understand the problems or questions in a particular context and build a doable data science solution.

Photo credit #1: DarkoStojanovic at https://pixabay.com/photos/medical-appointment-doctor-563427/

Photo credit #2: geralt at https://pixabay.com/illustrations/time-doctor-doctor-s-appointment-481445/

Photo credit #3: Pixabay at https://www.pexels.com/photo/light-road-red-yellow-46287/

How to Analyze Texts with Data Science

flat lay photography of an open book beside coffee mug

A friend and fellow professor, Dr. Eve Pinkser, asked me to give a guest lecture on quantitative text analysis techniques within data science for her Public Health Policy Research Methods class with the University of Illinois at Chicago on April 13^th, 2020. Multiple people have asked me similar questions about how to use data science to analyze texts quantitatively, so I figured I would post my presentation for anyone interested in learning more.

It provides a basic introduction of the different approaches so that you can determine which to explore in more detail. I have found that many people who are new to data science feel paralyzed when trying to navigate through the vast array of data science techniques out there and unsure where to start.

Many of her students needed to conduct quantitative textual analysis as part of their doctoral work but struggled in determining what type of quantitative research to employ. She asked me to come in and explain the various data science and machine learning-based textual analysis techniques, since this was out of her area of expertise. The goal of the presentation was to help the PhD students in the class think through the types of data science quantitative text analysis techniques that would be helpful for their doctoral research projects.

Hopefully, it would likewise allow you to determine the type or types of text analysis you might need so that you can then look those up in more detail. Textual analysis, as well as the wider field of natural language processing within which it is a part of, is a quickly up-and-coming subfield within data science doing important and groundbreaking work.

Download [707.13 KB]

Photo credit: fotografierende at https://www.pexels.com/photo/flat-lay-photography-of-an-open-book-beside-coffee-mug-3278768/

Three Situations When Ethnography Is Useful in a Professional Setting

This is a follow-up to my previous article, “What Is Ethnography,” outlining ways ethnography is useful in professional settings.

To recap, I defined ethnography as a research approach that seeks “to understand the lived experiences of a particular culture, setting, group, or other context by some combination of being with those in that context (also called participant-observation), interviewing or talking with them, and analyzing what is produced in that context.”

Ethnography is a powerful tool, developed by anthropologists and other social scientists over the course of several decades. Here are three types of situations in professional settings when I have found to use ethnography to be especially powerful:

1. To see the given product and/or people in action

2. When brainstorming about a design

3. To understand how people navigate complex, patchwork processes

Situation #1: To See the Given Product and/or People in Action

Ethnography allows you to witness people in action: using your product or service, engaging in the type of activity you are interested, or in whatever other situation you are interested in studying.

Many other social science research methods involve creating an artificial environment in which to observe how participants act or think in. Focus groups, for example, involve assembling potential customers or users into a room: forming a synthetic space to discuss the product or service in question, and in many experimental settings, researchers create a simulated environment to control for and analyze the variables or factors they are interested in.

Ethnography, on the other hand, centers around observing and understanding how people navigate real-world settings. Through it, you can get a sense for how people conduct the activity for which you are designing a product or service and/or how people actually use your product or service.

For example, if you want to understand how people use GPS apps to get around, one can see how people use the app “in the wild:” when rushing through heavy traffic to get to a meeting or while lost in the middle of who knows where. Instead of hearing their processed thoughts in a focus group setting or trying to simulate the environment, you can witness what the tumultuousness yourself and develop a sense for how to build a product that helps people in those exact situations.

Situation #2: When Brainstorming about a New Product Design

Ethnography is especially useful during the early stages of designing a product or service, or during a major redesign. Ethnography helps you scope out the needs of your potential customers and how they approach meeting said needs. Thus, it helps you determine how to build a product or service that addresses those needs in a way that would make sense for your users.

During such initial stages of product design, ethnography helps determine the questions you should be asking. Many have a tendency during these initial stages to construct designs based on their own perception of people’s needs and desires and miss what the customers’ or users’ do in fact need and desire. Through ethnography, you ground your strategy in the customers’ mindsets and experiences themselves.

The brainstorming stages of product development also require a lot of flexibility and adaptability: As one determines what the product or service should become, one must be open to multiple potential avenues. Ethnography is a powerful tool for navigating such ambiguity. It centers you on the users, their experiences and mindsets, and the context which they might use the product or service, providing tools to ask open-ended questions and to generate new and helpful ideas for what to build.

Situation #3: To Understand How People Navigate Complex, Patchwork Processes

At a past company, I analyzed how customer service representatives regularly used the various software systems when talking with customers. Over the years, the company had designed and bought various software programs, each to perform a set of functions and with unique abilities, limitations, and quirks. Overtime, this created a complex web of interlocking apps, databases, and interfaces, which customer service representatives had to navigate when performing their job of monitoring customer’s accounts. Other employees described the whole scene as the “Wild West:” each customer service representative had to create their own way to use these software systems while on the phone with a (in many cases disgruntled) customer.

Many companies end up building such patchwork systems – whether of software, of departments or teams, of physical infrastructure, or something else entirely – built by stacking several iterations of development overtime until, they become a hydra of complexity that employees must figure out how to navigate to get their work done.

Ethnography is a powerful tool for making sense of such processes. Instead of relying on official policies for how to conduct various actions and procedures, ethnography helps you understand and make sense of the unofficial and informal strategies people use to do what they need. Through this, you can get a sense for how the patchwork system really works. This is necessary for developing ways to improve or build open such patchwork processes.

In the customer service research project, my task was to develop strategies to improve the technology customer service representatives used as they talked with customers. Seeing how representatives used the software through ethnographic research helped me understand and focus the analysis on their day-to-day needs and struggles.

Conclusion

Ethnography is a powerful tool, and the business world and other professional settings have been increasingly realizing this (c.f. this and this ). I have provided three circumstances where I have personally found ethnography to be invaluable. Ethnography allows you to experience what is happening on the ground and through that to shape and inform the research questions we ask and recommendations or products we build for people in those contexts.

Photo credit #1: DariusSankowski at https://pixabay.com/photos/navigation-car-drive-road-gps-1048294/

Photo credit #2: AbsolutVision at https://unsplash.com/photos/82TpEld0_e4

Photo credit #3: Tony Wan at https://unsplash.com/photos/NSXmh14ccRU

Anthropologist in I.T. (Comic, Funny)

Here’s a fun little comic about some of my experiences working as an anthropologist in I.T. It’s actually a blast.

I wrote this comic for the University of Memphis Anthropology Department, where they featured it on their Fall 2018 newsletter.

Thank you, Rusty Haner, for illustrating the panels.

Data Science and the Myth of the “Math Person”

“Data science is doable,” a fellow attendee of the EPIC’s 2018 conference in Honolulu would exclaim like a mantra. The conference was for business ethnographers and UX researchers interested in understanding and integrating data science and machine learning into their research. She was specifically trying to address a tendency she has noticed– which I have seen as well: qualitative researchers and other so-called “non-math people” frequently believe that data science is far too technical for them. This seems ultimately rooted in cultural myths about math and math-related fields like computer science, engineering, and now data science, and in a similar vein as her statement, my goal in this essay is to discuss these attitudes and show that data science, like math, is relatable and doable if you treat it as such.

The “Math Person”

In the United States, many possess an implied image of a “math person:” a person supposedly naturally gifted at mathematics. And many who do not see themselves as fitting that image simply decry that math simply isn’t for them. The idea that some people are inherently able and unable to do math is false, however, and prevents people from trying to become good at the discipline, even if they might enjoy and/or excel at it.

Most skills in life, including mathematical skills, are like muscles: you do not innately possess or lack that skill, but rather your skill develops as you practice and refine that activity. Anybody can develop a skill if they practice it enough.

Scholars in anthropology, sociology, psychology, and education have documented how math is implicitly and explicitly portrayed as something some people can do and some cannot do, especially in math classes in grade school. Starting in early childhood, we implicitly and sometimes explicitly learn the idea that some people are naturally gifted at math but for others, math is simply not their thing. Some internalize that they are gifted at math and thus take the time to practice enough to develop and refine their mathematical skills; while others internalize that they cannot do math and thus their mathematical abilities become stagnant. But this is simply not true.

Anyone can learn and do math if he or she practices math and cultivates mathematical thinking. If you do not cultivate your math muscle, then well it will become underdeveloped and, then, yes, math becomes harder to do. Thus, as a cruel irony someone internalizing that he or she cannot do math can turn into a self-fulfilling prophecy: he or she gives up on developing mathematical skills, which leads to its further underdevelopment.

Similarly, we cultivate another false myth that people skilled in mathematics (or math-related fields like computer science, engineering, and data science) in general do not possess strong social and interpersonal communication skills. The root for this stereotype lies in how we think of mathematical and logical thinking than actual characteristics of mathematicians, computer scientists, or engineers. Social scientists who have studied the social skills of mathematicians, computer scientists, and engineers have found no discernable difference in social and interpersonal communication skills with the rest of the world.

Quantitative and Qualitative Specialties

Anyone can learn and do math if he or she practices math and cultivates mathematical thinking.

The belief that some people are just inherently good at math and that such people do not possess strong social and interpersonal communication skills contributes to the division between quantitative and qualitative social research, in both academic and professional contexts. These attitudes help cultivate the false idea that quantitative research and qualitative research are distinct skill sets for different types of people: that supposedly quantitative research can only be done “math people” and qualitative research by “people people.” They suddenly become separate specialties, even though social research by its very nature involves both. Such a split unnecessarily stifles authentic and holistic understanding of people and society.

In professional and business research contexts, both qualitative and quantitative researchers should work with each other and eventually through that process, slowly learn each other’s skills. If done well, this would incentivize researchers to cultivate both mathematical/quantitative, and interpersonal/qualitative research skills.

It would reward professional researchers who develop both skillsets and leverage them in their research, instead of encouraging researchers to specialize in one or the other. It could also encourage universities to require in-depth training of both to train their students to become future workers, instead of requiring that students choose among disciplines that promote one track over the other.

Working together is only the first step, however, whose success hinges on whether it ultimately leads to the integration of these supposedly separate skillsets. Frequently, when qualitative and quantitative research teams work together, they work mostly independently – qualitative researchers on the qualitative aspect of the project and quantitative researchers on the quantitative aspects of the project – thus reinforcing the supposed distinction between them. Instead, such collaboration should involve qualitative researchers developing quantitative research skills by practicing such methods and quantitative researchers similarly developing qualitative skills.

Conclusion

Anyone can develop mathematics and data science skills if they practice at it. The same goes with the interpersonal skills necessary for ethnographic and other qualitative research. Depicting them as separate specialties – even if they come together to do each of their specialized parts in a single research projects – functions stifles their integration as a singular set of tools for an individual and reinforces the false myths we have been teaching ourselves that data science is for math, programming, or engineering people and that ethnography is for “people people.” This separation stifles holistic and authentic social research, which inevitably involves qualitative and quantitative approaches.

Photo credit #1: Andrea Piacquadio at https://www.pexels.com/photo/woman-holding-books-3768126/

Photo credit #2: Antoine Dautry at https://unsplash.com/photos/_zsL306fDck

Photo credit #3: Mike Lawrence at https://www.flickr.com/photos/157270154@N05/28172146158/ and http://www.creditdebitpro.com/

Photo credit #4: Ryan Jacobson at https://unsplash.com/photos/rOYhgmDIOg8

Four Lessons in Time Management: What Graduate School Taught Me about Time Management

three round analog clocks and round gray mats

I am a Type-A personality who likes to do a variety of different activities yet cannot help but give each of them my all. Through this, I have learned a ton about time management. In particular, from 2017 to 2019, I was in graduate school at the University of Memphis while working as both a data scientist and a user researcher. I was easily working 70-90 hours a week.

Necessity is often the best teacher, and during this trial by fire, I figured out how to manage my time efficiently and effectively. Here are four personal lessons I learned for how to manage time well:

Lesson #1	Rest Effectively
Lesson #2	Work in Short-Term Sprints
Lesson #3	Complete Tasks during the Optimal Time of Day
Lesson #4	Rotating between Types of Tasks to Replenish Myself

Lesson #1: Rest Effectively

Developing an effective personal rhythm in which I had time to both work and relax throughout the day was necessary to ensure that I could work productively.

When many people think about time management (or at least when I do), they often focus on strategies/techniques to be productive during work time. Managing one’s time while working is definitely important, but I have found that resting and recuperating effectively is by far the most important single practice to cultivate to work productively.

Developing an effective personal rhythm in which I had time to both work and relax throughout the day was necessary to ensure that I could work effectively.

woman doing yoga meditation on brown parquet flooring

Several different activities help me relax: taking walks, exercising, hanging out with friends and colleagues, reading, watching videos, etc. People have a variety of ways to relax, so maybe some of those are great for you, and maybe you do something else entirely.

Generally, to relax I chose an activity that contrasted and complemented the work I had just been doing. For example, if my work was interviewing people – which I did frequently as a user researcher – then I would unwind with quiet, solitary tasks like walking or reading, but if my work was solitary like programming or writing a paper, I might unwind by socializing with others. Relaxing with a different type of activity as my work would allow me to rest and rejuvenate from the specific strains of that work activity.

I have seen a tendency in some of U.S. work/business culture to constantly push to do more. The goal is usually productivity – that is to get more done – and it makes sense to think that doing more will, well, lead to getting more things done.

That is true to a point, though, or at least to me. There comes a point when trying to do more actually prevents me from getting more done. Instead, taking enough time to rest and recuperate unwinds my mind so that when I am working, I am ready to go. This leads to greater productivity across all counts:

Quantitatively: I can complete a greater number of tasks
Qualitatively: The tasks I complete are of better quality
Efficiency: It takes me a lot less time to complete the same task

I think the idea that doing more work leads to greater productivity is a major false myth in the modern U.S. workforce. Instead, it leads to overwork, stress, and inefficiency, stifling genuine productivity.

Self-care through incorporating rest into my work rhythm has not only been necessary for my mental health but also to be a productive worker. In discussions around self-care, I have often a juxtaposition between being more productive and taking care of oneself, but those two concerns reinforce each other not contradict each other. Overworking without taking enough time to recuperate prevents me from being an effective and productive human worker. Instead, the question is how to cultivate life-giving and rejuvenating practices and disciplines so that I can become productive and maintain so.

Lesson #2: Work in Short-Term Sprints

I developed a practice of completing tasks in twenty-five-minute chunks. I would set the timer for twenty-five-minutes and work intensely without stopping on the given task/project until the time was up. (My technique has some similarities with the Pomodoro Technique, but without as many rules or requirements.) I realized that twenty-five-minutes was how long I could mentally work continuously on a single task without thinking about something else or needing a break. After that time, I would start to get tired and inefficient, so giving myself a break would let me unwind and rejuvenate.

After one of these twenty-five-minute sprints, I would take a break of at least five minutes: walk around, watch an interesting video, go talk with a colleague or friend, whatever I needed to do to unwind. These breaks were the time my brain would need to process what I was doing and reenergize for the next task. Given that my day would be made up of several of these twenty-five-minute sprints, for the first one or two, I might take a five minute break, but a few more, I might take a longer break as I had done more to unwind from.

A crucial skill for this practice has been successfully breaking down the given project to complete in the timed chunks. For some projects, I would designate a short-term task or goal to complete in the twenty-five-minutes. With my course readings, for example, I generally had to submit a summary and analysis of the readings. Thus, my goal during each twenty-five-minute sprint would be to finish one article or chapter – both reading it and writing the summary and analysis. I would start by reading the most significant subsections, generally the introduction and conclusion, summarizing and analyzing it as I read. That generally took up half of my twenty-five-minutes, so in whatever remaining time I had left, I would read the remaining sections.

This provided enough time to get a sense for the reading’s argument and complete the assignment, even in the off-chance that I did not have time to finish reading the entire article. In only twenty-five-minutes, I would knock out a whole reading, including my summary and analysis: one less task to worry about. Spending twenty-five-minutes a day is not that much of a burden either. Doing this, I would complete all the readings for my courses within the first few weeks of the semester, opening time over the next several months when my other work would pick up.

I could not split all activities into short-term tasks to complete in twenty-five-minutes, though. For those I could not, the trick was to estimate how much time an overall task would take. For example, if my supervisor gave me a month to complete a project, I would then calculate how many twenty-five-minute slots I would need per day given how many total hours I would likely need to spend on the project.

Data science projects are notoriously nonlinear, meaning that I could just about never break them down into sets of twenty-five-minute tasks, but rather almost always had to just figure out how much total time to budget like this. The various parts of a data science project – like data cleaning, building the model(s), and then improving/refining said model – could take widely different amount of times to complete and often fed into each other anyways. The first data science projects were always the hardest to determine how long they would take, but after doing many of them, I developed an intuitive sense of how much time to budget.

toddler's standing in front of beige concrete stair

The fear of a blank page and resulting procrastination were major issues I had to overcome when working on a project. At the beginning of the project, before I had broken down the task and determined the best strategy for how to complete it, focusing could be difficult. If I was not careful, the stress of the blank page or complete openness of the new project could cause me to become distracted and want to do something else instead. In more extreme cases, this could lead to procrastinating in getting started at all.

To get my ideas on paper, during the first twenty-five-minute sprint of a new task, I would look through all my materials and brainstorm how I would complete the task. Through this, I would develop an initial to do list of items that I could do in the ensuing sprints. Even though my to do list almost always changed overtime, this allowed me to get started. The most important caveat was to make sure I did that planning session when I was able to handle such an open-ended task (something I discuss in more detail in Lesson #3).

I also addressed my tendency to procrastinate by creating my own stricter deadlines for when a project was due. Extreme procrastination (like putting off starting or completing something until the last minute when you must rush to complete a task in the last several hours before its deadline) would destroy my productivity. Having to work in a mad rush would prevent me from having the balance between work and rest I discussed in Lesson #1 necessary to work productively. And when I have a lot of tasks, rushing last minute for one project would prevent me from working ahead on future projects, which would have then caused me to fall behind on them and create a vicious cycle of procrastination.

Thus, I would set my own deadline a week or two prior to a project’s actual deadline. For example, if I had four weeks to write an assignment, I would set my own deadline of three weeks for a presentable draft, and no matter what, I would meet this deadline. I would treat this like my actual deadline and never missed it. This presentable draft may not be perfect or amazing yet but something that in a pinch I would feel comfortable turning in: a solid B or B- quality version, not the A or A+ awesomeness my perfectionist self prefers. I might need to proofread once or twice to smooth out some kinks, but it has all the basic components of the task or assignment done. That way, if I became too busy with other projects to do that proofreading, it was good enough quality that I could still turn it in without editing in a pinch.

In the remaining week, I would then work out those minor issues, combing it a few more times to make it top quality, but if another, higher priority project or issue arose during that final week needing more of my attention than I anticipated, I could still have something to turn in. By making sure I stayed ahead with an adequate draft, I never had to worry about falling behind and rushing to finish as assignment last minute, and being a week or so ahead provided a cushion or shock absorber to handling any unforeseeable issues without falling behind. Through this, I never missed a single deadline despite working multiple jobs and being a full-time student.

Lesson #3: Complete Tasks during the Optimal Time of Day

I have found that certain types of activities are easier for me during certain times of the day. For example, being a morning person, I do my best work first thing in the morning. Thus, I would perform my most open-ended, creative, and strategic types of tasks – like brainstorming and breaking down a new project, solving an open-ended problem, and writing an essay or report – then. In the early afternoon, I would try to schedule any meetings and interviews (if that worked in the other people’s schedules as well of course), and in the late afternoon and evening, I would complete more menial, plug-and-chug aspects of a project that need less intense mental thought and more rote implementation of what I came up with that morning, like writing the code of an algorithm I had mapped out in the morning or proofreading a paper I already wrote. This would ensure that I would be fresh and efficient when doing the complex, open-ended tasks and not wasting my time and energy trying to force myself to complete such tasks during the times of the day when I am naturally tired, slower, and less efficient.

Lesson #4: Leveraging Different Types of Tasks to Replenish Myself

As both a data scientist and anthropologist, I have had to do a wide variety of tasks, using many different skills, ranging from talking and interviewing people to math proofs and programming to scholarly and non-fiction writing. This variety has been something I could use to replenish myself. Each of these activities is in of itself stimulating to me, but doing one of them exclusively for long periods of time would become draining after a while.

In agriculture, certain crops use up certain nutrients in the soil (like corn depletes nitrogen particularly strongly), so farmers will often rotate between crops to replenish the nutrients in the soil from the previous crop. Likewise, I found rotating between several different types of activities helpful for rejuvenating and replenishing my mind from the last activity.

If I had to do a series of very logical tasks like math or programming, I might replenish with a social task as my next activity like interviewing or meeting with people, or if I interviewed people for several hours, I would next break from that by doing something solitary like programming or writing. I would use these rotations strategically to rest from one activity while still practicing and developing other skill sets.

Conclusion

These are the lessons I learned for how to sustain myself while working 80-100-hour weeks. The first lesson was crucial: developing an effective rhythm between work and rest that enabled me to work productively, efficiently, and sustainably. The other three were my specific strategies for how I created that rhythm. I developed and refined them during intense, busy periods of my life in order to still produce high quality work while maintaining my sanity. Hopefully, they are helpful food for thought for anyone else trying to develop his or her own time-management strategies.

Photo credit #1: Karim MANJRA at https://unsplash.com/photos/dtSCKE9-8cI

Photo credit #2: Jared Rice at https://unsplash.com/photos/NTyBbu66_SI

Photo credit #3: Carl Heyerdahl at https://unsplash.com/photos/KE0nC8-58MQ

Photo credit #4: Allie Smith at https://unsplash.com/photos/eXGSBBczTAY

Photo credit #5: NeONBRAND at https://unsplash.com/photos/KYxXMTpTzek

Photo credit #6: Alex Siale at https://unsplash.com/photos/qH36EgNjPJY

Photo credit #7: Jukan Tateisi at https://unsplash.com/photos/bJhT_8nbUA0

Photo credit #8: Ksenia Makagonova at https://unsplash.com/photos/Vq-EUXyIVY4

Photo credit #9: Dawid Zawila at https://unsplash.com/photos/-G3rw6Y02D0

Photo credit #10: Dennis Jarvis at https://www.flickr.com/photos/archer10/3555040506/

Methodological Complementarianism: Being the Mix in Mixed Methods

photo of women at the meeting — Photo by RF._.studio on Pexels.com

I wrote this essay for my midterm for a course I took on conducting program evaluation as an anthropologist taught by Dr. Michael Duke at the University of Memphis Anthropology Master’s program. In it, I synthesize Donna Mertens’s discussion of employing mixed methods research for program evaluation work in her book, Mixed Methods Design in Evaluation, as a way to present the need for what I call methodological complementarianism.

Methodological complementarianism involves complementing those on the team one is working with by advancing for the complementary perspectives that the team needs. When conducting transdisciplinary work as applied anthropologists, instead of explicitly or implicitly seeking to maintain a “pure” anthropological approach, I think we should have a greater willingness to produce something anew in that environment, even if it no longer fits the “pure” boundaries of proper anthropology or ethnography but rather some kind of hybrid emerging out of the needs of the situation. Methodological complementarianism is one practical way to do that I have been exploring.

Download [29.13 KB]

What Is Data Science and Machine Learning? A Short Guide for the Unsure

What is data science, and what is machine learning? This is a short overview for someone who has never heard of either.

What Is Data Science?

In the abstract, data science is an interdisciplinary field that seeks to use algorithms to organize, process, and analyze data. It represents a shift towards using computer programing, specifically machine learning algorithms, and other, related computational tools to process and analyze data.

By 2008, companies starting using the term data scientists to refer to a growing group of professionals utilizing advanced computing to organize and analyze large datasets,[i] and thus from the get-go, the practical needs of professional contexts have shaped the field. Data science combines strands from computer science, mathematics (particularly statistics and linear algebra), engineering, the social sciences, and several other fields to address specific real-world data problems.

On a practical level, I consider a data scientist someone who helps develop machine learning algorithms to analyze data. Machine learning algorithms form the central techniques/tools around what constitutes data science. For me personally, if it does not involve machine learning, it is not data science.

What Is Machine Learning?

Machine learning is a complex term: What to say that a machine “learns”? Overtime data scientists have provided many intricate definitions of machine learning, but its most basic, machine learning algorithms are algorithms that adapt/modify how their approach to a task based on new data/information overtime.

Herbert Simon provides a commonly used technical definition: “Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the task or tasks drawn from the same population more efficiently and more effectively the next time.”[ii] As this definition implies, machine learning algorithms adapt by iteratively testing its performance against the same or similar data. Data scientists (and others) have developed several types of machine learning algorithms, including decision tree modeling, neural networks, logistic regression, collaborative filtering, support vector machines, cluster analysis, and reinforcement learning among others.

Data scientists generally split machine learning algorithms into two categories: supervised and unsupervised learning. Both involve training the algorithm to complete a given task but differ on how they test the algorithm’s performance. In supervised learning, the developer(s) provide a clear set of answers as a basis for whether the prediction is correct; while for unsupervised learning, whether the algorithm’s performance is much more open-ended. I liken the difference to be like the exams teachers gave us in school: some tests, like multiple choice exams, have clear, right and wrong answers or solutions, but other exams, like essays, are open-ended with qualitative means of determining goodness. Just like the nature of the curriculum determines the best type of exam, which type of learning to performs depends on the project context and nature of the data.

Here are four instances where machine learning algorithms are useful in these types of tasks:

Autonomy: To teach computers to do a task without the direct aid/intervention of humans (e.g. autonomous vehicles)
Fluctuation: Help machines adjust when the requirements or data change over time
Intuitive Processing: Conduct (or assist in) tasks humans do naturally but are unable to explain how computationally/algorithmically (e.g. image recognition)
Big Data: Breaking down data that is too large to handle otherwise

Machine learning algorithms have proven to be a very powerful set of tools. See this article for a more detailed discussion of when machine learning is useful.

[i] Berkeley School of Information. (2019). What is Data Science? Retrieved from https://datascience.berkeley.edu/about/what-is-data-science/.

[ii] Simon in Kononenko, I., & Kukar, M. (2007). Machine Learning and Data Mining. Elsevier: Philadelphia.

Photo credit #1: Frank V at https://unsplash.com/photos/zbLW0FG8XU8

Photo credit #2: Brett Jordan at https://unsplash.com/photos/HzOclMmYryc

Recently Published Article: “Anthropology by Data Science”

tea set and newspaper placed on round table near comfortable chair — Photo by Ekrulila on Pexels.com

I am pleased to announce that the Annals of Anthropological Practice has accepted my article “Anthropology by Data Science.” https://anthrosource.onlinelibrary.wiley.com/doi/10.1111/napa.12169. In it, I reflect on the relationship anthropologist have cultivated with data science as a discipline and the importance of integrating machine learning techniques into ethnographic practice.

Annals of Anthropological Practice is overseen by the National Association for the Practice of Anthropology (NAPA) within the American Anthropological Association. Thank you, NAPA, for publishing my article and thank you to all the unnamed editors and reviewers in the process.

Interdisciplinary Anthropology and Data Science Master’s Thesis: A Quick and Dirty Project Summary

This is a quick and dirty summary of my master’s practicum research project with Indicia Consulting over the summer of 2018. For anyone interested in more detail, here is a more detailed report, and here is the final report with Indicia.

Background

My practicum was the sixth stage of a several year-long research project. The California Energy Commission commissioned this larger project to understand the potential relationship between individual energy consumption and technology usage. In stages one through five, we isolated certain clusters of behavior and attitudes around new technology adoption – which Indicia called cybersensitivity – and demonstrated that cybersensitivity tended to associate with a willingness to adopt energy-saving technology like smart meters.

This led to a key question: How can one identify cybersensivity among a broader population such as a community, county, or state? Answering this question was the main goal of my practicum project.

In the past stages of the research project, the team used ethnographic research to establish criteria for whether someone was a cybersensitive based on several hours of interviews and observations about their technology usage. These interviews and observations certainly helped the research team analyze behavioral and attitudinal patterns, determine what patterns were significant, and develop those into the concept of cybersensitivity, but they are too time- and resource-intensive to perform with an entire population. One generally does not have the ability to interview everyone in a community, county, or state. I sought to address this directly in my project.

Task	Timeline	Task Name	Research Technique	Description
Task 1	June 2015-Sept 2018	General Project Tasks	Administrative (N/A)	Developed project scope and timeline, adjusting as the project unfolds
Task 2	July 2015 – July 2016	Documenting and analyzing emerging attitudes, emotions, experiences, habits, and practices around technology adoption	Survey	Conducted survey research to observe patterns of attitudes and behaviors among cybersensitives/awares.
Task 3	Sept 2016 – Dec 2016	Identifying the attributes and characteristics and psychological drivers of cybersensitives	Interviews and Participant-Observation	Conducted in-depth interviews and observations coding for psych factor, energy consumption attitudes and behaviors, and technological device purchasing/usage.
Task 4*	Sept 2016 – July 2017	Assessing cybersensitives’ valence with technology	Statistical Analysis	Tested for statistically significant differences in demographics, behaviors, and beliefs/attitudes between cyber status groups
Task 5	Aug 2017 – Dec 2018	Developing critical insights for supporting residential engagement in energy efficient behaviors	Statistical Analysis	Analyzed utility data patterns of study participants, comparing it with the general population.
Task 6	March 2018 – Aug 2018	Recommending an alternative energy efficiency potential model	Decision Tree Modeling	Constructed decision tree models to classify an individual’s cyber status

Project Goal

The overall goal for the project was to produce a scalable method to assess whether someone exhibits cybersensitivity based on data measurable across an entire population. In doing this, the project also helped address the following research needs:

Created a method to further to scale across a larger population, assessing whether cybersensitives were more willing to adopt energy saving technologies across a community, county, or state
Provided the infrastructure to determine how much promoting energy-saving campaigns targeting cybersensitives specifically would reduce energy consumption in California
Helped the California Energy Commission determine the best means to reach cybersensitives for specific energy-saving campaigns

The Project

I used machine learning modeling to create a decision-making flow to isolate cybersensitives in a population. Random forests and decision trees produced the best models for Indicia’s needs: random forests in accuracy and robustness and decision trees in human decipherability. Through them, I created a programmable yet human-comprehensible framework to determine whether an individual is cybersensitive based on behaviors and other characteristics that an organization could be easily assess within a whole population. Thus, any energy organization could easily understand, replicate, and further develop the model since it was both easy for humans to read and encodable computationally. This way organizations could both use and refine it for their purposes.

Conclusion

This is a quick overview of my master’s practicum project. For more details on what modeling I did, how I did it, what results it produced, and how it fit within the wider needs of the multi-year research project, please see my full report.

I really appreciated the opportunity it posed to get my hands dirty integrating ethnography and data science to help address a real-world problem. This summary only scratches the surface of what Indicia did with the Californian Energy Commission to encourage sustainable energy usage societally. Hopefully, though, it will inspire you to integrate ethnography and data science to address whatever complex questions you face. It certainly did for me.

Thank you to Susan Mazur-Stommen and Haley Gilbert for your help in organizing and completing the project. I would like to thank my professorial committee at the University of Memphis – Dr. Keri Brondo, Dr. Ted Maclin, Dr. Deepak Venugopal, and Dr. Katherine Hicks – for their academic support as well.

Part 1: Scoping out the Project

Part 2: Building the Model

Part 3: Building an App

Conclusion

Share this:

Share this:

Situation #1: To See the Given Product and/or People in Action

Situation #2: When Brainstorming about a New Product Design

Situation #3: To Understand How People Navigate Complex, Patchwork Processes

Conclusion

Share this:

Share this:

The “Math Person”

Quantitative and Qualitative Specialties

Conclusion

Share this:

Lesson #1: Rest Effectively

Lesson #2: Work in Short-Term Sprints

Lesson #3: Complete Tasks during the Optimal Time of Day

Lesson #4: Leveraging Different Types of Tasks to Replenish Myself

Conclusion

Share this:

Share this:

What Is Data Science?

What Is Machine Learning?

Share this:

Share this:

Background

Project Goal

The Project

Conclusion

Share this: