Integrating Ethnography and Data Science

As a data scientist and ethnographer, I have worked on many types of research projects. In professional and business settings, I am excited by the enormous growth in both data science and ethnography, but I have been frustrated that, despite recent developments that make the two fields more similar, their respective teams seem to be growing apart and competing against each other.

Within academia, quantitative and qualitative research methods have historically developed as distinct and competing approaches, as if one has to choose a direction when doing research: departments and individual researchers specialize in one or the other and fight over scarce research funding. One major justification for this division has been the perception that quantitative approaches tend to be prescriptive and top-down, whereas qualitative approaches tend to be descriptive and bottom-up. It is unfortunate that many professional research contexts have inherited this division.

Recent developments in data science parallel qualitative research and, if anything, could be a starting point for collaboration. The “traditional” statistics taught in introductory statistics courses is generally top-down: it assumes that data follow a prescribed, ideal model and asks regimented questions based on that model. Within machine learning, there has been a shift toward models uniquely tailored to the data and context in question, developed and refined iteratively.[i] These trends may be breaking down the top-down nature of traditional statistical work.

If there were ever a time to integrate quantitative data science and qualitative ethnographic research, it is now. In the increasingly important “data economy,” understanding users and consumers is vital to developing strategic business practices. In the business world, both socially oriented data scientists and ethnographers are experts in understanding users and consumers, but separating them into competing groups only prevents a true synthesis of their insights. Integrating the two should involve not just combining the respective research teams and their projects but also encouraging researchers to develop expertise in both rather than specializing in one or the other. New creative energy could burst forth when we no longer treat these as distinct methodologies or specialties.


[i] Nafus, D., & Knox, H. (2018). Ethnography for a Data-Saturated World. Manchester: Manchester University Press, 11-12.

Photo credit #1: Frank V at https://unsplash.com/photos/IFLgWYlT2fI

Photo credit #2: Arif Wahid at https://unsplash.com/photos/y3FkHW1cyBE

Evaluating the Effectiveness of Part of Speech Augmentation in Next Word Predictors

The following is a project I completed for a graduate course in Artificial Intelligence that I took at the University of Memphis in the spring of 2019. For the project, I analyzed whether incorporating part of speech information could improve Markov chain-based next word predictors. In particular, I developed and tested two different strategies for incorporating part of speech predictions, which I termed excluder and multiplier. The multiplier method performed better than the excluder and matched the performance of the control. I hope this is a helpful exploration of ways to use lexical information to improve next word predictors.


Photo Credit: Brett Jordan from https://unsplash.com/photos/EvJ7uvqQb3E