Although my background is in machine learning, much of my recent work has been in the emergent field of computational social science. At a high level, computational social science is the study of social phenomena – i.e., phenomena that occur in social processes – using digitized information and computational and statistical methods. These social processes might be anything from in-person interactions between friends all the way up to the activities of government organizations or multinational corporations. Despite this variety, social processes possess three commonalities: First, they all have structure – i.e., who is interacting with whom. Second, they all have content – i.e., information used in or arising from these interactions. And, third, most social processes exhibit dynamics – i.e., their structure or content can change over time. My research, unlike traditional social network analysis, natural language processing, or time series analysis, therefore focuses on all three of these attributes simultaneously.
Computational social science sits at the intersection of computer science, statistics, and the social sciences. It's an inherently interdisciplinary field, with researchers from very different disciplines working together toward common goals. In general, computer scientists care most about predictive goals – i.e., using observed data to make predictions about missing information or future, yet-to-be-observed data – while social scientists care most about explanatory goals – i.e., finding probable explanations for observed data. As a computational social scientist, I focus on both predictive and explanatory goals, as well as exploration – i.e., uncovering patterns in observed data, usually patterns that we cannot directly observe. Exploratory analyses can identify and measure latent structure – e.g., topics in documents or communities in networks – which can then be used to achieve some predictive or explanatory goal.
In collaboration with political scientists, sociologists, and journalists, my research centers on developing new machine learning models that simultaneously capture the structure, content, and dynamics of social processes, while facilitating prediction, explanation, and exploration. My collaborators and I use these models to study social phenomena by analyzing publicly available data, such as email networks, document collections, press releases, meeting transcripts, event databases, and news articles. Some of my recent projects include using email networks to study the role of gender in local government, using public records requests to identify peer conformity effects in government transparency, using event databases to uncover the latent structure of international relations, and using declassified government documents to learn about world history and the US transparency process. My research contributes to machine learning, Bayesian statistics, natural language processing, public administration, political science, sociology, and journalism, as well as computational social science.
Finally, studying social phenomena using machine learning raises new ethical questions, many of which center on the ideas of fairness, accountability, and transparency. I therefore collaborate with researchers in public policy, science and technology studies, law, and machine learning to explore these questions.