University of Massachusetts Amherst


Data Science: Impacting Infrastructure Investment Decisions

Assistant Professor Daniel Sheldon

I’m currently engaged in a research project sponsored by the Massachusetts Department of Transportation to assess the vulnerability of road networks in Western Massachusetts to disruptions caused by extreme weather events. This is part of a larger interdisciplinary project at UMass that is building holistic models of road and river networks (including factors such as water flow, infrastructure condition, and ecological connectivity) to provide information that helps improve the resilience of road networks and the health of river networks. Our pilot project analyzes the 665-square-mile portion of the Deerfield River watershed in Western Massachusetts to model the extent to which emergency services will be disrupted if flooding renders key road segments impassable. We are developing optimization algorithms to select infrastructure maintenance and upgrade projects that minimize this disruption in the future.


Road-stream crossings are a growing area of concern for the health of road and river networks. In particular, culverts are ubiquitous and aging infrastructure elements, which, if they do not function properly, disrupt ecological connectivity in rivers and streams and risk failing and making roads impassable during floods.


Road failures during extreme weather events cause major disruptions to commerce and public services such as emergency medical services (EMS). Investing to improve infrastructure prior to a disaster is much more cost-effective than responding in a disaster relief scenario. The state Department of Transportation (DOT) sponsored a project to address the problem, which had several goals:

  • develop an innovative systems-based approach to improve the assessment, prioritization, planning, protection and maintenance of roads and road-stream crossings
  • proactively address upgrading structures to account for climate change
  • complement existing DOT project development and bridge design processes
  • provide a decision-support tool for project planning and development.

As part of this project, we addressed a key question: what is the best way to invest infrastructure dollars prior to a disaster?


We are addressing the problem in two phases. First, we assessed the potential of each road-stream crossing in the network to disrupt road network functionality, as measured by EMS response times, if it fails. Key tasks included mapping “first responder” services (e.g., police and fire stations, hospitals), collecting and analyzing historical EMS data, using scenario analysis to “replay” old EMS incidents in different versions of the network after simulated failures, ranking road-stream crossings and road segments by disruption potential, and visualizing the results on maps. The results of this phase give MassDOT information about the disruption potential of individual culverts to guide maintenance and infrastructure planning.
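
The scenario analysis described above can be illustrated with a small Python sketch. It is not the project's code: the toy road graph, the travel times, the single station, the incident list, and all function names are hypothetical, and the 60-minute cap for unreachable incidents is an assumption. The idea is to remove one crossing at a time, recompute the time from the nearest station to each historical incident, and rank crossings by the resulting increase in average response time.

```python
import heapq

def shortest_times(graph, source):
    """Dijkstra: travel time from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def without_edge(graph, a, b):
    """Copy of an undirected graph with road segment a-b removed."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}
    g[a].pop(b, None)
    g[b].pop(a, None)
    return g

def mean_response_time(graph, stations, incidents, unreachable_cap=60.0):
    """Average time from the nearest station to each incident location."""
    per_station = [shortest_times(graph, s) for s in stations]
    total = 0.0
    for loc in incidents:
        best = min(d.get(loc, float("inf")) for d in per_station)
        total += min(best, unreachable_cap)  # assumed cap for cut-off locations
    return total / len(incidents)

def rank_crossings(graph, stations, incidents, crossings):
    """Sort crossings by the increase in mean response time if each one fails."""
    base = mean_response_time(graph, stations, incidents)
    scores = []
    for a, b in crossings:
        failed = without_edge(graph, a, b)
        delta = mean_response_time(failed, stations, incidents) - base
        scores.append(((a, b), delta))
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical toy network: travel times in minutes, one station at "S".
graph = {
    "S": {"A": 2, "B": 6},
    "A": {"S": 2, "B": 1, "C": 4},
    "B": {"S": 6, "A": 1, "C": 1},
    "C": {"A": 4, "B": 1},
}
ranked = rank_crossings(graph, ["S"], ["B", "C", "C"],
                        [("A", "B"), ("B", "C"), ("S", "A")])
for crossing, delta in ranked:
    print(crossing, round(delta, 2))
```

Replaying real incidents rather than uniformly sampled locations weights each crossing by how often the affected routes are actually needed, which is the point of using historical EMS data.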


Second, we are developing efficient optimization algorithms to help invest maintenance dollars to optimize the resilience of road networks. For this part of the project, each culvert is assigned a probability of failure, which will be based on modeling of the condition and geomorphic vulnerability done in other parts of the broader project. For each culvert, maintenance and repair actions can be performed to reduce the failure risk. The goal is to select which maintenance and repair actions to perform, given a fixed budget, to minimize the expected EMS response times after a flooding event.
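
To make the problem statement concrete, here is a minimal Python sketch of the budgeted selection problem. It uses a naive greedy baseline over sampled failure scenarios, which is not the sampling technique or primal-dual procedure developed in the project. All names, probabilities, costs, and the additive toy response-time model are hypothetical; in practice the response time for a failure scenario would come from a road network model.

```python
import random

def sample_draws(culverts, n_samples, seed=0):
    """One uniform draw per culvert per scenario. Reusing the same draws
    for every candidate plan means a repair can only remove failures."""
    rng = random.Random(seed)
    return [{c: rng.random() for c in culverts} for _ in range(n_samples)]

def expected_time(probs, draws, response_time):
    """Sample-average estimate of the expected response time."""
    total = 0.0
    for u in draws:
        failed = frozenset(c for c, p in probs.items() if u[c] < p)
        total += response_time(failed)
    return total / len(draws)

def greedy_select(fail_prob, repaired_prob, cost, budget, response_time,
                  n_samples=2000):
    """Repeatedly pick the affordable repair with the best estimated
    reduction in expected response time per dollar (baseline heuristic)."""
    draws = sample_draws(fail_prob, n_samples)
    probs = dict(fail_prob)
    chosen, spent = [], 0.0
    current = expected_time(probs, draws, response_time)
    while True:
        best, best_ratio, best_value = None, 0.0, current
        for c in fail_prob:
            if c in chosen or spent + cost[c] > budget:
                continue
            trial = dict(probs)
            trial[c] = repaired_prob[c]
            value = expected_time(trial, draws, response_time)
            ratio = (current - value) / cost[c]
            if ratio > best_ratio:
                best, best_ratio, best_value = c, ratio, value
        if best is None:
            return chosen, current
        probs[best] = repaired_prob[best]
        chosen.append(best)
        spent += cost[best]
        current = best_value

# Hypothetical toy model: each failed culvert adds a fixed delay (minutes).
PENALTY = {"c1": 10.0, "c2": 4.0, "c3": 8.0}

def toy_response_time(failed):
    return 5.0 + sum(PENALTY[c] for c in failed)

fail_prob = {"c1": 0.5, "c2": 0.3, "c3": 0.1}
repaired_prob = {"c1": 0.05, "c2": 0.05, "c3": 0.05}
cost = {"c1": 2.0, "c2": 1.0, "c3": 1.0}
chosen, expected = greedy_select(fail_prob, repaired_prob, cost,
                                 budget=3.0, response_time=toy_response_time)
print(chosen, round(expected, 2))
```

A greedy rule like this re-evaluates every candidate against every sampled scenario at each step, which becomes expensive on a realistic network and carries no optimality guarantee in general.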


We created a fast algorithm for this stochastic optimization problem by designing a novel sampling technique and a novel primal-dual procedure. Our method performs nearly optimally in benchmarks and is much more scalable than existing algorithms. These tools will inform decisions about road maintenance and lead to improved access to emergency medical services during natural disasters. We will soon be speaking with the DOT and the Massachusetts Emergency Management Agency to discuss next phases.


Our analytical platform can be applied more broadly to optimize the resilience of networks, including communication networks, social networks, financial networks, and habitat networks. Access more information about Daniel Sheldon’s research here.

Social Processes

Hanna Wallach, Adjunct Associate Professor, UMass Amherst & Senior Researcher, Microsoft Research NYC

Although my background is in machine learning, much of my recent work has been in the emergent field of computational social science. At a high level, computational social science is the study of social phenomena – i.e., phenomena that occur in social processes – using digitized information and computational and statistical methods. These social processes might be anything from in-person interactions between friends all the way up to the activities of government organizations or multinational corporations. Despite this variety, social processes possess three commonalities: First, they all have structure – i.e., who is interacting with whom. Second, they all have content – i.e., information used in or arising from these interactions. And, third, most social processes exhibit dynamics – i.e., their structure or content can change over time. My research, unlike traditional social network analysis, natural language processing, or time series analysis, therefore focuses on all three of these attributes simultaneously.


Computational social science sits at the intersection of computer science, statistics, and the social sciences. It's an inherently interdisciplinary field, with researchers from very different disciplines working together to address some common goal. In general, computer scientists care most about predictive goals – i.e., using observed data to make predictions about missing information or future, yet-to-be observed data – while social scientists care most about explanatory goals – i.e., finding probable explanations for observed data. As a computational social scientist, I focus on both predictive and explanatory goals, as well as exploration – i.e., uncovering patterns in observed data, usually patterns that we cannot directly observe. Exploratory analyses can identify and measure latent structure – e.g., topics in documents or communities in networks – which can then be used to achieve some predictive or explanatory goal.


In collaboration with political scientists, sociologists, and journalists, my research centers around developing new machine learning models that simultaneously capture the structure, content, and dynamics of social processes, while facilitating prediction, explanation, and exploration. My collaborators and I use these models to study social phenomena by analyzing publicly available data, such as email networks, document collections, press releases, meeting transcripts, event databases, and news articles. Some of my recent projects include using email networks to study the role of gender in local government, using public records requests to identify peer conformity effects in government transparency, using event databases to uncover the latent structure of international relations, and using declassified government documents to learn about world history and the US transparency process. My research contributes to machine learning, Bayesian statistics, natural language processing, public administration, political science, sociology, and journalism, as well as computational social science.


Finally, studying social phenomena using machine learning raises new ethical questions, many of which are centered around the ideas of fairness, accountability, and transparency. I therefore collaborate with researchers in public policy, science and technology studies, law, and machine learning to explore these questions.


Additional information about Hanna and her research is accessible here and here.


Data Science: Impacting Energy Decisions

Professor Prashant Shenoy

I collaborate with my colleague David Irwin on a research project with a municipal utility to help their customers become more efficient energy consumers and save money.

Working in a pilot program with Holyoke Gas & Electric, a Massachusetts utility company, and funded by the NSF and the Massachusetts Department of Energy Resources, we are analyzing electricity usage in the homes of more than 18,000 customers as recorded by smart meters. Every five minutes, the meters electronically report "occupancy information": data that shows when lights and appliances are turned on.


Occupancy information is matched with thermostat programming, leading to potential energy savings of 5 to 10 percent and millions of dollars of annual savings for customers.
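
As a rough illustration of matching occupancy information against a thermostat schedule, consider the Python sketch below. It is not the project's method: the rule for inferring occupancy (usage above a low-quartile baseline plus a fixed margin), the readings, the schedule, and the function names are all assumptions made for the example.

```python
def infer_occupancy(readings_kw, margin_kw=0.3):
    """Flag each five-minute interval as occupied when usage exceeds the
    home's baseline load (assumed: lower-quartile reading) by margin_kw."""
    baseline = sorted(readings_kw)[len(readings_kw) // 4]
    return [r > baseline + margin_kw for r in readings_kw]

def wasted_intervals(occupied, thermostat_on):
    """Intervals where the thermostat holds a comfort setpoint although
    the meter data suggests nobody is home."""
    return [i for i, (occ, on) in enumerate(zip(occupied, thermostat_on))
            if on and not occ]

# Hypothetical readings (kW) for eight consecutive five-minute intervals.
readings = [0.2, 0.2, 0.25, 0.2, 1.5, 1.8, 0.2, 0.2]
thermostat_on = [False, False, True, True, True, True, True, False]

occupied = infer_occupancy(readings)
wasted = wasted_intervals(occupied, thermostat_on)
print(wasted)  # intervals a revised schedule could set back
```

Intervals flagged this way are candidates for a thermostat setback, which is where potential savings of the kind described above would come from.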

Meter data let researchers see which appliances, lights, and heating and cooling equipment are being used during the day. The data are analyzed using resources at the nearby Massachusetts Green High Performance Computing Center, which is run by a university and industry partnership and offers high-speed data analysis on 16,000 computers. To preserve individual customer privacy, meter data are aggregated and shared in anonymized form.


The utility company is looking forward to analyzing the research findings to help customers reduce energy consumption and to identify opportunities to lower peak demand on the utility’s system to reduce customer costs.


The project has gone quite well and the data analysis looks promising. Two doctoral students are engaged in the project and will use the experience in their thesis research.


More information about my research is accessible here.