Community nonprofits often collect or have access to large amounts of data but lack the staffing, resources, and expertise to analyze and make sense of it. The Data Science WAV (Wrangle, Analyze, Visualize) project, or DSC-WAV, aims to address this issue. DSC-WAV is a National Science Foundation-funded project with two goals: to help community-based and nonprofit organizations tackle data science problems, and to give real-world experience to undergraduate students studying data science. In the Spring 2020 semester, undergraduate students from Amherst College, Hampshire College, Mount Holyoke College, Smith College, and UMass Amherst worked in teams on projects with community partners including Western Mass Health Equity Network (WMHEN), Girls Inc. of the Pioneer Valley, VentureWell, and The Nature Conservancy.
Jaime Dávila, senior teaching faculty at the UMass Amherst College of Information and Computer Sciences, was the principal investigator for the WMHEN team. WMHEN, housed in the UMass School of Public Health and Health Sciences, “seeks regional strategies and opportunities to create conditions in which communities are able to attain the highest level of health for all residents.” Working closely with WMHEN staff, Dávila and the team focused on health disparities between western and eastern Massachusetts. Using a combination of Centers for Disease Control and Prevention (CDC) and County Health Rankings data, the students analyzed mortality rates and socio-demographic variables for each county in Massachusetts.
For Risa Silverman, WMHEN director, working with the team was an important start to learning about each other’s disciplines, which will facilitate future cross-discipline collaborations. “It was valuable to connect our community public health work with data science students and spend time figuring out what we could work on together that would benefit us all,” said Silverman.
The team proposed three questions for further research:
How can WMHEN get access to confidential CDC data that isn’t released to the general public?
What are the racial disparities in health characteristics across the state, and how do they compare by region?
How might analyses like these, which rely on binary data, affect non-binary communities?
While DSC-WAV serves undergraduate students, the Center for Data Science’s Data Science for the Common Good™ (DS4CG) program employs graduate students for a similar mission: to train data science students to work on real-world problems that benefit the common good. DS4CG is a summer program that currently has projects running with the Appalachian Mountain Club, AuCode, the Veterans Administration, and the UMass Classics department. A report on the accomplishments of these teams will be published in the fall.