AMHERST — These students already know the Pioneer Valley Transit Authority buses pretty well. They know certain buses tend to run a few minutes behind. They know that on weekend evenings, leaving downtown Amherst, they’ll have to squeeze.
And sometimes, if traveling from one spot on the University of Massachusetts campus to another, it can be faster to walk.
These are patterns they’ve gleaned from regular ridership, from taking the bus to class in the morning and home after long nights of studying. But this weekend they got to know the PVTA in a new way — through data.
In an event organized by the UMass Graduate Researchers interested in Data club, or GRiD, students from the Five Colleges tried to make sense of hundreds of thousands of rows of numbers, asking and attempting to answer questions about how the PVTA operates and whether there are ways it can improve.
On Saturday afternoon, around 20 students spread out with their laptops in a lounge on the 16th floor of the Lederle Graduate Research Tower, plotting data and writing code. These were some of the 89 students who signed up for the hackathon, a term for an increasingly popular event in the software world, where computer programmers and others come together to collaborate on a problem.
The students began at 6 p.m. Friday, some splitting into teams and others working individually. They had until noon Sunday to come up with a project related to the data, at which time they would have the opportunity to present their findings to a panel of judges.
“We all take the bus, we all move around during the school day,” said GRiD member Ankita Shankhdhar, 24, who’s pursuing a master’s degree in applied mathematics at UMass. “How can we make that experience better?”
Students said the PVTA has already done a lot to improve service, citing the mobile app and the display screens with real-time bus information scattered across campus as positive strides. Largely, they said, the buses tend to be reliable. But some said it would be nice to be able to predict how late a certain bus will be at a given time of day, so they could plan better. Others said they wanted to know when buses were most crowded.
GRiD co-chair Mark Hagemann, 28, of Hurley, Wisconsin, coordinated with Josh Rickman, PVTA manager of operations and planning, who agreed to share the data. Working with GPS, time and ridership data from January through December 2015, students used various software programs and coding languages to see if they could answer questions about the way the buses work.
UMass graduate students Pradeep Ambati and Ravi Choudhary were using the computer language Python to track historical delays at different stops. Ideally, their product would be able to tell students rushing to get to class whether they should stay on the bus or hop off and walk, said Choudhary, 27, of Rajasthan, India.
As someone whose research focuses on machine learning, Ambati, 25, of Hyderabad, India, said he was interested in exploring “what kind of answers you can get from the data.”
Other students echoed this interest.
Betsy Camano, 23, a UMass graduate student from New York, said she was eager to play around with such a large data set, something that went way beyond the scope of what she had seen in school. One of her classes uses a sample “Titanic” data set, with the names of passengers and fares people paid, but that seemed small and airbrushed compared to the PVTA data, she said.
“In classes it’s always the perfect data,” said Hampshire College senior Eddie Pantridge, 21, of Needham, noting that a real data set comes with “insane obstacles you wouldn’t have expected.”
Not only was the data set large and unruly, sometimes crashing computer programs, but it also contained several outliers, like the number that suggested a bus was traveling 90 mph in downtown Amherst, or the coordinates that put a bus in South America, he said.
Pantridge was trying to find a way to display all the buses on a map. And suddenly he had a breakthrough — high-fiving the student sitting next to him.
“There’s something really beautiful about how you can take a bunch of numbers in a table and make it so anyone can see what they mean,” said 19-year-old Brooke Fitzgerald, of Twin Falls, Idaho, a sophomore at Hampshire studying applied mathematics and statistics.
Steven Kalt, 21, of Shutesbury, who’s studying environmental studies and French at Amherst College, had a more practical reason for attending: “I’d like to get a job someday.”
As hackathons have become increasingly popular, he said, they’ve become an important way to build a resume.
Tom Jeon, 21, of Lexington, who’s studying data science at UMass, said he thinks the hype is well-deserved. He cited the recent film “The Big Short,” which chronicles how data were used to predict the financial crisis of 2008.
“Data is the key to learning whatever you want to learn,” he said. “You can predict the future.”
That future could be how the housing bubble bursts or what time the bus arrives — it all depends on what data you have and how you read it.
By STEPHANIE McFEETERS Daily Hampshire Gazette
Stephanie McFeeters can be reached at email@example.com.