AMHERST, Mass. – One of the newest, most challenging problems facing experts in artificial intelligence (AI) and machine learning is called “explainability,” says David Jensen at the University of Massachusetts Amherst. However, it’s a problem he is delighted to have because it means that machine learning systems can now learn such complex models of the world that the scientists who wrote them have a tough time understanding what the machines have learned.
He says, “We built them, now we have to figure out what they learn. With the enormous expansion of deep neural networks recently and the availability of huge, detail-rich data sets, there has been an explosion of capability in machine learning and AI. But at the same time, the workings inside the ‘black box’ – what these systems learn and how they make decisions – can be incredibly opaque. So ‘explainability’ has become absolutely vital.”
“There is a profound scientific goal here of being able to explain how deep neural networks function,” he adds. “Right now they are learning things we did not know they were going to learn.”
To support this work, Jensen, a professor in UMass Amherst’s College of Information and Computer Sciences, was recently awarded a four-year, $1.4 million grant by the Defense Advanced Research Projects Agency (DARPA) as a subcontractor on a project led by Charles River Analytics, Inc. of Boston.
The researchers state in their proposal, “Our approach is based on the notion of using causality to describe the training and operation of different machine learning techniques so that operational users can understand, trust and correctly interpret the results from these complex and mission-critical tools.”
Jensen adds, “The problem of explainability is fascinating. On the one hand, the field of machine learning has recently made big advances in how to automatically learn complex models. On the other hand, that’s left us with a major new challenge of explaining how these models work, and this ability to explain is extremely important.” For example, he notes, “Self-driving cars are coming soon, and they are certain to use machine learning. They’ll be trained on millions of hours of real and simulated driving so they can operate safely.”
“Let’s say our autonomous vehicle suddenly swerves and runs into a lamppost. We need to know why. It could be that it tried to avoid hitting a small child who looked ready to run into the street. Later, it turns out that the running ‘child’ was actually a fire hydrant. That’s important to know, and the system clearly needs more training. Or maybe there was a child. Then that was an OK decision and we may not need to correct anything. This kind of information is important for many people, including the scientists creating new methods for machine learning, engineers designing systems that use those methods, and for users who are riding in those cars.”
He adds, “Right now, these systems, which were trained by data and not engineered by a human hand, can be amazingly opaque and complicated. We are studying how to provide a chain of reasoning that is understandable, and that leads to a valid understanding of the decisions they make.”
In teaching machines to detect pedestrians, for example, Jensen says he and colleagues will use a large data set of images showing real people walking in crosswalks. They’ll then run thousands or millions of experiments in which single variables or small sets of variables are changed slightly. “We can make the people tall or short, have them wearing a backpack or carrying something, or put them in rain, fog or snow,” he points out. “We can take multiple real images and blend them, or create synthetic people and place them in realistic scenes.”
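The kind of experiment Jensen describes, changing one variable while holding everything else fixed, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_detector` is a hypothetical stand-in for a trained pedestrian detector, and the scene attributes (`height_cm`, `backpack`, `weather`) and score penalties are invented, not taken from the project.

```python
import random


def toy_detector(scene):
    """Hypothetical stand-in for a trained pedestrian detector.

    Returns a detection score in [0, 1]. The penalties are invented
    for illustration and do not come from any real system.
    """
    score = 0.9
    if scene["weather"] == "fog":
        score -= 0.4
    if scene["height_cm"] < 140:
        score -= 0.2
    if scene["backpack"]:
        score -= 0.05
    return max(0.0, min(1.0, score))


def intervention_effect(detector, base_scenes, variable, value):
    """Average change in score when `variable` is forced to `value`
    in every scene while all other attributes stay unchanged."""
    total = 0.0
    for scene in base_scenes:
        altered = dict(scene)      # copy, so the original scene is untouched
        altered[variable] = value  # the single intervention
        total += detector(altered) - detector(scene)
    return total / len(base_scenes)


random.seed(0)
base_scenes = [
    {
        "height_cm": random.choice([120, 160, 180]),
        "backpack": random.choice([True, False]),
        "weather": "clear",
    }
    for _ in range(1000)
]

fog_effect = intervention_effect(toy_detector, base_scenes, "weather", "fog")
backpack_effect = intervention_effect(toy_detector, base_scenes, "backpack", True)
print("fog:", fog_effect, "backpack:", backpack_effect)
```

Because each experiment alters exactly one attribute, any change in the score can be attributed to that attribute. In this toy setup, fog lowers the score far more than a backpack does.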
The investigators will use techniques analogous to what neuroscientists use to study neurons in the brain, Jensen says. Figuring out which inputs produce “high activation” in the system will provide clues about its decision making. “We want to find out how characteristics of the image affect high-level constructed features that show up very deep in the neural network,” he explains. “There might be a ‘red backpack recognition node,’ for example. This process will help us to develop a causal model of how the system works and to generate an explanation of why it makes certain decisions.”
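As a rough illustration of the “high activation” idea, the Python sketch below probes a tiny hand-built layer and ranks inputs by how strongly they activate one hidden unit. It is an assumption-laden toy: the weights, the feature encoding (`[red, backpack_shape, height]`) and the helper names are hypothetical, and a real deep network would have learned weights and far more units.

```python
def relu(x):
    """Rectified linear unit, a common neural network activation."""
    return max(0.0, x)


# Hypothetical hand-set weights; each row is one hidden unit over the
# input features [red, backpack_shape, height].
HIDDEN_WEIGHTS = [
    [1.0, 1.0, 0.0],  # unit 0: fires only when red AND backpack shape co-occur
    [0.0, 0.0, 1.0],  # unit 1: tracks height
]
HIDDEN_BIAS = [-1.0, 0.0]


def hidden_activations(features):
    """Activation of every hidden unit for one input feature vector."""
    return [
        relu(sum(w * f for w, f in zip(row, features)) + bias)
        for row, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIAS)
    ]


def top_activating(inputs, unit, k=1):
    """Names of the k inputs that most strongly activate `unit`."""
    scored = [(hidden_activations(feats)[unit], name)
              for name, feats in inputs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical probe inputs, each described by the three features above.
inputs = {
    "red_backpack": [1.0, 1.0, 0.5],
    "blue_backpack": [0.0, 1.0, 0.5],
    "red_coat": [1.0, 0.0, 0.5],
    "no_bag": [0.0, 0.0, 0.9],
}

print(top_activating(inputs, unit=0))
print(top_activating(inputs, unit=1))
```

Ranking probe inputs this way suggests what a unit responds to. Here unit 0 behaves like the “red backpack recognition node” in Jensen’s example, since neither redness nor a backpack shape alone is enough to activate it.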
Another reason to seek explainability is that “sometimes a system finds a really crazy way to reach its goal, with a failure that we never saw coming,” the AI expert says. “We need to know the internals so our systems don’t have unexpected failure modes. This is deeply important for practical and scientific reasons. There are many applications in which we care deeply about the ‘why’ of the decisions, in fields such as medicine, hiring, law, and fraud detection, as well as computer science.”
The researchers hope over the four years to develop methods that in a short time can explain any new deep neural network. “It’s an amazing time to be in machine learning,” Jensen says. “There are huge amounts of data available, a lot of scientific progress and our students are getting exciting new jobs. We are entering a new age of AI applications, and machine learning is the core technology that is enabling this transition. The current generation of AI systems offer many benefits, but their effectiveness is limited by their inability to explain their own reasoning. We’re going to fix this.”