Industrial data scientists conducting basic and applied research; data science faculty and students from UMass Amherst and the Five College consortium; current and prospective data science project managers and technical team leaders; research sponsors and consumers of data science research results.
The Alchemy Fund, Amazon, Appthority, Aptima, BAE Systems, Bloomberg, Burning Glass, C90, CA Technologies, CNE Direct, Colaberry Inc, Comcast, Cooley Dickinson Health Care, Court Square Group, Data Engines, EBSCO, Filtered.ai, Foundation Center, Genpact, Google, IBM, Intel, IPS, Kronos, Lexalytics, Lumme, MachineMetrics, MassMutual, MassTech Collaborative, MathWorks, MGHPCC, MiCoachee, Microsoft, MITRE, NVIDIA, Ofcounsels, OptumInsight, Oracle Corp, Oracle Labs, PSL Systems, Quantiphi, Raytheon-BBN Technologies, Springfield Venture Fund, State Street, USC ISI, Valley CDC, The Vanguard Group, VentureWell, Verizon, Voya, Wayfair
9:00 A.M. Welcome
9:15 A.M. Lightning Talks
10:45 A.M. BREAK
11:00 A.M. Lightning Talks
12:15 P.M. LUNCH (Campus Center Auditorium)
1:30 - 4:00 P.M. Industry Meet & Greet Session
1:30 P.M. R&D Dives - Session 1 (descriptions below)
2:30 P.M. Break
3:00 P.M. R&D Dives - Session 2
4:00 - 5:00 P.M. Reception
R&D Dives 1:30 - 2:30 P.M.
Addressing Bias and Discrimination in Software Systems
Yuriy Brun, Brendan O’Connor, & Philip Thomas
Software impacts society in many ways, including by making automated decisions. Unfortunately, software can exhibit bias in its operation, for example because it relies on data that itself exhibits bias, or because subsystems interact in unintended ways. This session will focus on recent research on detecting bias in software, data, and algorithms, with the aim of producing methods for reducing such bias and creating fairer, more accountable, and more transparent systems.
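One common check in bias audits of decision-making software is demographic parity: whether a system produces positive decisions at similar rates across groups. The sketch below is illustrative only (the function names, data, and threshold for "bias" are assumptions, not from the session itself):

```python
# Hypothetical sketch: measuring demographic parity, one simple notion of
# bias an audit of automated decisions might check. Data is illustrative.

def positive_rate(decisions):
    """Fraction of decisions that are positive (e.g., loan approved)."""
    return sum(decisions) / len(decisions)

def demographic_parity_gap(decisions_group_a, decisions_group_b):
    """Absolute difference in positive-decision rates between two groups.
    A large gap suggests the system may treat the groups differently."""
    return abs(positive_rate(decisions_group_a) - positive_rate(decisions_group_b))

# Example: 1 = approved, 0 = denied
group_a = [1, 1, 0, 1, 1, 0, 1, 1]   # 75% approved
group_b = [1, 0, 0, 1, 0, 0, 1, 0]   # 37.5% approved
print(demographic_parity_gap(group_a, group_b))  # 0.375
```

A gap alone does not prove unfairness, which is why the research described here also examines the data and subsystem interactions behind such disparities.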
Blockchain Theory, Methods, and Security
Blockchains are a novel platform for supporting a variety of applications in a public setting. In this talk, I will explain how blockchains work, and why and when they are secure. I will offer my perspective on the applications and opportunities where they shine. In short, blockchains can be viewed as a secure cloud platform whose novelty arises from a unique synergy of technical mechanisms and economic markets. I will also summarize some of my recent work on improving blockchain performance and security.
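At the core of how blockchains work is a chain of hashes: each block commits to the hash of its predecessor, so rewriting any past block invalidates everything after it. A minimal sketch of just that idea (block fields and names are illustrative; real blockchains add consensus, signatures, and much more):

```python
import hashlib
import json

def block_hash(block):
    """Hash a block's contents deterministically."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def make_block(data, prev_hash):
    return {"data": data, "prev_hash": prev_hash}

def valid_chain(chain):
    """Verify each block's prev_hash matches the hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

genesis = make_block("genesis", prev_hash="0" * 64)
b1 = make_block("alice pays bob 5", block_hash(genesis))
b2 = make_block("bob pays carol 2", block_hash(b1))
chain = [genesis, b1, b2]
print(valid_chain(chain))       # True

genesis["data"] = "tampered"    # rewriting history...
print(valid_chain(chain))       # False: b1's prev_hash no longer matches
```

This tamper-evidence is the technical half of the story; the economic half (incentives for honest validation) is what the talk's "synergy of technical mechanisms and economic markets" refers to.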
Natural Language Interpretation, Representation and Reasoning with Deep Learning
Mohit Iyyer & Andrew McCallum
We want to build large-scale knowledge bases containing entities and relations. Work in knowledge representation and knowledge bases has long struggled to design schemas of entity and relation types that capture the desired balance of specificity and generality while also supporting reasoning and information integration from various sources of input evidence. In this talk I will describe our work on "universal schema," a deep learning approach to knowledge representation in which we operate on the union of all input schemas (from structured KBs to natural language textual patterns) while also supporting integration and generalization by learning vector embeddings whose neighborhoods capture semantic implicature. I will also discuss our work on (a) question answering with chains of reasoning, using reinforcement learning to guide the efficient search for meaningful chains, (b) learning and using large ontologies, and (c) common-sense representation by learning box-shaped embeddings.
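The core scoring idea in universal schema can be pictured as entity pairs and relation patterns, whether from a structured KB or from text, sharing one embedding space, with a dot product scoring how plausibly a relation holds for a pair. A toy sketch (the vectors below are random stand-ins, not learned embeddings, and all names are assumptions):

```python
import numpy as np

# Illustrative only: in the real approach these vectors are learned from
# co-occurrence of entity pairs with KB relations and textual patterns.
rng = np.random.default_rng(0)
dim = 8

pair_vecs = {("UMass", "Amherst"): rng.normal(size=dim)}
relation_vecs = {
    "located_in": rng.normal(size=dim),        # structured-KB relation
    "X is based in Y": rng.normal(size=dim),   # natural-language pattern
}

def score(pair, relation):
    """Higher score = relation judged more plausible for this entity pair."""
    return float(pair_vecs[pair] @ relation_vecs[relation])

s = score(("UMass", "Amherst"), "located_in")
```

Because KB relations and textual patterns live in the same space, a pattern observed in text can provide evidence for a KB relation that was never explicitly stated, which is the generalization the abstract describes.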
The advent of conversational agents such as Amazon Echo and Google Home has increased the need for question answering systems capable of rich contextual understanding. In this session, I will discuss two projects involving contextual QA: (1) answering sequences of questions about semi-structured tables, and (2) answering questions about unstructured text in a teacher-student setting. In addition to explaining how we build deep learning models for these tasks that leverage the conversational history to produce answers, I will discuss some limitations of these models in their ability to handle world knowledge and commonsense reasoning. I will conclude by discussing general-purpose extensions for deep learning models trained for natural language processing tasks that can lessen these issues and, in turn, improve performance on downstream tasks.
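One simple way a contextual QA model can "leverage the conversational history" is to fold earlier turns into the model's input, so references like "its" can be resolved. A hypothetical sketch (the separator token and input format are assumptions, not the session's actual models):

```python
# Hypothetical input construction for a contextual QA model: prior
# (question, answer) turns are prepended to the current question.
SEP = " [SEP] "  # assumed separator token

def build_input(history, question):
    """Concatenate prior (question, answer) turns with the new question."""
    turns = []
    for q, a in history:
        turns.append(q)
        turns.append(a)
    return SEP.join(turns + [question])

history = [("Which country hosted the 2016 Olympics?", "Brazil")]
print(build_input(history, "What is its capital?"))
# Which country hosted the 2016 Olympics? [SEP] Brazil [SEP] What is its capital?
```

The model sees the whole string, so "its" can be grounded in "Brazil"; the limitations discussed in the session arise when the needed knowledge is not in the history or the text at all.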
R&D Dives 3:00 - 4:00 P.M.
Explainable Artificial Intelligence: Opportunities and Challenges for Industry
Advances in artificial intelligence increasingly power critical infrastructure across nearly all areas of human activity, including healthcare, transportation, security, finance, education, media, and government. However, many of the most advanced and capable systems are largely opaque, having been developed by applying machine learning techniques to large amounts of data. This has spurred interest in explainable artificial intelligence, a class of techniques intended to automatically provide accurate explanations of the decisions produced by complex AI systems. In this talk, I will define explainable AI and discuss the goals of current research and development. I will identify various technical approaches being taken by researchers working in the area and provide examples of some of the most recent results. I will identify and discuss some common myths about explainable AI, and I will describe key challenges for applications of these technologies.
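One of the technical approaches in this area is post-hoc, model-agnostic explanation. A hedged sketch of permutation importance, one such technique: measure how much accuracy drops when a single feature's values are permuted; a large drop suggests the model relies on that feature. The toy model and data are illustrative, and a deterministic permutation (reversing the column) stands in for random shuffling:

```python
def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature_idx):
    """Accuracy drop after permuting one feature column."""
    base = accuracy(model, X, y)
    column = [row[feature_idx] for row in X][::-1]  # deterministic permutation
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, column)]
    return base - accuracy(model, X_perm, y)

# Toy model: predicts 1 iff feature 0 is positive (ignores feature 1)
model = lambda row: int(row[0] > 0)
X = [[1, 5], [-1, 5], [2, 5], [-2, 5]]
y = [1, 0, 1, 0]

print(permutation_importance(model, X, y, feature_idx=0))  # 1.0: heavily used
print(permutation_importance(model, X, y, feature_idx=1))  # 0.0: ignored
```

Such scores explain what the model depends on, not why those dependencies are justified; bridging that gap is one of the open challenges the talk addresses.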
Analyzing Data Privately: Challenges and Recent Approaches
The benefits of data science rely on access to accurate and descriptive data, often about individuals. But data scientists also have a responsibility to respect the privacy of those in the populations they study. To protect privacy, we typically seek to prevent the disclosure of specific facts about individuals, yet provide utility by maintaining accurate aggregate properties of the population.
Achieving this balance is a challenge. In this talk I will briefly review some common pitfalls in protecting privacy and then describe Differential Privacy, a rigorous privacy definition that provides a formal foundation for understanding when privacy and utility goals are compatible.
Differential privacy is slowly moving from theory into practice. Examples of real-world deployments are growing, including at major companies and government institutions. Nevertheless, many obstacles remain to achieving "data science with privacy." Using examples from real-world applications, I will explain some of the technical challenges involved in providing privacy guarantees when carrying out data science, and I will consider to what extent we can (or must) ask data scientists to change their practices and expectations when privacy must be respected.
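A standard way to answer a counting query under differential privacy is the Laplace mechanism: add noise drawn from a Laplace distribution with scale equal to the query's sensitivity divided by the privacy parameter epsilon. A minimal sketch (function name and parameters are illustrative, not from any particular library):

```python
import numpy as np

def laplace_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return a differentially private count.

    One individual changes a count by at most `sensitivity`, so Laplace
    noise with scale sensitivity/epsilon yields epsilon-differential privacy.
    """
    rng = rng if rng is not None else np.random.default_rng()
    scale = sensitivity / epsilon
    return true_count + rng.laplace(loc=0.0, scale=scale)

# Smaller epsilon means stronger privacy but a noisier answer.
noisy = laplace_count(true_count=1234, epsilon=0.5)
```

The accuracy/privacy trade-off is governed by epsilon, and repeated queries consume the privacy budget, one of the practical obstacles to "data science with privacy" noted in the abstract.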
Center for Data Science Industry Partners
The Center for Data Science is grateful for the support provided by its industry affiliate partners, listed above.