Faculty Host: Akshay Krishnamurthy
Abstract: For complex diseases like depression, choosing a successful treatment from several possible drugs remains a trial-and-error process in current clinical practice. By applying statistical machine learning to the electronic health records of thousands of patients, can we discover subtypes of disease that improve both population-wide understanding and patient-specific drug recommendations?
One popular approach is to represent noisy, high-dimensional health records as mixtures of low-dimensional subtypes via a probabilistic topic model. I will introduce this common dimensionality reduction method and explain how off-the-shelf topic models are misspecified for downstream prediction tasks across many domains from text analysis to healthcare. To overcome these poor predictions, I will introduce a new framework -- prediction-constrained training -- which learns interpretable topic models that offer competitive drug recommendations. I will also discuss open challenges in using machine learning to improve clinical decision-making.
Bio: I am currently a postdoctoral fellow in computer science at Harvard SEAS, advised by Prof. Finale Doshi-Velez. We are actively exploring applications of machine learning to clinical medicine, especially combination therapies for major depression and interventions in the Intensive Care Unit (ICU).
Our recent methods cover two exciting areas of core ML research: (1) Semi-supervised learning: We have new objectives for training semi-supervised latent variable models that can simultaneously discover disease subtypes and suggest useful treatments. (2) Explainable AI: Our upcoming AAAI '18 paper shows how to optimize deep neural networks to have more interpretable decision boundaries, especially for clinical tasks.
I completed my Ph.D. in computer science at Brown University in May 2016, advised by Prof. Erik Sudderth. My thesis studied large-scale unsupervised clustering problems like organizing every New York Times article from the last 20 years or segmenting the human genome to find patterns in the epigenetic modifiers that amplify or inhibit expression. My technical focus was developing reliable non-convex optimization algorithms for a broad family of Bayesian nonparametric models that include mixtures, topic models, sequential models, and relational models. We have released an open-source Python package called BNPy. Please try it out!
A reception for attendees will be held at 3:30 in CS 150 (the back of the presentation room).