Abstract: Today's finance industry is continuously searching for alpha, both through more advanced modeling and through alternative sources of data. Many alternative sources of data are not raw numerical data but natural documents: human-readable documents containing tables, graphics, and text. In this talk, I will present an overview of how we use NLP at Bloomberg to extract information from natural documents, and highlight some research challenges. I will also present a case study of how such information can be combined with market data for better predictive modeling.
Amanda Stent is an NLP architect in the data science group in the office of the CTO at Bloomberg LP. Previously, she was a director of research and principal research scientist at Yahoo Labs, a principal member of technical staff at AT&T Labs - Research, and an associate professor in the Computer Science Department at Stony Brook University. Her research interests center on natural language processing and its applications. She holds a PhD in computer science from the University of Rochester. She is co-editor of the book Natural Language Generation in Interactive Systems (Cambridge University Press), has co-authored over 100 papers on natural language processing, and is co-inventor on over 25 patents and patent applications.