This week's event will take the form of whiteboard presentations. Presenters will stand at the whiteboards and describe their latest ideas and recent work. Multiple presentations will run at the same time, similar to a poster session. It will be a fun, discussion-based Data Science Tea where students exchange ideas: a great forum for presenters to get feedback and for the audience to ask questions. Students can bring their own work to the event and present on the spot! Alternatively, let me know by email (email@example.com) if you are interested in presenting.
Presenter: Myung-ha Jang (PhD Student Advised by Prof. James Allan)
Title: Probabilistic Approaches to Automated Controversy Detection
Abstract: Recently, the problem of automated controversy detection has attracted a lot of interest in the information retrieval community. Existing approaches to this problem have set forth a number of detection algorithms, but there has been little effort to model the probability of controversy in a document directly. In this paper, we propose a probabilistic framework to detect controversy on the web, and investigate two models. We first introduce a state-of-the-art controversy detection algorithm and recast it into a model in our framework. Based on insights from social science research, we also introduce a language modeling approach to this problem. We evaluate different methods of creating controversy language models based on a diverse set of public datasets including Wikipedia, Web and News corpora.
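The abstract does not spell out how the controversy language models are built or scored, so the following is only a minimal sketch of one common way to use language models for this kind of decision: compare a document's likelihood under a "controversy" unigram model against a background model. The function names, the add-alpha smoothing, and the toy documents are all illustrative assumptions, not the authors' method or data.

```python
import math
from collections import Counter


def train_unigram(docs):
    """Count word frequencies over a list of tokenized documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc)
    return counts


def log_prob(word, counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed unigram log-probability (smoothing choice is an assumption)."""
    total = sum(counts.values())
    return math.log((counts[word] + alpha) / (total + alpha * vocab_size))


def controversy_score(doc, controversy_counts, background_counts, vocab_size):
    """Log-likelihood ratio: positive values mean the document looks more
    like the controversy model than the background model."""
    return sum(
        log_prob(w, controversy_counts, vocab_size)
        - log_prob(w, background_counts, vocab_size)
        for w in doc
    )


# Toy, hand-made training data (hypothetical, for illustration only).
controversial_docs = [["abortion", "debate", "protest"],
                      ["debate", "dispute", "protest"]]
background_docs = [["weather", "recipe", "garden"],
                   ["garden", "travel", "recipe"]]

vocab = {w for d in controversial_docs + background_docs for w in d}
c_counts = train_unigram(controversial_docs)
b_counts = train_unigram(background_docs)

score_a = controversy_score(["debate", "protest"], c_counts, b_counts, len(vocab))
score_b = controversy_score(["garden", "recipe"], c_counts, b_counts, len(vocab))
print(score_a, score_b)
```

A document would be flagged as controversial when its score exceeds some threshold; where that threshold sits, and how the controversy model's training documents are chosen, are exactly the kinds of design questions the abstract says the work evaluates.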
Presenter: Haw-Shiuan Chang (PhD Student Advised by Prof. Andrew McCallum)
Title: Active Learning for Universal Schema
Abstract: Universal Schema is an effective approach to extracting relations from documents. However, insufficient training data hurts its performance on this task. This motivates us to develop active learning algorithms for Universal Schema. In this presentation, I will talk about why the task is challenging and about our preliminary ideas for overcoming these challenges. We have not yet implemented our algorithm or derived any theoretical guarantees. Any feedback or suggestions would be highly appreciated.
Presenter: Liu Yang (PhD Student Advised by Prof. W. Bruce Croft)
Title: aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model
Abstract: As an alternative to question answering methods based on feature engineering, deep learning approaches such as convolutional neural networks (CNNs) and Long Short-Term Memory models (LSTMs) have recently been proposed for semantic matching of questions and answers. To achieve good results, however, these models have been combined with additional features such as word overlap or BM25 scores. Without this combination, these models perform significantly worse than methods based on linguistic feature engineering. In this paper, we propose an attention-based neural matching model (aNMM) for ranking short answer texts. We adopt a value-shared weighting scheme instead of a position-shared weighting scheme for combining different matching signals, and we incorporate question term importance learning using a question attention network. Using the popular TREC QA benchmark data, we show that the relatively simple aNMM model can significantly outperform other neural network models that have been used for the question answering task, and is competitive with models that are combined with additional features. When aNMM is combined with additional features, it outperforms all baselines. This work is published as a full paper in CIKM'16 (download: http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1240).
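To make the two ideas named in the abstract more concrete, here is a rough sketch of value-shared weighting combined with question attention. It assumes that matching signals are cosine similarities between question and answer term vectors, that signals are grouped into equal-width bins over [-1, 1], and that all weights are hand-picked rather than learned. The function names, bin layout, and toy vectors are illustrative assumptions; the published model may differ in these details.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def bin_index(sim, num_bins):
    """Map a similarity in [-1, 1] to one of num_bins equal-width bins."""
    idx = int((sim + 1.0) / 2.0 * num_bins)
    return min(max(idx, 0), num_bins - 1)


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def match_score(q_vecs, a_vecs, bin_weights, attention_vec):
    """Score a question/answer pair.

    Value-shared weighting: a weight is tied to a similarity range (bin),
    not to a position in the answer, so answers of any length can be scored.
    Question attention: a softmax gate decides how much each question
    term contributes to the final score.
    """
    num_bins = len(bin_weights)
    term_scores = []
    for q in q_vecs:
        bin_sums = [0.0] * num_bins
        for a in a_vecs:
            sim = cosine(q, a)
            bin_sums[bin_index(sim, num_bins)] += sim
        z = sum(w * s for w, s in zip(bin_weights, bin_sums))
        term_scores.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid
    gates = softmax([sum(x * y for x, y in zip(attention_vec, q)) for q in q_vecs])
    return sum(g * h for g, h in zip(gates, term_scores))


# Toy 2-d "embeddings" and hand-picked weights (hypothetical values).
question = [[1.0, 0.0], [0.0, 1.0]]
good_answer = [[1.0, 0.0], [0.0, 1.0]]
bad_answer = [[-1.0, 0.0], [0.0, -1.0]]
bin_weights = [0.2, 0.5, 1.0]
attention_vec = [0.5, 0.5]

good = match_score(question, good_answer, bin_weights, attention_vec)
bad = match_score(question, bad_answer, bin_weights, attention_vec)
print(good, bad)
```

Because the weights are indexed by similarity range, the same small weight vector applies no matter how many terms the answer contains, which is the practical appeal of value-shared over position-shared weighting for variable-length text.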