Past Projects and Publications [DRAFT] – CENTER FOR DATA SCIENCE AND ARTIFICIAL INTELLIGENCE

RECENT PUBLICATIONS

Simple Strategies to Select Layers for Fine-Tuning Language Encoders

Entities in Context

Editing Transformer Models with Common Sense Knowledge (EMNLP Conference, Dec. 2023)

PAST PROJECTS

Meta | Robustness of Named-Entity Replacements for In-Context Learning

Robustness of Named-Entity Replacements for In-Context Learning

Partner: Meta

Participants: Saeed Goodarzi, Nikhil Kagita, Dennis Minn, Shufan Wang, Roberto Dessi, Shubham Toshniwal, Adina Williams, Jack Lanchantin, Koustuv Sinha

Project: We analyzed the sensitivity of LLM in-context learning to named entities and offered a simple recipe for improving test performance: treating the named entities as hyperparameters and tuning them for a given dataset.

Microsoft | On Surgical Fine-tuning for Language Encoders

On Surgical Fine-tuning for Language Encoders

Partner: Microsoft

Participants: Abhilasha Lodha, Gayatri Belapurkar, Saloni Chalkapurkar, Yuanming Tao, Reshmi Ghosh, Samyadeep Basu, Dmitrii Petrov, Soundararajan Srinivasan

Project: Across different downstream language tasks, fine-tuning only a subset of layers is sufficient to obtain performance that is close to, and often better than, fine-tuning all the layers in the language encoder. We proposed an efficient metric for selecting candidate layers for selective fine-tuning and showed that it effectively selects layers that lead to strong downstream performance.

NVIDIA | Editing Transformer Models with Common Sense Knowledge

Editing Transformer Models with Common Sense Knowledge

Partner: NVIDIA

Participants: Anshita Gupta, Debanjan Mondal, Akshay Sheshadri, Wenlong Zhao, Xiang Li, Sarah Wiegreffe, Niket Tandon

Project: We investigated whether commonsense judgments are causally associated with localized, editable parameters in Transformers. We found that directly applying the MEMIT editing algorithm yields sub-par performance, and we improved it for the commonsense domain by varying the edit tokens and improving the layer selection strategy.