Meta | Robustness of Named-Entity Replacements for In-Context Learning
Participants: Saeed Goodarzi, Nikhil Kagita, Dennis Minn, Shufan Wang, Roberto Dessi, Shubham Toshniwal, Adina Williams, Jack Lanchantin, Koustuv Sinha
Project: We analyzed the sensitivity of LLM in-context learning to the choice of named entities, and offered a simple recipe to improve test performance by tuning the named entities as a hyperparameter for a given dataset.
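The recipe above can be sketched in a few lines. This is a hypothetical illustration, not the project's actual code: `eval_with_entities` stands in for running the LLM on a dev split, and the entity names and prompt template are made up.

```python
# Hypothetical sketch: treat the named entity used in few-shot prompts as a
# hyperparameter, score each candidate on a dev split, and keep the best one.

def substitute_entity(prompt_template, entity):
    """Fill the few-shot prompt template with a candidate named entity."""
    return prompt_template.format(name=entity)

def pick_best_entity(prompt_template, candidates, eval_with_entities):
    """Score each candidate entity and return the highest-scoring one."""
    scores = {e: eval_with_entities(substitute_entity(prompt_template, e))
              for e in candidates}
    return max(scores, key=scores.get)

# Toy usage with a fake scorer (prompt length), for illustration only.
best = pick_best_entity("Q: Where does {name} live?",
                        ["Alice", "Bartholomew"],
                        lambda prompt: len(prompt))
# best == "Bartholomew"
```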
Microsoft | On Surgical Fine-tuning for Language Encoders
Participants: Abhilasha Lodha, Gayatri Belapurkar, Saloni Chalkapurkar, Yuanming Tao, Reshmi Ghosh, Samyadeep Basu, Dmitrii Petrov, Soundararajan Srinivasan
Project: For a range of downstream language tasks, fine-tuning only a subset of layers suffices to match, and often exceed, the performance of fine-tuning all layers of the language encoder. We proposed an efficient metric for selecting candidate layers and showed that it effectively identifies layers that lead to strong downstream performance.
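The selective fine-tuning step can be sketched as follows. This is a hedged illustration, not the paper's implementation: the per-layer scores stand in for whatever efficiency metric is used (e.g. a gradient-based statistic), and `select_layers`/`freeze_mask` are hypothetical helper names.

```python
# Hypothetical sketch of surgical fine-tuning: rank encoder layers by a
# per-layer metric (stand-in scores below) and fine-tune only the top-k
# layers, freezing the rest.

def select_layers(layer_scores, k):
    """Return the indices of the k highest-scoring layers, in layer order."""
    ranked = sorted(range(len(layer_scores)),
                    key=lambda i: layer_scores[i], reverse=True)
    return sorted(ranked[:k])

def freeze_mask(num_layers, trainable):
    """True = layer receives gradient updates; False = frozen."""
    chosen = set(trainable)
    return [i in chosen for i in range(num_layers)]

scores = [0.1, 0.9, 0.4, 0.7]   # toy metric values for a 4-layer encoder
top = select_layers(scores, 2)   # -> [1, 3]
mask = freeze_mask(4, top)       # -> [False, True, False, True]
```

In a real training loop, the mask would be applied by setting `requires_grad` on each layer's parameters accordingly.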
NVIDIA | Editing Common Sense in Transformers
Participants: Anshita Gupta, Debanjan Mondal, Akshay Sheshadri, Wenlong Zhao, Xiang Li, Sarah Wiegreffe, Niket Tandon
Project: We investigated whether commonsense judgments are causally associated with localized, editable parameters in Transformers. We found that directly applying the MEMIT editing algorithm yields sub-par performance in this setting, and improved it for the commonsense domain by varying the edit tokens and refining the layer-selection strategy.
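One way a layer-selection strategy for editing can work is to pick the contiguous window of layers where localization scores concentrate. The sketch below is a hypothetical illustration under that assumption; the scores stand in for per-layer localization estimates (e.g. from causal tracing), and `best_edit_window` is an invented helper, not the project's actual algorithm.

```python
# Hypothetical sketch: choose a contiguous window of `width` layers whose
# stand-in localization scores sum highest, as candidate layers to edit.

def best_edit_window(scores, width):
    """Return (start, end) indices of the highest-scoring contiguous window."""
    best_start = max(range(len(scores) - width + 1),
                     key=lambda s: sum(scores[s:s + width]))
    return best_start, best_start + width - 1

scores = [0.1, 0.2, 0.8, 0.9, 0.3]  # toy per-layer localization scores
window = best_edit_window(scores, 2)  # -> (2, 3)
```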