Entities in Context


Partner: Meta Platforms Inc.
Participants: Saeed Goodarzi, Nikhil Kagita, Dennis Minn
Description: We revisited how well LLMs generalize by focusing on named entities. Named entities are ubiquitous in current Natural Language Understanding benchmarks, yet their impact on models' reasoning capabilities has been largely ignored. We evaluated models on the same evaluation data while modifying it to iterate through a large array of named entities drawn from diverse demographics.
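A minimal sketch of the entity-substitution setup described above, assuming a placeholder-based template and a stand-in predict() call; the name list and scoring loop are illustrative, not the project's actual code:

```python
# Minimal sketch: re-evaluate a model on the same example with many named entities.
# The NAMES list, the {entity} placeholder, and predict() are illustrative assumptions.
from collections import Counter

NAMES = ["Maria Garcia", "Wei Chen", "Aisha Khan", "John Smith"]  # stand-in for a large, diverse name list

def predict(text: str) -> str:
    """Placeholder for a call to the model under evaluation."""
    raise NotImplementedError

def evaluate_entity_robustness(template: str, gold_label: str) -> float:
    """Fraction of entity substitutions on which the model still answers correctly."""
    outcomes = Counter()
    for name in NAMES:
        example = template.format(entity=name)   # same example, different named entity
        outcomes[predict(example) == gold_label] += 1
    return outcomes[True] / len(NAMES)

# Usage:
# accuracy = evaluate_entity_robustness(
#     "{entity} ordered a salad but was served soup. Was {entity} served what they ordered? (yes/no)",
#     gold_label="no")
```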


Simple Strategies to Select Layers for Fine-Tuning Language Encoders


Partner: Microsoft, MAIDAP
Participants: Gayatri Belapurkar, Saloni Chalkapurkar, Abhilasha Lodha, Yuanming Tao
Description: We proposed two layer-selection methods for fine-tuning language encoders that make transfer learning on common NLP benchmarks such as GLUE and SuperGLUE more resource efficient.
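A minimal sketch of fine-tuning only a selected subset of encoder layers with Hugging Face Transformers; choosing the last three layers here is just a stand-in for the selection strategies the project proposed:

```python
# Minimal sketch: fine-tune only selected encoder layers, freezing the rest.
# Picking the last three layers is an illustrative stand-in for the project's selection methods.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def freeze_all_but(model, selected_layers):
    """Freeze every encoder layer except those whose index is in selected_layers."""
    for name, param in model.named_parameters():
        param.requires_grad = True  # embeddings and classification head stay trainable here
        if name.startswith("bert.encoder.layer."):
            layer_idx = int(name.split(".")[3])
            param.requires_grad = layer_idx in selected_layers

freeze_all_but(model, selected_layers={9, 10, 11})  # e.g., fine-tune only the top three layers

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable:,}")
```

Freezing most layers shrinks optimizer state and backward-pass cost, which is where the resource savings come from.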


Editing Transformer Models with Common Sense Knowledge (EMNLP Conference, Dec. 2023)


Partner: Allen Institute for AI
Participants: Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri
Description: Memory editing for updating encyclopedic knowledge in transformers has received increasing attention, but it is unclear whether these methods can be adapted to nuanced common sense knowledge. In this research, we proposed an adaptation of MEMIT to edit common sense mistakes in GPT-2 Large and XL. We extended editing to various token locations and employed a robust layer-selection strategy. Our results suggest a promising path for improving GPT by incorporating context-specific user feedback about common sense through direct model editing, and for fixing and customizing model behaviors with human-in-the-loop systems.
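A minimal sketch of one piece of this setup: locating candidate token positions (last subject token vs. last prompt token) at which a hidden-state edit could be applied. The tokenizer, example sentence, and position names are illustrative assumptions, and the MEMIT-style weight update itself is not shown:

```python
# Minimal sketch of the "token location" choice for model editing.
# Everything here is illustrative; the actual MEMIT-style update is out of scope.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def candidate_edit_positions(prompt: str, subject: str) -> dict:
    """Return token indices that an editing method might target."""
    prompt_ids = tokenizer(prompt).input_ids
    subject_ids = tokenizer(" " + subject).input_ids  # GPT-2 tokenization is whitespace-sensitive
    # Find where the subject span ends inside the prompt.
    for start in range(len(prompt_ids) - len(subject_ids) + 1):
        if prompt_ids[start:start + len(subject_ids)] == subject_ids:
            return {
                "last_subject_token": start + len(subject_ids) - 1,
                "last_prompt_token": len(prompt_ids) - 1,
            }
    raise ValueError("subject not found in prompt")

print(candidate_edit_positions("A banana is usually yellow", subject="banana"))
```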


Generating Metrics for High-Performance Computing Clusters


Partner: Unity (DS4CG 2023). Unity is a collaborative, multi-institutional high-performance computing cluster used primarily for research computing. This project focused on generating useful metrics and analysis for Unity by building a pipeline into a database that could power a live dashboard for Unity's admin staff. Metrics included unnecessarily idle GPUs, daily and weekly node usage, total resource usage, and wait time. We also built a model to predict a job's wait time at submission.
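A minimal sketch of how two of these metrics might be computed from job accounting records with pandas; the column names and the 10% utilization threshold are assumptions, not the actual Unity schema or pipeline:

```python
# Minimal sketch: wait time and "unnecessarily idle" GPU hours from toy job records.
# Column names (gpus_requested, gpu_utilization, submit/start/end_time) are illustrative assumptions.
import pandas as pd

jobs = pd.DataFrame({
    "user": ["alice", "bob"],
    "gpus_requested": [2, 1],
    "gpu_utilization": [0.05, 0.80],          # mean utilization over the job's lifetime
    "submit_time": pd.to_datetime(["2023-06-01 09:00", "2023-06-01 09:30"]),
    "start_time": pd.to_datetime(["2023-06-01 09:10", "2023-06-01 11:00"]),
    "end_time": pd.to_datetime(["2023-06-01 13:10", "2023-06-01 12:00"]),
})

# Wait time: how long each job sat in the queue before starting.
jobs["wait_minutes"] = (jobs["start_time"] - jobs["submit_time"]).dt.total_seconds() / 60

# Idle GPU hours: GPU hours held by jobs whose utilization stayed below a threshold.
runtime_hours = (jobs["end_time"] - jobs["start_time"]).dt.total_seconds() / 3600
idle_gpu_hours = (runtime_hours * jobs["gpus_requested"])[jobs["gpu_utilization"] < 0.10].sum()

print(jobs[["user", "wait_minutes"]])
print(f"Idle GPU hours: {idle_gpu_hours:.1f}")
```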


Detecting Extreme Speech in YouTube Videos


Partner: Media Cloud (DS4CG 2023). With the current surge in multimodal media shared online, particularly on platforms like YouTube and Instagram, the need for multimodal hate speech detection systems has grown. We created an evaluation dataset of YouTube videos from specific types of content creators and compared how features from a video's audio and transcribed text can be used to flag extreme speech with machine learning.
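A minimal sketch of combining transcript and audio features in a single extreme-speech classifier via early fusion; the toy data, feature choices, and classifier are illustrative assumptions:

```python
# Minimal sketch: fuse TF-IDF transcript features with per-video audio features and train a classifier.
# The toy examples, audio features, and logistic regression are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

transcripts = [
    "calm discussion of local gardening news",
    "screaming threats at a named group",
    "review of a new laptop and its battery life",
    "rant urging violence against outsiders",
]
audio_features = np.array([[0.2, 1.0], [0.9, 1.6], [0.3, 1.1], [0.8, 1.5]])  # e.g., loudness, speech rate
labels = np.array([0, 1, 0, 1])  # 1 = extreme speech

text_features = TfidfVectorizer().fit_transform(transcripts).toarray()
X = np.hstack([text_features, audio_features])  # simple early fusion of the two modalities

clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```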


Extracting Bylines from Media in Multiple Languages


Partner: Media Cloud (DS4CG 2023). In an increasingly digital global media landscape, the need for content moderation, misinformation detection, and bias analysis has risen. In partnership with Media Cloud, we evaluated existing tools for extracting author names from news articles across 10 languages, which allows researchers to analyze media coverage by author. We designed a pipeline that runs byline extraction with the evaluated tools and determined that a multistep approach combining heuristic and machine learning models would yield the best byline extraction tool.
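A minimal sketch of the multistep idea: try cheap heuristics first (HTML author metadata, a "By <Name>" pattern) and fall back to a learned extractor only when they fail. The English-only regex and the ml_extract_author stub are illustrative assumptions, not the tools the project evaluated:

```python
# Minimal sketch: heuristic-first byline extraction with a machine-learning fallback.
# The regex, meta-tag heuristic, and ml_extract_author stub are illustrative assumptions.
import re
from bs4 import BeautifulSoup

BYLINE_PATTERN = re.compile(r"^\s*By\s+(?P<author>[A-Z][\w.'-]+(?:\s+[A-Z][\w.'-]+){0,3})", re.MULTILINE)

def ml_extract_author(article_text):
    """Placeholder for a learned extractor (e.g., a NER model run over the article head)."""
    return None

def extract_byline(html):
    soup = BeautifulSoup(html, "html.parser")
    # Step 1: structured metadata, if the publisher provides it.
    meta = soup.find("meta", attrs={"name": "author"})
    if meta and meta.get("content"):
        return meta["content"].strip()
    # Step 2: heuristic "By <Name>" pattern in the page text.
    match = BYLINE_PATTERN.search(soup.get_text("\n"))
    if match:
        return match.group("author")
    # Step 3: fall back to a machine-learned extractor.
    return ml_extract_author(soup.get_text(" "))

print(extract_byline('<html><head><meta name="author" content="Jane Doe"></head><body>...</body></html>'))
```

The ordering matters: structured metadata is the most reliable signal when present, so the learned extractor only runs on the harder cases.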
