Data Science & AI Ethics in Society – CENTER FOR DATA SCIENCE AND ARTIFICIAL INTELLIGENCE

Analyzing Political News Framing: A Dashboard for Cross-Spectrum News Insights

Post category:Projects
Post published:December 6, 2024

Media Cloud, an open-source media analysis platform, leveraged DS4CG students’ NPL expertise to improve ways to explore how media is reported and framed, supporting media literacy and critical analysis.

The App Danger Project

Post category:Projects
Post published:November 20, 2024

The DS4CG team developed tools to identify app reviews that raise safety concerns for minors, helping parents make informed decisions about child-safe apps.

Detecting Extreme Speech in YouTube Videos

Post category:Projects
Post published:November 1, 2023

Partner: Media Cloud The surge in multimodal content shared online, particularly on platforms like YouTube and Instagram, has increased the need for effective extreme and hateful speech detection systems. Current systems often fail to address the nuanced challenges of detecting explicit and implicit hate speech in multimodal contexts, where speech and text combine to convey harmful messages. Media Cloud, an open-source media research platform, helps researchers study news and information flow globally. This DS4CG team worked in collaboration with Media Cloud to focus on advancing multimodal hate speech detection by addressing three key challenges: the lack of comprehensive, human-annotated datasets; the absence of systems capable of analyzing both audio and text data simultaneously; and the need for fine-grained detection of subtle hate speech. The study leverages distinct latent features from audio and text to improve…

Extracting Bylines from Media in Multiple Languages

Post category:Projects
Post published:November 1, 2023

Partner: Media Cloud Over the past few decades, the rapid expansion of digital media has transformed how information is shared and consumed. However, this growth presents challenges such as content moderation, misinformation detection, and addressing media bias. Categorizing articles by authors or agencies has become a critical step in tackling these issues, especially for both high- and low-resource settings. This DS4CG project evaluated existing and newly implemented tools for extracting author names from news articles. Using Media Cloud’s article archive, 100 documents from 10 languages were sampled and annotated by volunteers fluent in each language, following guidelines developed with the Media Cloud team. A pipeline was designed to test these tools, and their performance was assessed using five NLP metrics.

Analysis of #StopAsianHate on Twitter

Post category:Projects
Post published:November 1, 2023

Partner: Co-Insights In partnership with Co-Insights, this DS4CG project explored the #StopAsianHate movement using NLP to analyze topic transitions, identify significant events, and highlight key accounts driving conversations. Unlike prior studies focused on peak activity, this longitudinal analysis examined changes over time, implementing text embedding and clustering models to uncover frequent unigrams, phrases, and example tweets.

Reddit Map

Post category:Projects
Post published:December 27, 2022

Reddit Map is an open-source tool that makes navigating Reddit data easier by displaying clusters of communities with overlapping community members.