Partner: Media Cloud
The surge in multimodal content shared online, particularly on platforms like YouTube and Instagram, has increased the need for effective systems to detect extreme and hateful speech. Current systems often fail to address the nuanced challenge of detecting both explicit and implicit hate speech in multimodal contexts, where audio and text combine to convey harmful messages.
Media Cloud, an open-source media research platform, helps researchers study news and information flow globally. This DS4CG team collaborated with Media Cloud to advance multimodal hate speech detection by addressing three key challenges: the lack of comprehensive, human-annotated datasets; the absence of systems capable of analyzing audio and text data simultaneously; and the need for fine-grained detection of subtle hate speech. The team's approach leverages distinct latent features from audio and text to improve detection accuracy, revealing critical differences in model outputs when text is analyzed alone versus alongside audio.
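The write-up does not specify the architecture, but a common way to combine distinct latent features from two modalities is late fusion: encode each modality separately with a pretrained model, then concatenate the embeddings before classification. The following is a minimal sketch of that idea, assuming BERT as the text encoder and Wav2Vec2 as the audio encoder; both model choices, and the classifier head, are illustrative assumptions rather than the team's actual implementation.

```python
# Hypothetical late-fusion sketch; encoder choices are illustrative,
# not the team's documented pipeline.
import torch
import torch.nn as nn
from transformers import AutoModel, Wav2Vec2Model


class LateFusionHateSpeechClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Pretrained text encoder (assumed: BERT base).
        self.text_encoder = AutoModel.from_pretrained("bert-base-uncased")
        # Pretrained audio encoder (assumed: Wav2Vec2 base).
        self.audio_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        fused_dim = (self.text_encoder.config.hidden_size
                     + self.audio_encoder.config.hidden_size)
        # Simple classification head over the concatenated latents.
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, input_ids, attention_mask, audio_values):
        # Text latent: hidden state at the [CLS] position.
        text_out = self.text_encoder(input_ids=input_ids,
                                     attention_mask=attention_mask)
        text_vec = text_out.last_hidden_state[:, 0]
        # Audio latent: mean-pooled frame embeddings.
        audio_out = self.audio_encoder(audio_values)
        audio_vec = audio_out.last_hidden_state.mean(dim=1)
        # Concatenate the two latent features and classify.
        return self.classifier(torch.cat([text_vec, audio_vec], dim=-1))
```

Keeping the encoders separate until the final classification layer is what lets an analysis compare text-only predictions (zeroing or dropping the audio branch) against full multimodal predictions, the kind of contrast the team's findings describe.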
By integrating audio data, the model detected implicit hate speech in YouTube videos more effectively than text-only analysis. This approach not only fills a critical gap in multimodal hate speech detection but also provides actionable insights for improving content moderation and combating online harm.