Allen Institute for AI (AI2)’s Post

Data toxicity can lead to harmful model outputs, and because most evaluations focus on English datasets, we’re underestimating multilingual toxicity in state-of-the-art LLMs. Our team partnered with researchers from Carnegie Mellon University and the University of Virginia to highlight this gap.

PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language…

blog.allenai.org
