Latest Articles for Content Moderation

CAST offers a precise approach to managing language model responses.

2025-06-16T07:27:48+00:00 ― 7 min read

Examining how flagging mechanisms influence user trust and perceptions of fairness.

2025-06-13T09:32:54+00:00 ― 6 min read

Using open-source models to make harmful content detection more efficient and effective.

2025-05-21T10:37:12+00:00 ― 6 min read

A new dataset aims to improve the classification of harmful content online.

2025-05-01T05:48:00+00:00 ― 7 min read

Examining how labeling strategies affect minority voices in sexism detection.

2025-04-09T23:46:03+00:00 ― 7 min read

New methods improve detection of offensive language using sentiment analysis.

2025-04-01T05:30:45+00:00 ― 6 min read

Exploring new methods for effective social media content moderation.

2025-02-09T07:18:18+00:00 ― 5 min read

Examining how online spaces for eating disorders both support and harm their users.

2025-02-07T18:23:24+00:00 ― 5 min read

The SAFE-MEME framework helps identify hate speech hidden in memes.

2025-01-20T03:23:51+00:00 ― 7 min read