Felix Friedrich

A new model improves AI's ability to learn from user feedback.

2025-10-04T16:59:30+00:00 ― 6 min read

AI improves decision-making through self-explanations and feedback loops.

2025-09-26T13:43:48+00:00 ― 7 min read

The impact of language on gender stereotypes in image generation technologies.

2025-09-13T14:04:30+00:00 ― 5 min read

The ALERT benchmark assesses safety risks in language models to improve their responses.

2025-08-21T19:15:48+00:00 ― 4 min read

A new system assesses safety risks in images generated by AI models.

2025-07-31T23:59:00+00:00 ― 7 min read

SCAR enhances language models by reducing toxic language in text generation.

2025-05-25T19:01:12+00:00 ― 5 min read

This article examines shortcut learning in machine learning and how to address it.

2025-04-07T05:57:45+00:00 ― 7 min read

M-ALERT tests language models for safety across five languages.

2025-02-12T21:58:30+00:00 ― 5 min read