A new benchmark aims to measure and mitigate AI-related dangers.
― 5 min read
Cutting-edge science explained simply
This article discusses issues and best practices for evaluating language models.
― 7 min read
Circuit breakers offer a new way to prevent harmful AI outputs.
― 3 min read
A new method improves tamper resistance in open-weight language models.
― 7 min read