This article examines the threat of backdoor attacks on language model agents.
― 5 min read
Cutting-edge science explained simply
Research reveals that backdoor attacks pose significant security risks to chat models.
― 6 min read
Explores the challenges of supervising advanced AI models with weaker counterparts.
― 6 min read