Simple Science

Cutting-edge science explained simply

Xinyue Shen

Cryptography and Security

Jailbreak Attacks on Language Models: A Growing Threat

Examination of jailbreak attacks shows weaknesses in language model safety.

2025-09-09T19:37:12+00:00 ― 5 min read

Cryptography and Security

Evaluating Image Safety Classifiers with UnsafeBench

A new framework assesses the effectiveness of image safety classifiers against harmful content.

2025-08-13T09:48:48+00:00 ― 10 min read