A study reveals the WordGame attack, exploiting weaknesses in LLM safety measures.
― 5 min read
A novel method improves understanding of language model outputs.
― 4 min read
Exploring the self-correction processes in language models and their effects.
― 5 min read
New method enables backdoor attacks without clean data or model changes.
― 7 min read