Bertie Vidgen

A new test suite evaluates safety behaviors in language models.

2025-10-13T08:22:00+00:00 ― 5 min read

A review of datasets focused on enhancing LLM safety.

2025-08-21T08:04:18+00:00 ― 6 min read

Exploring the responsible use of generative AI technology in various fields.

2025-08-16T14:18:42+00:00 ― 7 min read

WorkBench uses a unique evaluation method to test agents' ability to perform realistic office tasks.

2025-08-14T22:09:12+00:00 ― 6 min read

Examining the risks and opportunities of open-source generative AI technology.

2025-08-11T04:13:30+00:00 ― 5 min read

Learn best practices for developing AI models responsibly and effectively.

2025-07-24T22:08:00+00:00 ― 5 min read

Natural language unit tests offer a clearer method for assessing language models.

2025-02-25T02:53:15+00:00 ― 7 min read