David Krueger

Discover the need for visibility and governance in AI agent operations.

2025-09-15T03:04:24+00:00 ― 7 min read

Access levels in AI audits influence effectiveness and risk identification.

2025-09-14T07:35:12+00:00 ― 7 min read

Exploring how to build safety cases for AI technologies.

2025-08-29T01:59:06+00:00 ― 8 min read

To manage risks effectively, developers must prove that AI systems are safe.

2025-08-19T19:28:06+00:00 ― 6 min read

A new model concept shows how to test AI capabilities effectively.

2025-08-05T08:15:48+00:00 ― 7 min read

Learn why unique IDs for AI systems enhance safety and trust.

2025-07-28T00:31:30+00:00 ― 7 min read

Examining the difficulties of creating effective reward functions in reinforcement learning.

2025-07-11T06:48:12+00:00 ― 8 min read

This article analyzes model performance across various tasks and datasets.

2025-07-08T02:42:24+00:00 ― 5 min read

BaDLoss, a new method, strengthens protection against data poisoning in machine learning.

2025-06-23T07:12:24+00:00 ― 7 min read

This article discusses methods to better understand neural networks through Sparse Autoencoders and Mutual Feature Regularization.

2025-06-01T07:45:00+00:00 ― 5 min read

Exploring how transformers learn and the challenges they face under attack.

2025-05-28T19:00:45+00:00 ― 5 min read

Researchers develop a method for AI to coordinate without full information.

2025-05-28T14:26:42+00:00 ― 6 min read

A study of two approaches to improving AI performance on language tasks.

2025-05-25T22:03:54+00:00 ― 5 min read

Learn how machines can forget unnecessary data for better privacy.

2025-04-27T11:27:00+00:00 ― 6 min read