Dawn Song

Artificial Intelligence

Navigating the Dual-Use Dilemma of Generative AI

This article examines the risks and benefits of generative AI technology.

2025-10-03T19:55:30+00:00 ― 7 min read

Artificial Intelligence

Enhancing Trust in Language Models with Retrieval Systems

This study investigates how retrieval-augmented models improve reliability in language generation.

2025-09-10T23:39:54+00:00 ― 7 min read

Computation and Language

The Evolution of Free-Formed AI Groups

Examining the social dynamics and benefits of autonomous AI interactions.

2025-09-06T07:13:18+00:00 ― 9 min read

Machine Learning

Evaluating the Robustness of Multimodal Models

This article reviews how robust CLIP is under various challenges.

2025-08-29T02:14:54+00:00 ― 5 min read

Computation and Language

Trust and Efficiency in Compressed Language Models

Exploring the balance between model compression and trustworthiness in AI.

2025-08-28T12:01:42+00:00 ― 5 min read

Cryptography and Security

New Framework for Safe Content Moderation in LLMs

A novel approach to enhance moderation methods for large language models.

2025-08-27T23:23:18+00:00 ― 5 min read

Computation and Language

Understanding Code Hallucinations in Language Models

An overview of code hallucinations in LLMs and their impact on software development.

2025-08-15T01:58:18+00:00 ― 6 min read

Machine Learning

Introducing GuardAgents: A New Safety Layer for LLMs

A framework to enhance safety in LLM agents across various applications.

2025-07-29T07:43:48+00:00 ― 7 min read

Cryptography and Security

Addressing Safety Backdoor Attacks in AI Models

A new method tackles hidden threats in large language models.

2025-07-25T00:46:00+00:00 ― 6 min read

Computers and Society

Managing the Risks of Artificial Intelligence

A look at AI risk categories and the need for unified policies.

2025-07-24T14:21:54+00:00 ― 6 min read

Computation and Language

Enhancing Language Models with Recursive Tuning

A new method improves language models' performance on complex problems.

2025-07-18T18:55:48+00:00 ― 5 min read

Computers and Society

Ensuring AI Safety: New Benchmark Introduced

A new benchmark aims to assess AI safety risks effectively.

2025-07-15T13:14:48+00:00 ― 7 min read

Machine Learning

Strengthening Safety in Open-Weight LLMs

A new method improves tamper resistance in open-weight language models.

2025-07-03T22:14:42+00:00 ― 7 min read

Machine Learning

AutoScale: A New Tool for Data Optimization in LLMs

AutoScale improves the data mix for efficient training of large language models.

2025-06-24T16:46:24+00:00 ― 6 min read

Computation and Language

Teaching Robots with Language: A New Approach

Revolutionizing robot training with a focus on language-based instructions.

2025-05-22T19:53:51+00:00 ― 6 min read

Machine Learning

Machine Unlearning: A Safer AI Future

Discover how machine unlearning improves AI safety and image quality.

2025-03-31T22:17:24+00:00 ― 6 min read

Cryptography and Security

The Rise of Stealthy Backdoor Attacks in AI

A new method enables backdoor attacks without clean data or model changes.

2025-03-27T16:22:21+00:00 ― 7 min read