Zhaoran Wang

This study combines large language models with Monte Carlo Tree Search to improve decision-making in games.

2025-08-31T06:47:00+00:00 ― 6 min read

This article discusses the essential aspects of constrained reinforcement learning and its real-world applications.

2025-08-30T09:14:16+00:00 ― 4 min read

A new method enhances language models by actively seeking diverse responses.

2025-08-05T06:41:00+00:00 ― 6 min read

Introducing a method to minimize overoptimization in models trained with human feedback.

2025-07-26T04:46:48+00:00 ― 5 min read

This paper discusses a method for robots to learn safety from human input.

2025-07-19T16:07:42+00:00 ― 7 min read

A new method enhances language model training using self-generated feedback.

2025-06-04T15:08:42+00:00 ― 6 min read

A new method improves coding models using self-generated tests.

2025-05-19T03:37:20+00:00 ― 6 min read

Explore how data's value influences pricing strategies for businesses.

2025-02-02T01:57:54+00:00 ― 6 min read

Learn how robots can improve by following human commands and adapting to mistakes.

2025-01-22T09:09:54+00:00 ― 7 min read