Hongyi Guo

This study combines large language models with Monte Carlo Tree Search to improve decision-making in games.

2025-08-31T06:47:00+00:00 ― 6 min read

This study introduces a method to minimize overoptimization in models trained with human feedback.

2025-07-26T04:46:48+00:00 ― 5 min read