Rishabh Joshi

Research aims to make language models safer and more useful for users.

2025-08-21T06:36:48+00:00 ― 6 min read

A fresh approach to training reward models enhances AI alignment with human preferences.

2025-06-09T16:00:54+00:00 ― 6 min read

This method helps AIs learn through creating and solving challenges.

2025-05-26T00:12:48+00:00 ― 7 min read