Jacob Haimes

This article explores LLMs and their potential for deceptive behaviors in blackjack.

2025-07-21T06:58:12+00:00 ― 4 min read

A look at the strengths and weaknesses of CyberSecEval as a code security benchmark.

2025-05-24T09:31:30+00:00 ― 6 min read

Learn how sandbagging affects AI assessments and how to detect it.

2025-04-25T09:07:00+00:00 ― 6 min read