DetectBench evaluates LLMs on their ability to detect hidden evidence in reasoning tasks.
― 5 min read
Cutting-edge science explained simply