MalAlgoQA dataset evaluates reasoning of Large Language Models in counterfactual scenarios.
― 5 min read