Introducing IrokoBench to improve LLM evaluation in African languages.
― 7 min read
Cutting-edge science explained simply
This article examines methods to assess variance in language model evaluation benchmarks.
― 7 min read
This research focuses on improving methods for removing unwanted information from language models.
― 4 min read
This article discusses challenges in detecting hallucinations in machine translation across various languages.
― 5 min read
Linguini tests assess how well models reason with diverse languages.
― 6 min read
Are NLI tasks still relevant for testing large language models?
― 6 min read