Efforts to improve multilingual metrics for dialogue systems showcased in a recent challenge.
― 8 min read
Cutting edge science explained simply
AdvEval exposes weaknesses in Natural Language Generation evaluation metrics.
― 6 min read
Introducing a new model and benchmark for evaluating multi-audio tasks.
― 5 min read