A look at how o1 models plan actions and their performance across various tasks.
Kevin Wang, Junbo Li, Neel P. Bhatt
― 7 min read
A look into how word embeddings are analyzed using independent component analysis.
Momose Oyama, Hiroaki Yamagiwa, Hidetoshi Shimodaira
― 5 min read
A new method for assessing AI-generated medical explanations using Proxy Tasks.
Iker De la Iglesia, Iakes Goenaga, Johanna Ramirez-Romero
― 5 min read
Exploring how smaller models struggle with inaccuracies from larger counterparts.
Phil Wee, Riyadh Baghdadi
― 6 min read
LLM-Ref aids researchers in crafting clearer, well-structured papers effortlessly.
Kazi Ahmed Asif Fuad, Lizhong Chen
― 6 min read
Exploring how well AI understands human communication.
Mingyue Jian, Siddharth Narayanaswamy
― 6 min read
Research shows new methods to better align LLMs with human feedback.
Zichen Liu, Changyu Chen, Chao Du
― 6 min read
A study compares human and AI creativity in storytelling.
Mete Ismayilzada, Claire Stevenson, Lonneke van der Plas
― 6 min read
Assessing prompt engineering's relevance with new reasoning models.
Guoqing Wang, Zeyu Sun, Zhihao Gong
― 7 min read
A look at in-context databases and their potential with language models.
Yu Pan, Hongfeng Yu, Tianjiao Zhao
― 5 min read
Assessing the role of multilingual models in supporting bilingual students.
Anand Syamkumar, Nora Tseng, Kaycie Barron
― 6 min read
Examining vulnerabilities in watermarking methods against paraphrasing attacks.
Saksham Rastogi, Danish Pruthi
― 7 min read
Assessing language models' understanding of proverbs in low-resource languages.
Israel Abebe Azime, Atnafu Lambebo Tonja, Tadesse Destaw Belay
― 5 min read
Investigating how wealth influences language models in travel narratives.
Kirti Bhagat, Kinshuk Vasisht, Danish Pruthi
― 7 min read
SCAR enhances language models by reducing toxic language in text generation.
Ruben Härle, Felix Friedrich, Manuel Brack
― 5 min read
Research shows variation in speech improves language model training.
Akari Haga, Akiyo Fukatsu, Miyu Oba
― 5 min read
Explore the impact of question styles on AI model performance.
Jia He, Mukund Rungta, David Koleczek
― 5 min read
A new method to develop guardrails for large language models without real-world data.
Gabriel Chua, Shing Yee Chan, Shaun Khoo
― 6 min read
A new method enhances the safety of code generated by language models.
Xiangzhe Xu, Zian Su, Jinyao Guo
― 5 min read
SpecTool brings clarity to LLM errors when using tools.
Shirley Kokane, Ming Zhu, Tulika Awalgaonkar
― 4 min read
A study reveals how prompt injection can compromise language models.
Jiashuo Liang, Guancheng Li, Yang Yu
― 10 min read
This study examines how well LLMs assess creativity in the Alternative Uses Test.
Abdullah Al Rabeyah, Fabrício Góes, Marco Volpe
― 5 min read
PEFT methods enhance language models while safeguarding private data.
Olivia Ma, Jonathan Passerat-Palmbach, Dmitrii Usynin
― 7 min read
A study on how well language models connect facts without shortcuts.
Sohee Yang, Nora Kassner, Elena Gribovskaya
― 7 min read
A new method for language models to enhance their responses through self-generated critiques.
Yue Yu, Zhengxing Chen, Aston Zhang
― 6 min read
How low-bit quantization affects large language models during training.
Xu Ouyang, Tao Ge, Thomas Hartvigsen
― 6 min read
A new method automates news classification, saving time and resources for organizations.
Taja Kuzman, Nikola Ljubešić
― 4 min read
Evaluating if language models can understand spatial relationships effectively.
Anthony G Cohn, Robert E Blackwell
― 6 min read
Discover how to improve large language models in handling symmetric tasks.
Mohsen Dehghankar, Abolfazl Asudeh
― 7 min read
Evaluating language models' abilities in synthetic data creation using AgoraBench.
Seungone Kim, Juyoung Suk, Xiang Yue
― 5 min read
How language models improve their understanding of grammar and sentence structures.
Tian Qin, Naomi Saphra, David Alvarez-Melis
― 8 min read
Exploring how transformers can express uncertainty to improve AI reliability.
Greyson Brothers, Willa Mannering, Amber Tien
― 6 min read
Large language models excel in some areas but struggle with general tasks.
Basab Jha, Ujjwal Puri
― 7 min read
Discover how activation sparsity boosts AI efficiency and speed.
Vui Seng Chua, Yujie Pan, Nilesh Jain
― 5 min read
Explore the connections between language models and physical phenomena in an engaging way.
Yuma Toji, Jun Takahashi, Vwani Roychowdhury
― 9 min read
Researchers are improving AI's ability to tackle complex questions with AutoReason.
Arda Sevinc, Abdurrahman Gumus
― 5 min read
Researchers tackle biases in language models for Filipino, enhancing cultural relevance.
Lance Calvin Lim Gamboa, Mark Lee
― 5 min read
This article examines the complex role of English in multilingual evaluations.
Wessel Poelman, Miryam de Lhoneux
― 7 min read
Learn how Sloth is changing predictions for language model performance.
Felipe Maia Polo, Seamus Somerstep, Leshem Choshen
― 6 min read
BatchTopK sparse autoencoders improve language processing through smart data selection.
Bart Bussmann, Patrick Leask, Neel Nanda
― 5 min read