How Language Models Handle Controversial Questions
Study reveals language models prioritize relevance over evidence quality.
Language models are increasingly used to answer questions that are controversial or that attract conflicting answers. For example, a query like "Is aspartame linked to cancer?" requires weighing many pieces of information against each other. To answer such a query, a retrieval-augmented model searches through many websites for facts and opinions that support different answers, and must then decide which of that evidence to trust.
The Study
In this study, we created a dataset, ConflictingQA, to see how well models handle these types of questions. We paired controversial questions with real-world documents that contain contradicting facts and arguments. Our goal was to find out what types of evidence these models trust and why.
Evidence and Convincingness
Humans often ask themselves what evidence is convincing when faced with complex questions. They check facts, think about where the information comes from, and analyze the arguments presented. Language models do not always follow these steps. Our findings suggest that while models consider the relevance of a webpage to the query, they often overlook elements that humans find important, like the use of scientific references or a neutral tone in writing.
Data Collection Process
To build our dataset, we started by identifying a list of controversial questions across various topics. We then searched online to find paragraphs that present conflicting views on these questions. For each query, we collected evidence supporting both a "Yes" and a "No" answer. We used search engines to retrieve these documents, ensuring that we got a wide range of arguments and facts.
Evaluating Convincingness
For the analysis, we looked at how often a model's predictions matched the viewpoint presented in different pieces of evidence. We refer to this rate as the "win-rate." By measuring this, we could assess which types of paragraphs were more convincing to the model.
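As a minimal sketch of the win-rate described above, the snippet below counts how often a model's answer matches the stance of each paragraph. The record layout (paragraph id, paragraph stance, model answer), the function name `win_rates`, and the toy trials are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict


def win_rates(trials):
    """Return {paragraph_id: win-rate}.

    `trials` is an iterable of (paragraph_id, paragraph_stance, model_answer)
    tuples. This layout is a hypothetical one chosen for the sketch.
    """
    wins = defaultdict(int)
    totals = defaultdict(int)
    for paragraph_id, stance, answer in trials:
        totals[paragraph_id] += 1
        # A "win" is a trial where the model's answer agrees with the paragraph.
        if answer == stance:
            wins[paragraph_id] += 1
    return {pid: wins[pid] / totals[pid] for pid in totals}


# Toy trials, invented purely to exercise the function.
trials = [
    ("p1", "Yes", "Yes"),
    ("p1", "Yes", "No"),
    ("p1", "Yes", "Yes"),
    ("p2", "No", "No"),
    ("p2", "No", "Yes"),
]
rates = win_rates(trials)
for pid in sorted(rates):
    print(pid, round(rates[pid], 3))
```

In this toy input, paragraph p1 wins two of its three trials and p2 wins one of its two.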
Key Findings
Our research showed that language models preferred evidence based on relevance rather than style. For instance, when we made simple changes to make a text more relevant to the question, the model's win-rate improved significantly. However, adding stylistic features, like references or improving the text's tone, did not have the same positive effect.
Impact of Relevance
The results indicate that models overvalue the relevance of the documents they read, often ignoring the value of style and credibility. When we modified a website to clarify its relevance to the question, its win-rate increased more than when we changed stylistic elements instead.
Human vs. Model Judgments
Interestingly, there is a noticeable gap between how humans and models evaluate the convincingness of a text. Humans can read a text and form a judgment about its credibility. In contrast, models struggle to do the same when presented with evidence in isolation.
Experimental Setup
To further explore how models rate evidence, we tested several different language models, both open-source and closed-source. We asked them the same controversial questions, each accompanied by conflicting evidence, and collected their binary responses of "Yes" or "No." This helped us evaluate how they perceive varying types of evidence.
Features Affecting Convincingness
We examined various factors to see what influences a model's judgment. This includes readability, sentiment, uniqueness of words, and how closely related the paragraph is to the question. The strongest correlation with convincingness came from the similarity between the question and the paragraph.
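One way to picture this kind of feature analysis is to score each paragraph's similarity to the question and correlate that score with its win-rate. The sketch below uses word-set (Jaccard) overlap as a stand-in similarity measure and a hand-written Pearson correlation. Both choices, along with the function names, are assumptions for illustration; the text above does not say which similarity measure or statistic was used.

```python
import math
import re


def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(question, paragraph):
    """Word-overlap similarity in [0, 1]; a stand-in, not the study's measure."""
    q, p = tokens(question), tokens(paragraph)
    if not q or not p:
        return 0.0
    return len(q & p) / len(q | p)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


question = "Is aspartame linked to cancer?"
paragraphs = [
    "Aspartame is linked to cancer in some studies.",
    "Sweeteners are widely used in diet drinks.",
]
similarities = [jaccard(question, p) for p in paragraphs]
print(similarities)
```

Given per-paragraph win-rates in the same order, `pearson(similarities, win_rates_list)` would give the correlation between similarity and convincingness, where `win_rates_list` is a hypothetical list supplied by the reader.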
Counterfactual Analysis
We also modified existing documents to see how changes could affect their convincingness. For example, by adding information to clarify a stance or adjusting the document to make it seem more relevant, we could assess how these alterations impacted the model's win-rate.
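The counterfactual comparison can be sketched as measuring a paragraph's win-rate before and after an edit. Everything in the snippet below is hypothetical: `toy_model` is a hard-coded stand-in that only checks for a keyword, the two edit functions are simplified examples of a relevance edit and a style edit, and the paragraphs are invented. The output therefore says nothing about how real models behave; it only shows the shape of the measurement.

```python
def win_rate(model, question, paragraph, stance, opponents):
    """Fraction of pairings in which the model's answer matches `stance`."""
    wins = sum(model(question, [paragraph, opp]) == stance for opp in opponents)
    return wins / len(opponents)


def counterfactual_delta(model, question, paragraph, stance, opponents, edit):
    """Change in win-rate after applying `edit` to the paragraph."""
    before = win_rate(model, question, paragraph, stance, opponents)
    after = win_rate(model, question, edit(paragraph), stance, opponents)
    return after - before


def toy_model(question, paragraphs):
    # Hard-coded stand-in for a real language model (assumption):
    # answers "Yes" only if the first paragraph mentions "aspartame".
    return "Yes" if "aspartame" in paragraphs[0].lower() else "No"


def relevance_edit(paragraph):
    # Hypothetical relevance edit: restate the question's topic up front.
    return "On whether aspartame is linked to cancer: " + paragraph


def style_edit(paragraph):
    # Hypothetical style edit: append a placeholder reference marker.
    return paragraph + " [1]"


question = "Is aspartame linked to cancer?"
paragraph = "Studies report an elevated risk."
opponents = ["No link was found.", "Regulators consider it safe."]

relevance_delta = counterfactual_delta(
    toy_model, question, paragraph, "Yes", opponents, relevance_edit
)
style_delta = counterfactual_delta(
    toy_model, question, paragraph, "Yes", opponents, style_edit
)
print(relevance_delta, style_delta)
```

Passing the edit as a function keeps the measurement identical across edit types, so any difference in the deltas comes from the edit itself rather than from the evaluation code.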
Conclusion
In conclusion, our work sheds light on how retrieval-augmented language models assess the convincingness of information. These models tend to focus more on the relevance of a document than on the stylistic elements that influence human judgment. Bridging the gap between model and human evaluations will likely require attention both to the quality of the documents these models retrieve and to how the models are trained.
Acknowledgments
We appreciate all the support we received during the research and development of this study. Contributions from various individuals and collaborations made this project possible.
Title: What Evidence Do Language Models Find Convincing?
Abstract: Retrieval-augmented language models are being increasingly tasked with subjective, contentious, and conflicting queries such as "is aspartame linked to cancer". To resolve these ambiguous queries, one must search through a large range of websites and consider "which, if any, of this evidence do I find convincing?". In this work, we study how LLMs answer this question. In particular, we construct ConflictingQA, a dataset that pairs controversial queries with a series of real-world evidence documents that contain different facts (e.g., quantitative results), argument styles (e.g., appeals to authority), and answers (Yes or No). We use this dataset to perform sensitivity and counterfactual analyses to explore which text features most affect LLM predictions. Overall, we find that current models rely heavily on the relevance of a website to the query, while largely ignoring stylistic features that humans find important such as whether a text contains scientific references or is written with a neutral tone. Taken together, these results highlight the importance of RAG corpus quality (e.g., the need to filter misinformation), and possibly even a shift in how LLMs are trained to better align with human judgements.
Authors: Alexander Wan, Eric Wallace, Dan Klein
Last Update: 2024-08-08 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2402.11782
Source PDF: https://arxiv.org/pdf/2402.11782
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arxiv for use of its open access interoperability.