The Role of Sources in News Reporting

News articles rely on sources to provide accurate information. Understanding when, how, and why reporters use sources can give us insights into the news we read. This understanding can also help journalists do their job better.

To support this, we created a large dataset that includes many examples of sources used in news articles. This dataset allows us to build models that can detect where information comes from and who provided it. We also introduced a new task, called source prediction, to study how sources work together in news stories. Our results show that we can effectively perform this task, which may help improve the way news articles are written and how journalists choose sources.

Journalism shapes our views, and the information we consume depends on the sources reporters use. Identifying these sources matters in several areas, such as detecting misinformation and understanding arguments in news discourse. Linking information to sources is difficult because some attributions are explicit while others are subtle. Past efforts focused mostly on simple cases, such as identifying quotes, which yielded high precision but missed many other attributions.

Sources can combine in various ways within a single news article. Some sources are obvious, while others may be implied or unclear. Our main question is: does this article need another source?

Source Attribution

In our work, we define "source" broadly to include many ways journalists gather information. We identified 16 categories of sourcing and created the largest source-attribution dataset with over 28,000 attributions in more than 1,300 articles. By training models on this data, we achieved good accuracy in linking information to its sources.

We tested several methods and found that traditional lexical approaches and other existing models often performed poorly on this task. Many sentences contain sourced information that is not signaled by clear keywords, which makes attribution challenging.

In the first part of our research, we focus on how to attribute sources. We establish criteria for what makes a sentence attributable to a source based on explicit or implicit signals. Sources can include individuals or organizations and can be mentioned directly or through more general terms.

We aim to maximize the number of attributions while also ensuring the same source is correctly identified across multiple sentences. This approach allows us to consider various information channels. Our dataset creation process involved recruiting annotators, including a professional journalist and a student, who worked together to label the articles. Their collaboration led to a high rate of agreement in identifying sources.
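As a rough illustration of how agreement between two annotators could be quantified, the sketch below computes Cohen's kappa over per-sentence source labels. The choice of metric is an assumption made for this example; the text above does not say which agreement measure was used, and the function name and label values are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators.

    Assumption: each list holds one label per sentence, for example a
    source name or "none". This is an illustrative metric, not
    necessarily the one used for the dataset described above.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(labels_a)
    # Fraction of sentences on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if both labelled at random with their own label rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

A value near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance.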

Source Attribution Models

We divided the source attribution task into two steps: detection and retrieval. Detection determines whether a sentence can be linked to a source, while retrieval identifies which source it is. Using a separate model for each step proved more effective than combining both tasks into one.
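The two-step structure can be sketched as a small pipeline in which any detector and any retriever are plugged in as functions. This is a minimal sketch under stated assumptions: the type aliases, the function `attribute_document`, and the toy stand-ins are hypothetical names for illustration, and the real models would be trained classifiers rather than string checks.

```python
from typing import Callable, List, Optional

# Assumed interfaces: both steps see the sentence and the full document.
Detector = Callable[[str, List[str]], bool]
Retriever = Callable[[str, List[str]], str]


def attribute_document(
    sentences: List[str], detect: Detector, retrieve: Retriever
) -> List[Optional[str]]:
    """Return one attribution per sentence, or None when no source is detected."""
    attributions: List[Optional[str]] = []
    for sentence in sentences:
        if detect(sentence, sentences):
            # Retrieval only runs on sentences that passed detection.
            attributions.append(retrieve(sentence, sentences))
        else:
            attributions.append(None)
    return attributions


# Toy stand-ins so the sketch runs end to end; not the models described above.
def toy_detect(sentence: str, document: List[str]) -> bool:
    return " said" in sentence


def toy_retrieve(sentence: str, document: List[str]) -> str:
    return sentence.split(" said")[0]


doc = ["Mayor Lee said taxes will rise.", "The city has 50,000 residents."]
result = attribute_document(doc, toy_detect, toy_retrieve)
```

Keeping the two steps behind separate interfaces makes it possible to swap in a different detector or retriever without changing the other.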

The baseline methods we tested showed varied results. Some methods relied on finding patterns of co-occurrence between sources and speaking verbs, while others used more complex rules and syntactic analysis. We also explored approaches that utilize existing datasets to establish connections between sources and quotes.
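A co-occurrence baseline of the kind described above might look like the following sketch. The verb list, the candidate-source list, and the function name are assumptions for illustration only; they are not the actual baseline implementations that were tested.

```python
import re

# Hypothetical, deliberately small list of speaking cues.
SPEAKING_VERBS = {"said", "says", "told", "stated", "added", "announced", "according"}


def cooccurrence_attribution(sentence, candidate_sources):
    """Attribute a sentence to the first candidate source whose name
    appears in the same sentence as a speaking verb; otherwise None."""
    tokens = set(re.findall(r"[a-z']+", sentence.lower()))
    if not tokens & SPEAKING_VERBS:
        return None  # no speaking cue, so the baseline gives up
    for source in candidate_sources:
        name_tokens = re.findall(r"[a-z']+", source.lower())
        if name_tokens and all(t in tokens for t in name_tokens):
            return source
    return None


candidates = ["Mayor Lee", "Police Department"]
hit = cooccurrence_attribution("The budget will shrink, Mayor Lee said.", candidates)
miss = cooccurrence_attribution(
    "Officials at the Police Department confirmed the arrest.", candidates
)
```

In this example `hit` is "Mayor Lee", while `miss` is None because "confirmed" is not in the verb list. The second case shows how keyword-driven rules can be precise yet overlook sourced sentences that lack an expected cue.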

For detection, we used a binary sentence classifier along with a document-wide embedding approach. For retrieval, we implemented methods that involve predicting tokens associated with sources, detecting spans within sentences, and generating open-ended responses to identify sources.

After evaluating the models, we found that the best-performing approach combined advanced language models with our source detection methods and achieved high accuracy.

Insights from Source Analysis

With a functioning attribution pipeline, we focused on learning how sources are used in news articles. We analyzed thousands of unlabeled documents to assess the extent to which articles attribute their information to sources and when these sources are typically used.

Our findings indicate that articles usually attribute around half of their sentences to sources, and this is consistent regardless of document length. However, the use of sources isn’t uniform: certain sources dominate, while others contribute less.
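Both statistics can be computed directly from per-sentence attributions. The sketch below assumes a document is represented as a list with one entry per sentence, holding a source name or None; that representation and the function name are assumptions made for this example.

```python
from collections import Counter


def attribution_stats(attributions):
    """Return (share of sentences with a source, share of sourced sentences per source)."""
    total = len(attributions)
    sourced = [s for s in attributions if s is not None]
    rate = len(sourced) / total if total else 0.0
    counts = Counter(sourced)
    # most_common() orders sources from most to least used.
    shares = {src: n / len(sourced) for src, n in counts.most_common()}
    return rate, shares


# Made-up example: eight sentences, four of them attributed.
example = ["A", "A", "A", "B", None, None, None, None]
rate, shares = attribution_stats(example)
```

Here `rate` is 0.5 and `shares` is `{"A": 0.75, "B": 0.25}`, which is the kind of imbalance meant by some sources dominating an article.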

We also looked at how sources are added to articles over time. Early versions often contain fewer sources, and additional sources tend to be added as articles are updated. This pattern suggests that knowing which sources are added can inform future recommendations for journalists.

Source Compositionality

An interesting question to explore is how certain sources are chosen to appear together in an article. We designed two approaches to tackle this question: ablation and NewsEdits.

In the ablation task, we systematically removed sources from articles and assessed how this affected the remaining content. The goal was to understand if the composition of sources was balanced or if certain sources were essential for the article's information.
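One way to build examples for such an ablation task is sketched below. It assumes per-sentence attributions are already available and treats the "major" source as the one with the most attributed sentences; that definition, the label scheme, and the function names are assumptions for illustration.

```python
from collections import Counter


def major_source(attributions):
    """Source with the most attributed sentences (assumed meaning of 'major')."""
    counts = Counter(a for a in attributions if a is not None)
    return counts.most_common(1)[0][0] if counts else None


def ablate_source(sentences, attributions, source):
    """Drop every sentence attributed to the given source."""
    return [s for s, a in zip(sentences, attributions) if a != source]


def build_examples(sentences, attributions):
    """Return (document, label) pairs: 0 for the intact article,
    1 for the article with its major source removed."""
    examples = [(list(sentences), 0)]
    top = major_source(attributions)
    if top is not None:
        examples.append((ablate_source(sentences, attributions, top), 1))
    return examples


sents = ["a", "b", "c", "d"]
attrs = ["X", "X", "Y", None]
examples = build_examples(sents, attrs)
```

A classifier trained on such pairs would then be asked whether a given document is missing a source.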

The NewsEdits task focused on articles that had undergone changes. By examining version pairs of articles, we could see how many new sources were added over time and the relationships among them.
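Comparing a version pair reduces to finding sources that appear in the newer version but not in the older one. The sketch below assumes attributions have already been extracted for both versions and that source names are normalized so the same source is spelled identically in each; the function name is hypothetical.

```python
def added_sources(old_attributions, new_attributions):
    """Sources present in the new version but absent from the old one,
    in order of first appearance."""
    old = {s for s in old_attributions if s is not None}
    seen = set()
    added = []
    for source in new_attributions:
        if source is not None and source not in old and source not in seen:
            seen.add(source)
            added.append(source)
    return added


old_version = ["A", None, "A"]
new_version = ["A", "B", None, "C", "B"]
new_sources = added_sources(old_version, new_version)
```

In this made-up pair, `new_sources` is `["B", "C"]`, so the update introduced two sources that the earlier version lacked.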

Our results showed that we could accurately predict when major sources were removed from articles, indicating that source usage follows a certain pattern. Major sources played a crucial role, while minor sources were less predictable.

Conclusion

In summary, our work provides a comprehensive overview of the sourcing habits in journalism. We developed an extensive dataset that captures a variety of source types and created models that can identify and attribute information effectively.

We believe our findings can help journalists improve their reporting by offering better tools to evaluate when and why sources are used in news articles. Moving forward, we hope to build a recommendation system that assists reporters in sourcing information.

Through this research, we aim to lay a foundation for further studies on the dynamics of source usage in news writing, paving the way for improvement in the quality and reliability of the news we consume.
