Simple Science

Cutting edge science explained simply

Privacy Labels: Are They Misleading Users?

A study reveals discrepancies between app privacy labels and actual data practices.

Apple introduced privacy labels in December 2020 to help users understand the privacy practices of apps in its App Store. Developers are asked to report how their apps collect and use data. However, Apple does not verify these labels. Developers must also provide privacy policies that describe their data practices in more detail. This study looks at how well privacy labels match what is stated in these privacy policies.

Data Collection and Analysis

In this study, we analyzed more than half a million (515,920) apps to compare their privacy labels and privacy policies. We used a tool called Polisis, which analyzes privacy policies using natural language processing (NLP). This tool helps to identify how apps collect and manage user data.

Our results showed significant gaps between what apps state in their privacy labels and their data practices as described in the privacy policies. For around 287,000 apps, the privacy policy suggested more collection of data linked to users than the privacy label reported. Alarmingly, 97% of apps whose labels claimed no data was collected had privacy policies indicating otherwise.

Discrepancies Between Labels and Policies

Many developers used templates to create their privacy policies, leading to inconsistencies. The study also found that developers often misunderstood Apple's definitions and requirements, which contributed to inaccurate reporting of data collection practices.

Privacy policies are often filled with legal jargon, making it hard for users to understand how their data is being handled. On the other hand, the simpler privacy labels were designed to present information similar to food nutrition labels, making it easier for users to grasp the general data practices of apps.

Privacy labels ask developers to self-report their data collection practices. This includes details like the type of data collected, how it is used, and whether it is linked to user identities. Our analysis showed that privacy labels did not always reflect the reality of data practices described in the privacy policies.
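
To make the comparison concrete, here is a minimal Python sketch of how a label and a policy could be represented and compared. It is an illustration only, not the study's pipeline: the `DataPractice` type, the category names, and the `undisclosed_practices` helper are all hypothetical, and the policy-side entries are assumed to have already been extracted by some text analysis step.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataPractice:
    """One self-reported or policy-derived practice (hypothetical schema)."""
    data_type: str        # e.g. "location", "contact_info" (illustrative names)
    purpose: str          # e.g. "analytics", "marketing" (illustrative names)
    linked_to_user: bool  # whether the data is tied to the user's identity


def undisclosed_practices(
    label: set[DataPractice], policy: set[DataPractice]
) -> set[DataPractice]:
    """Return practices described in the policy but missing from the label."""
    return policy - label


# A "Data Not Collected" label declares no practices at all.
label: set[DataPractice] = set()

# Assumed output of analyzing the policy text (made-up example data).
policy = {
    DataPractice("location", "analytics", True),
    DataPractice("contact_info", "marketing", True),
}

for practice in sorted(undisclosed_practices(label, policy), key=lambda p: p.data_type):
    print(f"Policy mentions {practice.data_type} for {practice.purpose}, label does not")
```

Treating each practice as a (data type, purpose, linked) triple keeps the comparison a plain set difference. The hard part, which this sketch skips, is turning free-form policy text into those triples.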

Findings from the Analysis

Number of Apps Analyzed

We looked at 515,920 apps that had both privacy labels and privacy policies. About half of these apps had accessible privacy policies. As part of our study, we calculated the discrepancies in reported practices and found widespread mislabeling.

Data Collection Practices

Most notably, 97% of the apps whose labels claimed that no data was collected had privacy policies indicating otherwise. This discrepancy raises concerns about whether users understand the risks of using these apps.

Different Types of Apps

In our analysis, we compared free and paid apps. Paid apps declared fewer data practices in their privacy labels than free apps. However, the policies told a different story. Only 6% of paid apps claimed in their labels to collect user-linked data, while their policies suggested that 83% of paid apps performed such collection.

Handling of Child Data

We also discovered that many apps, particularly those rated for age 4 and up, did not adequately address how they managed data collected from children. Although 81% of the analyzed apps had a self-assigned content rating of 4+, only 46% had policies in place to handle data collected from children.

The Role of Templates

Our study found that 58% of the apps evaluated likely used templates for their privacy policies. This could explain some of the discrepancies we observed. Privacy policy templates can save time and help with legal compliance, but they can lead to generic statements that do not accurately represent an app's specific data practices.
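
The summary does not say how template use was detected. One common way to spot near-duplicate documents is to compare overlapping word sequences ("shingles") with Jaccard similarity, sketched below in Python. The function names, the shingle size `k=3`, and the `0.5` threshold are assumptions chosen for illustration, not values from the study.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Split text into lowercase words and return the set of k-word windows."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_same_template(policy_a: str, policy_b: str, threshold: float = 0.5) -> bool:
    """Heuristic: two policies sharing most of their shingles probably share a template."""
    return jaccard(shingles(policy_a), shingles(policy_b)) >= threshold


# Made-up policies that differ only in the app name.
policy_a = ("Acme Chat built this app as a free app. This service is provided "
            "by Acme Chat at no cost and is intended for use as is.")
policy_b = ("Bolt Notes built this app as a free app. This service is provided "
            "by Bolt Notes at no cost and is intended for use as is.")

print(likely_same_template(policy_a, policy_b))
```

A pairwise comparison like this does not scale to hundreds of thousands of policies on its own. At that size one would typically hash or cluster the shingle sets first.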

Developers need to tailor templates to better reflect their actual practices. Using templates without customization can result in misleading privacy disclosures, making it difficult for users to understand what information is being collected and how it is used.

Case Studies of Selected Apps

To further illustrate these discrepancies, we conducted case studies of selected apps. We tested the apps to see how their stated data collection matched observed behaviors.

Example 1: Subsplash

Subsplash provides a platform for church services, and the many apps built on it share the same privacy policy. All of these apps indicated in their labels that they did not collect data linked to users. However, the shared privacy policy described a range of collected data, including personal information and location data, which contradicted the labels.

Example 2: ChowNow

ChowNow operates a food ordering platform. While their privacy policy indicated that they used personal data for managing user accounts and marketing, their privacy label suggested no data was collected. This inconsistency highlights the need for accurate labeling.

Example 3: Credit Karma

Credit Karma is a financial service app. Their privacy policy noted that they shared personal information with various entities, yet their privacy label did not report any data used for tracking. This contradiction further emphasizes the gap between stated practices and actual behavior.

Implications of Findings

The discrepancies we found have serious implications for users, who rely on privacy labels to make informed decisions about which apps to use. Misleading labels can cause users to underestimate privacy risks, potentially exposing them to data tracking and misuse.

Recommendations for Improvement

Guidance for Developers

Developers should receive clearer guidance on how to accurately report their data practices. Better training on Apple's definitions and expectations would help them label their apps correctly.

Improved Verification

Apple could implement a verification process for app privacy labels. Tools like Polisis could assist in performing initial checks to flag potential discrepancies before apps are published on the store.
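
As a rough picture of what such an initial check might look like, the Python sketch below flags apps whose label says no data is collected while the policy text mentions common kinds of data. Plain keyword matching stands in here for a trained language model, and the `COLLECTION_CUES` patterns and the `flag_label` function are hypothetical examples, not part of Polisis or of any Apple tooling.

```python
import re

# Hypothetical cue patterns per data category. A real checker would rely on a
# trained classifier rather than keywords.
COLLECTION_CUES = {
    "location": r"\b(location|gps|geolocation)\b",
    "contact info": r"\b(email address|phone number|mailing address)\b",
    "identifiers": r"\b(device id|advertising id|ip address)\b",
}


def flag_label(label_says_not_collected: bool, policy_text: str) -> list[str]:
    """Return data categories the policy mentions despite a 'not collected' label."""
    if not label_says_not_collected:
        return []  # this sketch only checks the "Data Not Collected" case
    text = policy_text.lower()
    return [
        category
        for category, pattern in COLLECTION_CUES.items()
        if re.search(pattern, text)
    ]


warnings = flag_label(True, "We may collect your email address and GPS location.")
for category in warnings:
    print(f"Label says no data collected, but the policy mentions: {category}")
```

Keyword matching also fires on negated statements such as "we do not collect your location", so a check like this can only prompt a developer to review the label. It cannot decide on its own that the label is wrong.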

Clearer Communication

Privacy labels need to be clearer and more informative. Apple should consider revising the language and structure of the labels to better represent the nuances of data sharing and collection practices.

Conclusion

The introduction of privacy labels was intended to empower users by providing transparent information about app data practices. However, our analysis shows significant misalignment between reported practices and actual policies, which could mislead users. Addressing these discrepancies is crucial to improving the privacy landscape in app usage. Ensuring accurate, clear, and user-friendly privacy disclosures will enhance user understanding and trust in the applications they use.

Original Source

Title: Honesty is the Best Policy: On the Accuracy of Apple Privacy Labels Compared to Apps' Privacy Policies

Abstract: Apple introduced privacy labels in Dec. 2020 as a way for developers to report the privacy behaviors of their apps. While Apple does not validate labels, they also require developers to provide a privacy policy, which offers an important comparison point. In this paper, we fine-tuned BERT-based language models to extract privacy policy features for 474,669 apps on the iOS App Store, comparing the output to the privacy labels. We identify discrepancies between the policies and the labels, particularly as they relate to data collected linked to users. We find that 228K apps' privacy policies may indicate data collection linked to users than what is reported in the privacy labels. More alarming, a large number (97%) of the apps with a Data Not Collected privacy label have a privacy policy indicating otherwise. We provide insights into potential sources for discrepancies, including the use of templates and confusion around Apple's definitions and requirements. These results suggest that significant work is still needed to help developers more accurately label their apps. Our system can be incorporated as a first-order check to inform developers when privacy labels are possibly misapplied.

Authors: Mir Masood Ali, David G. Balash, Monica Kodwani, Chris Kanich, Adam J. Aviv

Last Update: 2024-06-16

Language: English

Source URL: https://arxiv.org/abs/2306.17063

Source PDF: https://arxiv.org/pdf/2306.17063

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arXiv for use of its open access interoperability.