Simple Science

Cutting edge science explained simply


Understanding the Impact of TikTok's Sexual Content

Examining how TikTok videos affect youth attitudes toward sex.




In recent years, TikTok has become a widely used platform, especially among young people. While it offers entertaining content, it can also expose viewers to sensitive topics, including sexual content. It is important to differentiate between videos meant to educate about sex and those that are sexually suggestive. This distinction matters because the two types of content affect children and teens differently.

To help with this issue, a new dataset called SexTok has been created. This dataset includes TikTok videos labeled as sexually suggestive, educational about sex, or neither. With this information, we can better understand what kinds of videos young people are watching and how those videos might affect their development.

The Need for Separation

Research has shown that children exposed to sexually suggestive content may develop unhealthy attitudes about sex. On the other hand, educational videos provide valuable information, particularly for LGBTQIA+ youth, who might not have access to appropriate guidance elsewhere. TikTok, with its vast reach, can play an important role in delivering sex education in a way that feels private and inclusive.

However, the platform's current system removes or penalizes some videos of both types, even though they serve different purposes, and this can undermine the educational side. This is where the SexTok dataset comes in. It contains video URLs together with audio transcriptions, making the videos easier to analyze and classify.

Gathering Data

The data was collected by watching a number of TikTok videos and categorizing them accordingly. The dataset contains 1000 links to TikTok videos along with their labels. Each video was labeled based on its perceived intent: sexually suggestive, educational, or neither.

Additionally, the dataset includes gender expression labels, which categorize how gender is presented in the videos. This feature is critical for evaluating any biases in the classification models that will later analyze the videos.

Classifying Videos

To better understand the content of these videos, two transformer-based models were tested on the task of classifying them. Although the task seems simple at first, determining whether a video is educational or sexually suggestive is complex.

The first model focuses on the audio transcripts of the videos. It can capture the words being spoken and assess if they align more with educational content or suggestive content. The second model analyzes the videos themselves to gather visual information. Both approaches have their strengths and weaknesses, and combining them may provide the best results.
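One simple way to combine the two approaches is late fusion: each model produces a probability for every class, and the two sets of probabilities are averaged. The sketch below is only an illustration of that idea, not the method used in the original paper. The label names, the function names, and the `text_weight` parameter are all assumptions made for this example.

```python
# Hypothetical label names for the three classes described in the article.
LABELS = ("suggestive", "educational", "neither")


def fuse_probabilities(text_probs, video_probs, text_weight=0.5):
    """Weighted average of per-class probabilities from two models.

    text_probs and video_probs map each label to a probability.
    text_weight is an assumed tuning knob, not a value from the paper.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    video_weight = 1.0 - text_weight
    return {
        label: text_weight * text_probs[label] + video_weight * video_probs[label]
        for label in LABELS
    }


def predict(text_probs, video_probs, text_weight=0.5):
    """Return the label with the highest fused probability."""
    fused = fuse_probabilities(text_probs, video_probs, text_weight)
    return max(fused, key=fused.get)
```

With equal weights, a confident prediction from one model can outvote a less certain prediction from the other. Setting `text_weight` to 0 or 1 reduces the sketch to a single model.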

Challenges in Classification

Differentiating between educational videos and those that are suggestive is not straightforward. The subjective nature of sexual suggestiveness means that one person's view may differ from another's. The language used in the videos also plays a significant role. For example, euphemistic phrases can lead to confusion in categorization.

Moreover, some videos may not have spoken words at all, making audio analysis impossible. In those cases, the video content needs to be heavily analyzed for visual clues.
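A classification pipeline could handle such videos by checking whether a transcript is usable before deciding which inputs to rely on. The following is a minimal sketch under stated assumptions: the function names and the `min_words` threshold are hypothetical and do not come from the original paper.

```python
def has_usable_transcript(transcript, min_words=3):
    """Return True if the transcript has at least min_words words.

    min_words is an assumed threshold chosen for illustration only.
    """
    return transcript is not None and len(transcript.split()) >= min_words


def choose_modality(transcript, min_words=3):
    """Pick which inputs a classifier should rely on for one video."""
    if has_usable_transcript(transcript, min_words):
        return "text+video"
    return "video-only"
```

A video with no speech, or with only a word or two picked up over background music, would be routed to visual analysis alone.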

Results from Models

The initial results show that the task is challenging but learnable. The transformer-based models classified a significant number of videos correctly. However, some videos caused confusion, particularly those that sent mixed signals by being both educational and suggestive at the same time.

The results indicated that text analysis, when available, is a strong indicator of educational content. However, suggestive videos tended to be shorter and often had music or other distractions that could mislead the classification.

Dataset Overview

The SexTok dataset includes a variety of videos, providing a more realistic portrayal of TikTok's content. It contains three main features: class label, gender expression, and audio transcription. The videos were collected from various sources within TikTok to allow for a diverse representation of content.
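To make the structure concrete, one entry in such a dataset could be represented as follows. This is a sketch only: the class name, field names, and label strings are assumptions for illustration and may not match the format of the released dataset.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ContentLabel(Enum):
    """The three classes described in the article."""
    SUGGESTIVE = "sexually suggestive"
    EDUCATIONAL = "educational"
    NEITHER = "neither"


@dataclass(frozen=True)
class SexTokRecord:
    """One hypothetical dataset entry; field names are assumed."""
    url: str                          # link to the TikTok video
    label: ContentLabel               # class label
    gender_expression: str            # gender expression label
    transcript: Optional[str] = None  # None when the video has no speech


def count_by_label(records):
    """Count how many records fall into each class."""
    counts = {label: 0 for label in ContentLabel}
    for record in records:
        counts[record.label] += 1
    return counts
```

Counting records per class is a quick first check of how balanced the three categories are before any model is trained.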

Gender Expression

Understanding gender expression in these videos is important for analyzing biases. Gender expression refers to how individuals show their gender identity through their appearance and behavior. The dataset categorizes gender expression into several labels: Feminine, Masculine, Non-conforming, Diverse, and None. This categorization can reveal potential patterns in how different types of content are presented visually.
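One basic way to look for bias is to compare a model's accuracy across the gender expression groups. The sketch below assumes three parallel lists (group, true label, predicted label) and a hypothetical function name. It illustrates the general idea and is not the evaluation procedure from the original paper.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Compute classification accuracy separately for each group.

    groups, y_true, and y_pred are parallel sequences of equal length.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, prediction in zip(groups, y_true, y_pred):
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {group: correct[group] / total[group] for group in total}
```

A large gap in accuracy between groups, for example between videos labeled Feminine and those labeled Masculine, would suggest the model treats similar content differently depending on how gender is presented.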

Importance of Ethics

When collecting data, ethical considerations must be taken into account. The videos in the dataset were viewed with the understanding that they are publicly accessible. However, there is a risk of misrepresentation. The intent behind a video is judged from the annotator's point of view, and this subjectivity should be acknowledged when analyzing the data.

The Role of Algorithms

Current algorithms used for content moderation on platforms like TikTok have shortcomings. They may misclassify videos that are not truly explicit. This misclassification can lead to valuable educational content being taken down, which is counterproductive for users seeking knowledge about sexual health.

The Future of Research

Further research is needed to refine the models and improve classification accuracy. Additionally, the implications of this work could lead to better educational resources being available on social media platforms. Addressing the biases in gender expression and how they affect content perception will also be significant for this line of research.

Conclusion

The development of the SexTok dataset is a step toward better understanding sexual content on platforms like TikTok. By separating sexually suggestive content from educational content, we can help create a safer and more informative space for young users. The findings from this research can help improve how videos are classified and moderated, which in turn could give users better access to sex education.

This ongoing exploration of video content and user interaction with sexual topics is crucial for the well-being of young people today. The conversations surrounding these topics are essential in paving the way for a more informed generation.

Original Source

Title: It is not Sexually Suggestive, It is Educative. Separating Sex Education from Suggestive Content on TikTok Videos

Abstract: We introduce SexTok, a multi-modal dataset composed of TikTok videos labeled as sexually suggestive (from the annotator's point of view), sex-educational content, or neither. Such a dataset is necessary to address the challenge of distinguishing between sexually suggestive content and virtual sex education videos on TikTok. Children's exposure to sexually suggestive videos has been shown to have adversarial effects on their development. Meanwhile, virtual sex education, especially on subjects that are more relevant to the LGBTQIA+ community, is very valuable. The platform's current system removes or penalizes some of both types of videos, even though they serve different purposes. Our dataset contains video URLs, and it is also audio transcribed. To validate its importance, we explore two transformer-based models for classifying the videos. Our preliminary results suggest that the task of distinguishing between these types of videos is learnable but challenging. These experiments suggest that this dataset is meaningful and invites further study on the subject.

Authors: Enfa George, Mihai Surdeanu

Last Update: 2023-07-06 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2307.03274

Source PDF: https://arxiv.org/pdf/2307.03274

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arXiv for use of its open access interoperability.
