Shaping Language for All: The Gender-Fair Challenge

Promoting inclusivity through gender-fair language in writing and translation.

Simona Frenda, Andrea Piergentili, Beatrice Savoldi, Marco Madeddu, Martina Rosola, Silvia Casola, Chiara Ferrando, Viviana Patti, Matteo Negri, Luisa Bentivogli

Gender-fair language is all about making sure everyone feels included, no matter their gender. It’s about using words that don’t favor one gender over another. This becomes especially tricky in heavily gender-marked languages like Italian, where nouns, articles, and adjectives take specific gender forms. You might ask, "Why does this matter?" Well, words shape our thoughts, perceptions, and even our world. Using fair language helps everyone feel represented and valued.

The Gender-Fair Generation Challenge

To promote the use of gender-fair language, the Gender-Fair Generation (GFG) challenge focuses on identifying and transforming biased expressions in writing. It has three key tasks:

  1. Finding Gendered Expressions: The first task is to spot those sneaky, gendered phrases lurking in sentences.
  2. Rewriting for Fairness: The second task is to creatively change those phrases into gender-neutral alternatives.
  3. Fair Translation: The last task is to ensure translations from English to Italian maintain gender neutrality when needed.

Let’s break down each of these tasks.

Task 1: Finding Gendered Expressions

In the first task, participants must identify phrases that are gender-specific in Italian sentences. For example, using the masculine plural "i cittadini" (the citizens) to refer to a mixed-gender group is not very inclusive. Instead of using masculine terms for a mixed group, we want phrases that recognize everyone.

So, the challenge is to train systems to spot phrases that highlight only one gender, whether overtly or subtly. This involves looking at various forms like:

  • Overextended Masculine/Feminine: Using a single gendered form for a group that includes people of different genders, such as saying "the citizens" in the masculine form only.
  • Generic Terms: Using masculine terms to refer to anyone, like the masculine form of "the candidate" meaning any candidate regardless of gender.
  • Incongruous Gender: Using a gender term that doesn’t match the person being referred to, like calling a woman "professore" (a masculine term).
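
To make the detection task concrete, here is a minimal sketch of a lexicon-lookup baseline that returns character spans for gendered phrases. The phrase list and the function name `find_gendered_spans` are hypothetical and chosen only for illustration; they do not come from the challenge, and a real system would need context to tell a generic masculine from a correctly gendered one.

```python
import re

# Hypothetical mini-lexicon of masculine forms that may be used generically.
# This is an illustrative assumption, not data from the challenge.
GENERIC_MASCULINE = ["i cittadini", "i professori", "il candidato"]

def find_gendered_spans(sentence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, text) for every lexicon phrase found in the sentence."""
    spans = []
    for phrase in GENERIC_MASCULINE:
        # Word boundaries avoid matching inside longer words.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), match.group(0)))
    return sorted(spans)

sentence = "I cittadini possono scrivere al comune."
spans = find_gendered_spans(sentence)
print(spans)
```

A lookup like this cannot tell whether "i cittadini" really refers to a mixed group, which is exactly the judgment the task asks models to make.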

Task 2: Rewriting for Fairness

After spotting gendered expressions in the first task, it’s time for a little creativity in the second task. Here, participants get to rewrite those phrases into gender-fair language. There are two main strategies:

  1. Conservative Obscuration: This approach avoids mentioning gender altogether. For instance, instead of saying "i professori" (the professors), one might say "il corpo docente" (the teaching body).

  2. Innovative Obscuration: This strategy introduces new forms that are gender-neutral. Imagine calling a professor "lǝ professorǝ" instead of using clearly male or female terms.

By turning gendered expressions into fair language, this task aims at making communication more inclusive.
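
The two strategies can be pictured as two different lookups for the same kind of input. The sketch below is only an illustration: the `Strategy` enum, the `REWRITES` table, and the `rewrite` function are hypothetical names, and the table holds just the two examples from the text (with "il professore" assumed as the gendered counterpart of "lǝ professorǝ").

```python
from enum import Enum
from typing import Optional

class Strategy(Enum):
    CONSERVATIVE = "conservative"  # avoid marking gender using existing words
    INNOVATIVE = "innovative"      # use new gender-neutral forms

# Toy rewrite table built from the examples in the text (an assumption,
# not the challenge's data or method).
REWRITES = {
    ("i professori", Strategy.CONSERVATIVE): "il corpo docente",
    ("il professore", Strategy.INNOVATIVE): "lǝ professorǝ",
}

def rewrite(phrase: str, strategy: Strategy) -> Optional[str]:
    """Return a gender-fair alternative, or None if the table has no entry."""
    return REWRITES.get((phrase.lower(), strategy))

print(rewrite("I professori", Strategy.CONSERVATIVE))
print(rewrite("il professore", Strategy.INNOVATIVE))
```

Returning `None` for unknown phrases keeps the sketch honest: a fixed table cannot generalize, which is why the task is posed to generative models.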

Task 3: Fair Translation

The last task takes a bilingual approach. It looks at how well translations from English to Italian can keep things fair. Let’s say you have the sentence "I am glad to know such knowledgeable doctors." In this case, an ideal translation would be "Sono felice di conoscere un personale medico così preparato," which avoids emphasizing gender.

This task challenges systems to handle both gendered and gender-neutral translations appropriately. Sometimes, the gender is clear and should be used, while other times it should be avoided altogether. A delicate balance, much like trying to walk a tightrope while juggling!

Data Sets for the Challenge

To make this challenge happen, several data sets have been put together. Each one provides examples for models to learn from.

  1. GFL-it Corpus: This collection includes Italian texts extracted from administrative documents provided by the University of Brescia. Annotators have marked the sections that contain gendered expressions, making it easier for models to learn what to look for.

  2. GeNTE: This bilingual test set, built on a subset of the Europarl dataset, supports gender-neutral rewriting and translation. It includes English sentences alongside gendered and gender-neutral Italian translations. The goal is to see if models can navigate between these forms correctly.

  3. Neo-GATE: Like GeNTE, this is a bilingual test set, but it focuses on innovative gender-neutral forms (non-binary neomorphemes) in Italian. It includes English sentences that don't give away gender, allowing for creative Italian translations.

These data sets are essential for training systems and improving their understanding of gender-fair language.

Evaluating the Models

As participants engage in the tasks, their results are evaluated against specific criteria. For task 1, models are scored on how well the expressions they identify match the annotations. According to the paper's abstract, the score is the average F1 obtained with BERTScore on each dataset entry; F1 balances precision and recall rather than measuring plain accuracy. The closer the match with the annotations, the better.
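
F1 is the harmonic mean of precision and recall. The paper's abstract says the task 1 score is obtained with BERTScore, which compares texts through model embeddings; the sketch below deliberately swaps in plain token overlap so it stays self-contained. The function name `token_f1` is hypothetical, and this is a simplified stand-in, not the challenge's metric.

```python
from collections import Counter

def token_f1(predicted: str, reference: str) -> float:
    """Token-overlap F1 between a predicted span and the annotated span."""
    pred_tokens = predicted.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a perfect match; one empty is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)  # share of predicted tokens that are right
    recall = overlap / len(ref_tokens)      # share of reference tokens that were found
    return 2 * precision * recall / (precision + recall)

# Prediction misses one of three annotated tokens:
# precision = 2/2, recall = 2/3, so F1 is about 0.8.
print(round(token_f1("i cittadini", "i cittadini italiani"), 3))
```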

For task 2, the focus shifts to a classifier that labels each reformulated sentence as gender-neutral or not. Accuracy is then measured from those labels.

In task 3, the emphasis is on translation. The models need to decide when to use gendered terms and when to stick with neutral language. According to the paper's abstract, tasks 2 and 3 are also scored with a coverage-weighted accuracy.
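
The paper's abstract names a coverage-weighted accuracy for tasks 2 and 3, but this summary does not define it. The sketch below shows one plausible reading, stated here as an assumption: coverage is the share of annotated terms that can be located in the output, accuracy is the share of located terms that use the desired form, and the final score is their product. The function name and parameters are hypothetical.

```python
def coverage_weighted_accuracy(num_total: int, num_matched: int, num_correct: int) -> float:
    """Assumed definition: coverage * accuracy.

    num_total   -- annotated terms the evaluation looks for
    num_matched -- terms actually found in the model output (in any form)
    num_correct -- found terms that use the desired gender-fair form
    """
    if num_total == 0 or num_matched == 0:
        return 0.0
    coverage = num_matched / num_total
    accuracy = num_correct / num_matched
    return coverage * accuracy

# 80 of 100 terms found, 60 of those in the desired form:
# coverage 0.8, accuracy 0.75, score about 0.6.
print(round(coverage_weighted_accuracy(100, 80, 60), 3))
```

Weighting by coverage penalizes a system that scores well only on the few terms the evaluation manages to find.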

Limitations of the Challenge

While the challenge is designed to promote gender-fair language, it's not without its flaws. The data sets only encompass specific areas, like official documents or specific institutional contexts. This means future research could benefit from having a wider range of sources and perspectives.

Moreover, the current approaches to metrics and evaluation might only be the beginning. More refined methods should be explored to assess models thoroughly. There's also the fact that only one innovative paradigm, the schwa-simple, is used, while many other ways of expressing gender-neutral ideas exist.

Ethical Considerations

The challenge raises important ethical questions. By working toward reducing gender-biased language, the aim is to elevate the voices of those who are often overlooked. But the team behind this effort acknowledges their shortcomings, such as having an imbalance in their group of annotators.

Also, there’s a valid concern about accessibility. Some people might find it challenging to read terms employing innovative gender-neutral markers, especially those with reading difficulties. However, there's room for flexibility. Individuals can choose which terms work best for them, allowing for a more user-friendly experience.

The Schwa-Simple Paradigm

One creative tool in the toolbox of gender-neutral language is the schwa-simple paradigm. This method replaces gender-marked word endings with the schwa character (ǝ), offering flexibility. Here's how it works:

  • Masculine terms like "professore" can be replaced with "professorǝ" to include everyone, whether they're male, female, or non-binary.
  • The paradigm includes a variety of forms to cover many situations, providing options that can be tailored to different contexts.

This paradigm is a playful way to challenge conventional language norms and inspire inclusivity.
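
As a rough illustration of the "professore" to "professorǝ" example, the sketch below swaps a word's final vowel for the schwa. The function name `to_schwa` is hypothetical, and the rule is a deliberately crude assumption: the actual paradigm covers articles and other forms that a single suffix swap does not handle, and words with dedicated feminine suffixes would come out wrong.

```python
SCHWA = "ǝ"      # same character the article uses
VOWELS = "aeio"  # typical final vowels of Italian gendered endings

def to_schwa(word: str) -> str:
    """Replace a final gender-marking vowel with the schwa (toy rule only)."""
    if word and word[-1].lower() in VOWELS:
        return word[:-1] + SCHWA
    return word  # leave words without a final vowel untouched

print(to_schwa("professore"))
```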

Conclusion

The push for gender-fair language is more than just a trendy topic; it's a significant movement toward inclusivity and representation. By identifying, rewriting, and translating language to be fair to all genders, we are helping to shape a world where everyone feels acknowledged and valued.

In a nutshell, this challenge aims to break down barriers in language and create a more equitable communication space. And while challenges remain, the progress made is a step in the right direction. Who knew words could make such a big difference?

Original Source

Title: GFG -- Gender-Fair Generation: A CALAMITA Challenge

Abstract: Gender-fair language aims at promoting gender equality by using terms and expressions that include all identities and avoid reinforcing gender stereotypes. Implementing gender-fair strategies is particularly challenging in heavily gender-marked languages, such as Italian. To address this, the Gender-Fair Generation challenge intends to help shift toward gender-fair language in written communication. The challenge, designed to assess and monitor the recognition and generation of gender-fair language in both mono- and cross-lingual scenarios, includes three tasks: (1) the detection of gendered expressions in Italian sentences, (2) the reformulation of gendered expressions into gender-fair alternatives, and (3) the generation of gender-fair language in automatic translation from English to Italian. The challenge relies on three different annotated datasets: the GFL-it corpus, which contains Italian texts extracted from administrative documents provided by the University of Brescia; GeNTE, a bilingual test set for gender-neutral rewriting and translation built upon a subset of the Europarl dataset; and Neo-GATE, a bilingual test set designed to assess the use of non-binary neomorphemes in Italian for both fair formulation and translation tasks. Finally, each task is evaluated with specific metrics: average of F1-score obtained by means of BERTScore computed on each entry of the datasets for task 1, an accuracy measured with a gender-neutral classifier, and a coverage-weighted accuracy for tasks 2 and 3.

Authors: Simona Frenda, Andrea Piergentili, Beatrice Savoldi, Marco Madeddu, Martina Rosola, Silvia Casola, Chiara Ferrando, Viviana Patti, Matteo Negri, Luisa Bentivogli

Last Update: 2024-12-30 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2412.19168

Source PDF: https://arxiv.org/pdf/2412.19168

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
