Gender Prediction Tools in Cancer Research: A Study

Comparing the accuracy of gender prediction tools on authors of cancer trials.

2025-09-06T10:04:30+00:00 - 1 min read

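Study Methods
Results for the Gender Prediction Tools
Gender Prediction Accuracy by Name Format
Cost and Accessibility of Gender Prediction Tools
Conclusion
Original Source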

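In recent years, one of the most visible problems in science, technology, engineering, and mathematics (STEM) has been how few women hold key roles. The problem is especially clear in cancer clinical trials, where reports show that women rarely lead major studies. The gap seems to have persisted over the past decade.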

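To study gender differences in scientific authorship, researchers sometimes have to infer an author's gender from limited information, most often the name. There are programs and services that predict gender from a name. Well-known ones include the commercial services Genderize and Gender API and the free gender R package.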

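These tools work well on names common in Western cultures but struggle with Asian names. Several factors can affect accuracy, such as special characters or hyphens in a name. In particular, there has been little research on how the way a two-part name is written affects gender prediction accuracy.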

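This study used a large database of cancer trial researchers to compare the accuracy of several gender prediction tools. The researchers broke accuracy down by author nationality and evaluated the different ways of writing two-part names, which are common among authors from countries such as South Korea, China, Singapore, and Taiwan.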

Study Methods

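Three gender prediction tools were tested: Genderize, Gender API, and the gender R package. The researchers used a trusted registry of cancer clinical trial authors whose gender had already been classified, and collected those authors' names and affiliations. The gender classes reflect gender as a social construct, not biological differences.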
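To make "supplying a name, with or without a country, in the API call" concrete, here is a minimal sketch of what a Genderize request and response could look like. The endpoint, the `country_id` parameter, and the response field names are assumptions based on Genderize's public documentation and should be checked before use. The helper names and the sample response body are made up for illustration and are not the study's code or real API output.

```python
import json
from urllib.parse import urlencode

# Assumption: endpoint and parameter names as described in Genderize's public docs.
GENDERIZE_URL = "https://api.genderize.io"


def build_genderize_url(first_name, country_id=None):
    """Build a query URL, optionally with a two-letter country code."""
    params = {"name": first_name}
    if country_id:
        params["country_id"] = country_id
    return f"{GENDERIZE_URL}?{urlencode(params)}"


def parse_prediction(payload):
    """Return (gender, probability); gender is None when no prediction was made."""
    data = json.loads(payload)
    return data.get("gender"), data.get("probability")


# Hypothetical response body, for illustration only.
sample = '{"name": "jean", "gender": "male", "probability": 0.9, "count": 100}'

print(build_genderize_url("Jean-Pierre", "FR"))
print(parse_prediction(sample))
```

No network request is made here, so the sketch can be run offline. Sending the URL and handling rate limits or API keys is left out.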

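Names came mainly from published clinical trials and went through a thorough cleaning process that fixed problems such as special characters, typos, and double surnames. When a name was missing from the database, the original study was checked for the first name. Gender was determined through a combination of automated name matching, online information, and consultation with native speakers. When it could not be determined, the author was marked as "gender unknown."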
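The article does not say how the special characters were handled. One common approach, shown here purely as an assumption, is to decompose accented letters with Unicode normalization and drop the combining marks. The function names are hypothetical.

```python
import unicodedata


def strip_diacritics(name):
    """Remove combining accents, e.g. 'José' becomes 'Jose'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_first_name(name):
    """Strip diacritics and collapse repeated or surrounding whitespace."""
    return " ".join(strip_diacritics(name).split())


print(clean_first_name("  Renée   Marie "))
```

Letters that have no decomposed form (for example "ø") pass through unchanged, so a real pipeline would likely need an extra mapping table for them.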

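Each tool's accuracy was measured as how often it identified gender correctly against the known data. The researchers also looked at the share of names that were predicted incorrectly or received no prediction at all.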
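The three quantities described above can be computed in a few lines. This is a sketch under one assumption: every rate uses the full set of names with a known gender as its denominator, so a missing prediction counts against accuracy. The function name and label strings are illustrative.

```python
def score_predictions(known, predicted):
    """Compare predictions with known labels.

    known: list of labels such as "female" / "male".
    predicted: same labels, or None where the tool returned no prediction.
    """
    if not known or len(known) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = len(known)
    correct = sum(1 for k, p in zip(known, predicted) if p == k)
    unpredicted = sum(1 for p in predicted if p is None)
    misclassified = total - correct - unpredicted
    return {
        "accuracy": correct / total,
        "misclassified": misclassified / total,
        "unpredicted": unpredicted / total,
    }


known = ["female", "male", "male", "female"]
predicted = ["female", "male", "female", None]
print(score_predictions(known, predicted))
```

Reporting the three rates separately makes it visible whether a tool loses accuracy by guessing wrong or by declining to guess.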

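All trial authors with a known gender were analyzed with Genderize and Gender API, both with and without country of origin; the gender R package was run only without country information. The authors also tested how accuracy changed with the way two-part names were written. For example, the name Jean-Pierre was tested in several formats, such as Jean-Pierre, Jean Pierre, and Jeanpierre.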
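The formatting experiment can be pictured as generating every spelling of a two-part name and sending each one to the tool. Below is a small sketch that produces the three formats named above. The function name and the dictionary keys are illustrative choices.

```python
import re


def two_part_variants(name):
    """Return hyphenated, spaced, and joined spellings of a two-part first name."""
    parts = [p for p in re.split(r"[\s\-]+", name.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected a two-part name, got {name!r}")
    first, second = parts
    return {
        "hyphenated": f"{first}-{second}",
        "spaced": f"{first} {second}",
        "joined": first + second.lower(),
    }


for label, variant in two_part_variants("Jean-Pierre").items():
    print(label, variant)
```

Scoring each variant separately against the known labels then shows which format a given tool handles best.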

Genderize and Gender API also report a confidence level with each prediction. The researchers checked how well these confidence levels matched actual accuracy.
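One way to check whether reported confidence tracks real accuracy is to group predictions into confidence bins and compare the mean confidence in each bin with the share of correct predictions. The bin edges below, and the assumption that confidence is expressed on a 0 to 1 scale, are illustrative; the article does not describe the exact procedure used.

```python
from bisect import bisect_right


def calibration_table(records, edges=(0.6, 0.7, 0.8, 0.9)):
    """records: iterable of (confidence, is_correct) pairs.

    Bin i covers confidences from edges[i-1] (inclusive) up to edges[i] (exclusive);
    the first bin holds everything below edges[0], the last everything from edges[-1] up.
    """
    size = len(edges) + 1
    counts, hits, conf_sum = [0] * size, [0] * size, [0.0] * size
    for confidence, is_correct in records:
        i = bisect_right(edges, confidence)
        counts[i] += 1
        hits[i] += int(is_correct)
        conf_sum[i] += confidence
    return [
        {
            "bin": i,
            "n": n,
            "mean_confidence": conf_sum[i] / n,
            "accuracy": hits[i] / n,
        }
        for i, n in enumerate(counts)
        if n > 0
    ]


records = [(0.95, True), (0.95, True), (0.95, False), (0.65, True), (0.65, False)]
for row in calibration_table(records):
    print(row)
```

A well-calibrated tool shows accuracy close to mean confidence in every bin; a large gap in either direction means the confidence score should not be taken at face value.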

Results for the Gender Prediction Tools

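Of the more than 40,000 people in the cancer trial registry, gender could be assessed from the first name for about 37,000. After filtering, the final group contained about 33,000 people (32,968 according to the abstract), of whom about 35% were identified as women. There were about 8,000 unique first names, and only a small fraction of them appeared frequently.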

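Affiliations were known for about 25,000 researchers, and nearly 99% of them belonged to a single country. The most common country was the United States. When no country information was supplied in the prediction call, Genderize reached about 97% accuracy and Gender API about 96% (96.6% and 96.1% in the abstract). The gender R package was considerably less accurate in this scenario, at below 86%.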
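According to the abstract quoted under Original Source, accuracy differences between methods were evaluated with McNemar's test, which compares two tools on the same names by looking only at the names where exactly one of them was right. The abstract does not say which variant was used, so the sketch below assumes the exact (binomial) form. The function name and the toy data are illustrative.

```python
from math import comb


def mcnemar_exact(tool_a_correct, tool_b_correct):
    """Exact two-sided McNemar test on paired correct/incorrect outcomes.

    Returns (b, c, p_value), where b counts names only tool A got right
    and c counts names only tool B got right.
    """
    pairs = list(zip(tool_a_correct, tool_b_correct))
    b = sum(1 for a_ok, b_ok in pairs if a_ok and not b_ok)
    c = sum(1 for a_ok, b_ok in pairs if b_ok and not a_ok)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the tools never disagree
    # Under the null hypothesis, b follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2**n
    return b, c, min(1.0, 2 * tail)


# Toy data: tool A is right on 6 names that tool B misses; both are right on 4 more.
tool_a = [True] * 10
tool_b = [False] * 6 + [True] * 4
print(mcnemar_exact(tool_a, tool_b))
```

Names that both tools get right, or both get wrong, carry no information about which tool is better, which is why only the discordant counts enter the calculation.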

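When country of origin was included, Genderize's accuracy dropped slightly and Gender API's rose slightly. The analysis also showed that, for two-part names, each tool's success depended on the name format. Genderize was most accurate when the two parts were joined with no space, while Gender API worked best when the parts were separated by a space.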

Gender Prediction Accuracy by Name Format

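The study showed that each tool responds in its own way to how a name is written. Genderize reached its highest accuracy when the two parts of a name were joined without a space or any other mark. When the parts were separated by a space, on the other hand, Genderize failed to return a prediction at all. Gender API, by contrast, did best when a space was used between the two parts.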

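Overall, both tools were less accurate on names of Asian origin than on Western names: accuracy was below 82% for South Korean, Chinese, Singaporean, and Taiwanese authors. This is not surprising, since earlier studies pointed out similar problems. Even so, accuracy appears to have improved in recent years, as seen in particular in the re-analysis of previously published datasets.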

Cost and Accessibility of Gender Prediction Tools

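Services like Genderize and Gender API offer easy-to-use platforms, but their pricing differs. Genderize provides a limited number of free predictions each day, while Gender API's free tier is more restricted and its subscription access costs more. In general, Genderize is the more affordable option for large-scale evaluations; the abstract puts it at 4.85x less expensive than Gender API.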

Conclusion

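In this analysis, Genderize and Gender API both predicted gender with impressive overall accuracy, especially for consistently formatted Western names. When no country information was included, Genderize was slightly more accurate. When country data was supplied, however, Gender API performed better.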

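Both tools have serious limitations, especially with Asian names and multi-part names, which underlines the need for further improvement. They also operate on a binary understanding of gender, which does not reflect the growing diversity of gender identities. Going forward, bringing more inclusive approaches into gender prediction will be important for producing valid and respectful results.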

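As this study shows, gender prediction tools can offer useful insights, but they are not perfect and come with a range of challenges. Making them effective and inclusive requires ongoing evaluation and adjustment, especially as research demographics change over time. The findings here provide a baseline for future research and also highlight the importance of refining these tools so that they serve a broader population well.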

Original Source

Title: Inferring Gender from First Names: Comparing the Accuracy of Genderize, Gender API, and the gender R Package on Authors of Diverse Nationality

Abstract: Meta-researchers commonly leverage tools that infer gender from first names, especially when studying gender disparities. However, tools vary in their accuracy, ease of use, and cost. The objective of this study was to compare the accuracy and cost of the commercial software Genderize and Gender API, and the open-source gender R package. Differences in binary gender prediction accuracy between the three services were evaluated. Gender prediction accuracy was tested on a multi-national dataset of 32,968 gender-labeled clinical trial authors. Additionally, two datasets from previous studies with 5779 and 6131 names, respectively, were re-evaluated with modern implementations of Genderize and Gender API. The gender inference accuracy of Genderize and Gender API were compared, both with and without supplying trialists' country of origin in the API call. The accuracy of the gender R package was only evaluated without supplying countries of origin. The accuracy of Genderize, Gender API, and the gender R package were defined as the percentage of correct gender predictions. Accuracy differences between methods were evaluated using McNemar's test. Genderize and Gender API demonstrated overall 96.6% and 96.1% accuracy, respectively, when countries of origin were not supplied in the API calls. Genderize and Gender API achieved the highest accuracy when predicting the gender of German authors with accuracies greater than 98%. Genderize and Gender API were least accurate with South Korean, Chinese, Singaporean, and Taiwanese authors, demonstrating below 82% accuracy. The gender R package achieved below 86% accuracy on the full dataset. In the replication studies, Genderize and Gender API demonstrated better performance than in the original publications. Our results indicate that Genderize and Gender API are highly accurate, except when evaluating South Korean, Chinese, Singaporean, and Taiwanese names. We also demonstrated that Genderize can provide similar accuracy to Gender API while being 4.85x less expensive.

Author Summary: Gender disparities in academia have prompted researchers to investigate gender gaps in professorship roles and publication authorship. Of particular concern are the gender gaps in cancer clinical trial authorship. Methodologies that evaluate gender disparities in academia often rely on tools that infer gender from first names. Tools that predict gender from first names are often used in methodologies that determine the gender ratios of academic departments or publishing authors in a discipline. However, researchers must choose between different gender predicting tools that vary in their accuracy, ease of use, and cost. We evaluated the binary gender prediction accuracy of Genderize, Gender API, and the gender R package on a gold-standard dataset of 32,968 clinical trialists from around the world. Genderize and Gender API cost money to use, while the gender R package is free and open source. We found that Genderize and Gender API were more accurate than the gender R package. In addition, Genderize is cheaper than Gender API, but is more sensitive to inconsistencies in name formatting and the presence of diacritical marks. Both Genderize and Gender API were most accurate with western names.