Wikimedia Research/Showcase/Archive/2023/06

June 2023

Time
16:30 UTC
Theme
Wikimedia and LGBTQIA+

June 21, 2023 Video: YouTube

Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia
By Chan Park, Carnegie Mellon University
Abstract: In this talk, I present our research on the portrayal of LGBT individuals in their Wikipedia biographies, with a particular focus on subtle word connotations and cross-cultural comparisons. We address two primary research questions: 1) How can we effectively measure the nuanced connotations of words in multilingual texts, connotations that reflect sentiment, power dynamics, and agency? 2) How can we analyze the portrayal of a specific group, such as the LGBT community, and compare these portrayals across languages? To answer these questions, we collect the Multilingual Contextualized Connotation Frames dataset, comprising 2,700 examples in English, Spanish, and Russian. We also develop a new multilingual model based on pre-trained multilingual language models. Additionally, we devise a matching algorithm that constructs a comparison corpus for the target corpus, isolating the attribute of interest. Finally, we show how these models and corpora enable a cross-cultural analysis of how LGBT people are portrayed on Wikipedia. Our results reveal systematic differences in how the LGBT community is portrayed across languages, surfacing cultural differences in narratives and signs of social bias.


How do you represent my gender? Challenges and opportunities from the Wikidata Gender Diversity project
By Daniele Metilli, University College London
Abstract: Wikidata Gender Diversity (WiGeDi) is a one-year project funded through the Wikimedia Research Fund. The project studies gender diversity in Wikidata, focusing on marginalized gender identities such as those of trans and non-binary people, and adopting a queer and intersectional feminist perspective. The project is organised in three strands: model, data, and community. First, we are looking at how the current Wikidata ontology model represents gender, and the extent to which this representation is inclusive of marginalized gender identities. Second, we are analysing the data stored in the knowledge base to gather insights and identify possible gaps and biases. Finally, we are looking at how the community has handled the move towards the inclusion of a wider spectrum of gender identities by studying a corpus of user discussions through computational linguistics methods. This presentation will report on the current status of the Wikidata Gender Diversity project and its envisioned outcomes. We will discuss the main challenges that we are facing and the opportunities that our project may enable, on Wikidata and beyond.