From mediawiki.org

Welcome to Humaniki! If you are unsure what Humaniki is, this page answers some common questions.

What is humaniki?

Humaniki provides statistics about the gender gap in the content of all Wikimedia projects, based on data available on Wikidata. The data is available under a Creative Commons license and is free for anyone to use!

Why doesn’t humaniki reflect the edits I made yesterday?

Although reflecting edits immediately would be useful, the queries that humaniki runs are too large for the community SQL/SPARQL services. Instead, to compute the statistics, humaniki downloads the entire Wikidata dump and parses it with Wikidata Toolkit. This whole process takes about 3 days. Here is an empirical example: Wikidata created a dump file reflecting data up to 2021-03-22. Creating that file takes many hours, and it "landed" on dumps.wikimedia.org at 2021-03-24@12:53 (UTC). Humaniki checks every day whether a new dump is available. Once it finds one, processing takes approximately 6 more hours; logs show that the new humaniki data went live at 2021-03-25@04:18. Because the data dumps are released once per week and processing takes three days, the delay depends on when in the week you edited: the minimum time for an edit to be reflected in humaniki is 3 days, and the maximum is (7+3)=10 days.
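The arithmetic in this answer can be checked with a short Python sketch. It uses only the timestamps quoted above; treating "data up to 2021-03-22" as midnight UTC at the start of that day is an assumption, and the names `data_cutoff`, `dump_landed`, `humaniki_live` and `lag_days` are made up for illustration.

```python
from datetime import datetime, timedelta

# Timestamps from the example above (UTC).
# Assumption: "data up to 2021-03-22" is read as 00:00 on that day.
data_cutoff = datetime(2021, 3, 22)
dump_landed = datetime(2021, 3, 24, 12, 53)    # dump appears on dumps.wikimedia.org
humaniki_live = datetime(2021, 3, 25, 4, 18)   # new humaniki data goes live

publish_delay = dump_landed - data_cutoff      # time Wikidata needs to publish the dump
pickup_delay = humaniki_live - dump_landed     # daily check plus humaniki processing
total_lag = humaniki_live - data_cutoff        # roughly the "3 days" quoted above

PIPELINE_DAYS = 3        # dump creation plus humaniki processing, rounded
DUMP_INTERVAL_DAYS = 7   # dumps are released once per week


def lag_days(days_before_cutoff: int) -> int:
    """Days until an edit shows up, given how long before the next cutoff it was made."""
    return days_before_cutoff + PIPELINE_DAYS


min_lag = lag_days(0)                    # edited right before the cutoff
max_lag = lag_days(DUMP_INTERVAL_DAYS)   # edited right after the previous cutoff

print(publish_delay, pickup_delay, total_lag, min_lag, max_lag)
```

Under that reading of the cutoff, the example dump took a little over three days end to end, which matches the 3-day minimum and the (7+3)=10-day maximum stated above.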

How does humaniki align with the Wikimedia Movement’s 2030 strategy?

Knowledge Gaps Taxonomy (edited)

In response to the Wikimedia Movement’s 2030 strategic direction, the Research team at the Wikimedia Foundation has developed a framework to understand and measure knowledge gaps. One of the three dimensions of knowledge gaps is content, which is further broken down into diversity, accessibility and policy, with gender as one of the important factors in the content diversity dimension. Humaniki provides metrics for quantifying the gender gap in Wikimedia projects, and also makes open data available through its API.

I am new to humaniki. What can I do with this tool?

Link to Demo Video: https://www.youtube.com/watch?v=0cbPWeJ8PiQ

How can I contribute?

  1. Add data labels for missing fields in Wikidata.
  2. Help with the project code and build extensions.

Please view our Contribution Guide to learn more.

What data does it use?

Humaniki Architecture

Humaniki uses Wikidata, the centralized knowledge base of Wikimedia projects, to generate statistics. It imports only data whose properties are associated with humans; everything else is left out.

How is gender structured in Wikidata?

Wikidata is a free, collaborative, multilingual, secondary database, collecting structured data to provide support for Wikipedia, Wikimedia Commons, the other wikis of the Wikimedia movement, and to anyone in the world (Read more).

The Wikidata repository consists mainly of items, each one having a label, a description and any number of aliases. Items are uniquely identified by a Q followed by a number, such as Kamala Harris (Q10853588).

This diagram of a Wikidata item shows you some of the most important terms in Wikidata.
Item             Property         Value
Q10853588        P21              Q6581072
Kamala Harris    sex or gender    female

For a person (human), you can record their gender identity by specifying a value for the sex or gender property. As of February 2021, Wikidata has one property to represent both the gender and the sex of a human: P21 (sex or gender).
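To make the item / property / value structure concrete, here is a minimal Python sketch that reads the sex or gender (P21) value from an item shaped like Wikidata's JSON export. The sample item is hand-written and heavily trimmed, and the helpers `claim`, `item_ids` and `gender_of_human` are hypothetical names for illustration. Humaniki itself parses the dump with Wikidata Toolkit, so this is not its actual code. The check against P31 (instance of) = Q5 (human) is one plausible way to select humans, assumed here.

```python
INSTANCE_OF = "P31"     # Wikidata property "instance of"
SEX_OR_GENDER = "P21"   # Wikidata property "sex or gender"
HUMAN = "Q5"            # Wikidata item "human"


def claim(prop, qid):
    """Build a trimmed statement in the shape used by Wikidata's JSON export."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {"entity-type": "item", "id": qid},
            },
        }
    }


def item_ids(entity, prop):
    """Return the item IDs that `entity` states as values for property `prop`."""
    ids = []
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"]["id"])
    return ids


def gender_of_human(entity):
    """Return the P21 values of a human item, or None if the item is not a human."""
    if HUMAN not in item_ids(entity, INSTANCE_OF):
        return None
    return item_ids(entity, SEX_OR_GENDER)


# Hand-written sample mirroring the table above (not fetched from Wikidata).
kamala_harris = {
    "id": "Q10853588",
    "claims": {
        INSTANCE_OF: [claim(INSTANCE_OF, HUMAN)],
        SEX_OR_GENDER: [claim(SEX_OR_GENDER, "Q6581072")],  # female
    },
}

print(gender_of_human(kamala_harris))
```

The function returns a list because an item can carry more than one P21 statement, and an empty list marks a human with no recorded gender.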

What is its history?

‘Humaniki’ is the merging of two previous Wikimedia data tools for diversity-focused editors: Wikidata Human Gender Indicators (WHGI) and Denelezh. Both of these previous projects provided statistics about the biography gender gap in Wikimedia projects, but needed extra work to make those insights actionable for editors. This WMF-grant-funded project does that work by delivering features identified through participatory co-design activities conducted with the editor community.

You can learn more about the process from our Humaniki launch blog post series.

  • Our first post outlined the project and the team introduction.
  • The second post outlined our user research approach and software development guidelines.
  • The third post outlined research findings and development plans for the minimal viable product.
  • The fourth post gave updates about our iterative design approach and delivery of final UI designs. It also provided details of the pre-production alpha launch.
  • The fifth and final blog post in this series showcased the new features covered in the alpha launch and discussed the project roadmap.

Do similar tools exist?

Several tools providing statistics about the content of Wikimedia projects exist (sorted here by year of release):

  • 2018 — WDCM Biases Dashboard, tracks the usage of Wikidata items (in how many pages each item is used in Wikimedia projects, not the number of sitelinks):
       gender gap by Wikimedia project
       gender gap by occupation
       gender gap by North-South divide

Who are the authors and what are the licenses of the images and logos used on humaniki?

Images and logos used on Humaniki have various authors and licenses. The full list is available in the Miscellaneous Credits file in the project's git repository.

What is the roadmap?

(The project is open source. You can help us build this tool by working on any of the open tasks listed below)

Humaniki Project Process Flow
  1. Alpha stage (completed)
    1. Merge capabilities of WHGI and Denelezh
    2. Customizable visualizations by enabling filter search
    3. Provide screenshot-ready visualizations, with metadata listed in the view, so that the tool's output is presentation-ready
    4. Provide data completeness information by highlighting proportions of articles about humans that don’t have gender/country/birthData info
  2. Beta stage:
    1. Internationalization to adapt software to different languages T274079
    2. Gender gap evolution trends T275332
    3. Generate occupation metrics T270046
    4. Further provide data completeness information for individual data fields in the table view e.g. articles missing gender in X country/year/wikiproject T275330
  3. Beyond:
    1. Support list-making T274085
    2. Enabling third party applications via data API T275328
      1. Anomaly detection T274088
    3. Maximizing the scope of currently collected gender gap data T275331
    4. Collecting other attributes of gender gap data (article quality, media files, others) T275339

How to use the API?

Humaniki provides a full open API that you may query. Please see GitHub for more details.