User:AKlapper (WMF)/Bitergia data quality queries
The data behind wikimedia.biterg.io regularly needs updates to keep our metrics reliable. The database can be queried via the Sortinghat Identities API, and edited via either that API or the web interface.
For convenience this page lists GraphQL queries and bash scripts that User:AKlapper (WMF) may occasionally run.
Find accounts which likely should have an affiliation / enrollment
- By potential email address:
query { individuals(filters:{term: "@wikimedia.org", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "@wikimedia.de", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "@wikimedia.se", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "hallowelt", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "speedandfunction", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "thisdot.co", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
- By potential username:
query { individuals(filters:{term: "(WMF)", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "-WMF", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "(WMDE)", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
query { individuals(filters:{term: "-WMDE", isEnrolled:false, isBot:false}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
- Review GitLab accounts and check whether they should be merged into existing accounts. This is very cumbersome (see phab:T306770): email addresses and/or group membership can be checked manually on https://ldap.toolforge.org/user/someusername, but that does not scale:
query { individuals(filters:{isEnrolled:false, isBot:false, source:"gitlab"}, page: 1, pageSize: 100) { pageInfo { page pageSize numPages hasNext totalResults } entities { mk profile { name email isBot } } } }
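The queries above differ only in their filters and all return at most 100 results per page. As a rough sketch (not something this page's author is known to use), they could be generated and paged through in Python. The function names are made up for illustration, and run_query is a hypothetical callable that sends a query string to the API and returns the decoded "data" object of the JSON response; the endpoint and authentication are deliberately left out.

```python
import json

# Field selection copied from the queries listed above.
QUERY_TEMPLATE = (
    "query { individuals(filters:{%s}, page: %d, pageSize: %d) "
    "{ pageInfo { page pageSize numPages hasNext totalResults } "
    "entities { mk profile { name email isBot } } } }"
)


def build_query(term=None, source=None, page=1, page_size=100):
    """Return one of the queries above as a string for the given filters."""
    filters = []
    if term is not None:
        # json.dumps yields a double-quoted, escaped string literal.
        filters.append("term: " + json.dumps(term))
    filters += ["isEnrolled:false", "isBot:false"]
    if source is not None:
        filters.append("source:" + json.dumps(source))
    return QUERY_TEMPLATE % (", ".join(filters), page, page_size)


def fetch_all(run_query, term=None, source=None, page_size=100):
    """Collect the entities from every result page.

    run_query is a hypothetical callable (an assumption, not part of
    Sortinghat): it sends the query and returns the decoded "data" object.
    """
    entities, page = [], 1
    while True:
        result = run_query(build_query(term, source, page, page_size))["individuals"]
        entities.extend(result["entities"])
        if not result["pageInfo"]["hasNext"]:
            return entities
        page += 1
```

For example, build_query(term="@wikimedia.org") reproduces the first query in the list, and fetch_all(run_query, source="gitlab") would walk through all pages of the GitLab query.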
Queries not possible due to GraphQL limitations
- To identify folks that should have an affiliation set, use hostnames of email addresses of user accounts in the Phabricator database, then re-use those usernames as a condition in a GraphQL query on the Bitergia database.
- To find duplicate Phabricator accounts which only changed their "Also Known As" (as long as phab:T305230 remains unresolved): Query for mks which share the very same name and both have source:"phabricator" but have different mks.
  - Same applies to any other source which allows renaming accounts.
- To find accounts with same email addresses to merge: Query for mks which share the very same email but have different mks.
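Since these comparisons cannot be written as a single GraphQL query, one possible workaround is to download the entities first and compare them client-side. The Python sketch below is only an illustration under assumptions: it expects entities shaped like the results of the queries on this page (mk plus profile with name and email), and accounts is a hypothetical list of (username, email) pairs exported from the Phabricator database. All function names are made up.

```python
from collections import defaultdict


def shared_values(entities, field, normalize=lambda value: value):
    """Map each profile value held by more than one mk to those mks."""
    mks_by_value = defaultdict(set)
    for entity in entities:
        value = (entity.get("profile") or {}).get(field)
        if value:  # skip profiles without a name or email
            mks_by_value[normalize(value)].add(entity["mk"])
    return {v: sorted(mks) for v, mks in mks_by_value.items() if len(mks) > 1}


def usernames_with_host(accounts, hosts):
    """Return the usernames whose email hostname is listed in hosts.

    accounts is a hypothetical iterable of (username, email) pairs,
    for example an export from the Phabricator database.
    """
    wanted = {host.lower() for host in hosts}
    return [
        username
        for username, email in accounts
        if email and "@" in email and email.rpartition("@")[2].lower() in wanted
    ]
```

For the name check, the entities would have to come from a query filtered by source:"phabricator"; shared_values(entities, "name") then lists names used by several mks. For the email check, shared_values(entities, "email", normalize=str.lower) treats addresses as case-insensitive, which is an assumption. Each username returned by usernames_with_host could then be used as a term in the queries on this page.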
Check whether detached accounts with the same mw and phab usernames are connected and should be merged
Expensive / time-intensive. See the script and DB commands.
Query all existing Phab accounts about their connected MediaWiki.org accounts
As of 2025, this is not easily possible. Neither the old script nor phab:T170091 works anymore, as we have no local database dump.
Notes on automated server-side merging / unifying and recommendations
Note that automatic affiliation assignment to an organization based on the email address only works when the email address is manually added to an identity (one single data source), but not when it is added to a profile.