New Developers/Quarterly/2017-10

Overview
Goal, possible candidates, scope, etc.

Key findings

 * Finding 1
 * Finding 2
 * Finding 3

New developers metrics & trends (July - September 2017)

 * 1) In the previous 12 months (Jul 2016 - Jun 2017)
 * 2) Contributions
 * 3) New developers we attracted
 * 4) X new authors contributed to Y repositories
 * 5) New developers active one year after their first contribution
 * 6) X
 * 7) In the last quarter (July - September 2017)
 * 8) Contributions
 * 9) Number of new developers we attracted
 * 10) X new authors contributed to Y repositories
 * 11) Number of new developers who actively contributed
 * 12) * More than one changeset / contribs, etc.
 * 13) Number of commits we received and merged from new developers
 * 14) Project that received most contributions from new developers
 * 15) To which projects new developers mostly contribute:
 * 16) * mediawiki/extensions/examples X
 * 17) * mediawiki/core Y
 * 18) * apps/android/wikipedia Z
 * 19) What do we infer from this pattern? Anything?
 * 20) Documentation resources page views
 * 21) Number of developers who landed on our documentation pages targeted at newcomers (Developer_hub, How_to_contribute, New_Developers). Referral paths (page views)
 * 22) For pages like: how to contribute, new developers, how to become a MediaWiki hacker, etc.
 * 23) Developer outreach programs and events
 * 24) Number of new developers onboarded and retained at:
 * 25) Wikimedia Hackathon 2017 / Onboarded and retained
 * 26) Wikimania Hackathon 2017 / Onboarded
 * 27) Could consider tallying the email addresses and matching them with the ones in the newcomers spreadsheet
 * 28) Outreachy Round 13 (Dec 2016 - Mar 2017) / Google Summer of Code 2016 / Onboarded and retained
 * 29) Commit pattern during events and programs
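
The idea of tallying event email addresses against the newcomers spreadsheet could be sketched as below. This is only an illustration under assumptions: the function name `match_emails` and the sample addresses are hypothetical, and both inputs are assumed to be plain lists of address strings already exported from the event registration and the spreadsheet.

```python
def match_emails(event_emails, spreadsheet_emails):
    """Return the sorted addresses that appear in both lists.

    Addresses are compared case-insensitively, ignoring surrounding spaces.
    """
    def normalize(address):
        return address.strip().lower()

    known = {normalize(a) for a in spreadsheet_emails if a.strip()}
    attended = {normalize(a) for a in event_emails if a.strip()}
    return sorted(attended & known)


if __name__ == "__main__":
    # Hypothetical sample data, not real participants.
    event = ["Alice@example.org", " bob@example.org", "carol@example.org"]
    sheet = ["alice@example.org", "dave@example.org", "BOB@example.org "]
    print(match_emails(event, sheet))
```

The length of the returned list would give the "onboarded" tally for an event, assuming participants used the same address in both places.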

Survey
Brief..

Doubts
(Questions for Andre)

Time to get a Bitergia overview with a specific focus on topics mentioned below:

General About specific metrics Note to self: Andre asks:
 * For the new developers, how do we filter out the contributions made to third party repositories (as in T146135#3176718)?
 * Go to C_Gerrit_Demo and copy the second block from T146135 into the search bar. The results will not change much, though; this is just a safety measure. Andre should probably check the entries in that manual blacklist again.
 * Pulling email addresses: could Srishti learn this process as well? Or should Srishti email Andre 2-3 times in September to ask for more email addresses?
 * Andre to ask Bitergia a week before September ends for updated and complete quarter data, as the data on C_Gerrit_Demo is not yet automatically updated.
 * Number of commits we received and merged from new developers in the last quarter: how do we pull the commits that new developers proposed in Gerrit and the ones that got merged (landed in git)?
 * There is no way to do this on C_Gerrit_Demo. The only workaround is taking all the names of new authors from the "new Authors" list on C_Gerrit_Demo (CSV export) and constructing a query for the search field on https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit by entering author_name:"foo" OR author_name:"bar". Then hover your mouse pointer over the "Status" circle in the middle and take the "Count" numbers for NEW, ABANDONED, MERGED from there.
 * Number of new developers who actively contributed in the last quarter - (More than one commitset / patchset / contribs) etc. What is “contribs” in the "New Authors" widget on the C_Gerrit_Demo page?
 * contribs = changesets in Gerrit. Not patchsets within one changeset.
 * New developers active 1 year after their first contribution - How to calculate this?
 * By following the complicated steps on https://phabricator.wikimedia.org/T160430#3383647
 * Figure out a way to present stats / numbers
 * Under "Developer outreach programs and events > Number of new developers onboarded and retained at: > Wikimedia Hackathon 2017 / Onboarded and retained", how is 'retained' defined exactly? Check for any activity between 20170701 and 20170930 after Hackathon in 201705?
 * Yes, check for any activity at all between now and the hackathon (20170701 and 20170930)
 * Andre asks: All links above include any new developers, including staff, as the phrasing does not imply 'volunteers'. Should I change those queries to only include author_org_name:"Independent" OR author_org_name:"Unknown"?
 * For a moment, I was about to say that this makes total sense to me and that we should exclude the staffers. But I think that if there are staffers who are new to the Wikimedia code contribution process and contribute in their volunteer time only, it would be good to include them in this study as well, unless you have a different opinion.
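
The workaround for counting proposed versus merged commits (building an author_name query from the "new Authors" CSV export) could be scripted roughly as follows. This is a minimal sketch, not an existing tool: the CSV header `author_name` and the function name `build_author_query` are assumptions, so adjust them to whatever the export actually contains. The backslash escaping of quotes is also an assumption about how the search field treats quoted phrases.

```python
import csv
import io


def build_author_query(csv_text, column="author_name"):
    """Return a query string such as: author_name:"foo" OR author_name:"bar".

    `column` is an assumed CSV header name; check the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    names = []
    for row in reader:
        name = (row.get(column) or "").strip()
        # Skip blanks and duplicates while keeping the original order.
        if name and name not in names:
            names.append(name)
    # Escape backslashes and double quotes so each name stays one quoted phrase.
    quoted = [
        'author_name:"%s"' % n.replace("\\", "\\\\").replace('"', '\\"')
        for n in names
    ]
    return " OR ".join(quoted)


if __name__ == "__main__":
    # Hypothetical export content.
    sample = "author_name,contribs\nfoo,3\nbar,1\nfoo,2\n"
    print(build_author_query(sample))
```

The resulting string can be pasted into the search field of the Gerrit dashboard before reading the NEW, ABANDONED and MERGED counts.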
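
The definition of 'retained' discussed above (any activity at all between 20170701 and 20170930 by someone onboarded at the hackathon) can be expressed as a small check. This is a sketch under assumptions: the function name `retained_authors`, the input shapes, and the sample names are hypothetical, and the contribution dates are assumed to have been pulled from the dashboard beforehand.

```python
from datetime import date


def retained_authors(onboarded, contributions,
                     start=date(2017, 7, 1), end=date(2017, 9, 30)):
    """Return onboarded authors with at least one contribution in [start, end].

    `contributions` is an iterable of (author_name, contribution_date) pairs.
    """
    active = {author for author, day in contributions if start <= day <= end}
    return sorted(set(onboarded) & active)


if __name__ == "__main__":
    # Hypothetical sample data, not real contributors.
    onboarded = ["foo", "bar", "baz"]
    contributions = [
        ("foo", date(2017, 5, 20)),   # during the hackathon only
        ("bar", date(2017, 8, 3)),    # inside the quarter
        ("qux", date(2017, 9, 1)),    # active, but not onboarded at the event
    ]
    print(retained_authors(onboarded, contributions))
```

Both boundary dates are treated as inclusive here; that choice is an assumption and can be changed if the quarter is counted differently.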