Topic on Talk:WMF product development process

An example of the need for quality assurance: Defective WMF harassment survey

Guy Macon (talkcontribs)

(Copied from Jimbo's talk page. My comments are below).

The mystery of the obviously incorrect "revenge porn" results of the WMF Harassment Survey has been solved on Wikipediocracy by Belgian poster Drijfzand... Basically, this survey of 3,845 Wikipedians across a range of WMF projects (45% of whom were from En-WP) generated 2,495 responses to a question asking whether they personally experienced harassment. Of these, 38% (about 948 people) said yes. (pg. 15). However, on page 17, in what is purported to be a breakdown of the forms of harassment experienced by these editors, an astounding 61% (about 578 people) are said to have claimed to be victims of "revenge porn." This, to anyone who ponders the number for more than 6 seconds, appears patently absurd — bearing in mind that the survey respondents were about 88% male and that the great majority of Wikipedians maintain some degree of anonymity.

Drijfzand observed that the number of responses for doxxing, revenge porn, hacking, impersonation, and threats of violence all fell within a range of 5% — which she or he said "simply can't happen." I theorized that the problem was a software glitch and Drijfzand identified the problem as a set of defective sliders in the survey form which refused to accept a value of 0, a bug identified by Burninthruthesky on November 3 and which was apparently remedied on November 4. LINK. Unfortunately, the survey was not launched on En-WP until Day 5 (to allow more responses from smaller Wikis so as to reduce the weight of the large projects, see pg. 2), meaning that bad data was generated on some projects for nearly a week. Whereas the survey should have been aborted and restarted, it apparently was not, and so the data presented on page 17 (and any conclusions derived therefrom) is a case of Garbage-In-Garbage-Out.

Once again: a failure to adequately beta-test software is evident. There is one saving grace, and that is we have a very good snapshot of the magnitude of the gender gap based on survey respondents (a ratio of 88:12 for those who indicated a gender, with some 7% of survey participants declining to respond). Assuming that women are over-represented in the "decline to respond" group, this means we are probably in the ballpark of 86:14 or 85:15. There is also, for the first time ever as far as I am aware, a decent survey of age of Wikipedians. Your takeaway numbers: 35% of respondents (and presumably Wikipedians in general) are age 45 or over; only 24% are under the age of 25. All the fresh faces, many on travel grants, at Wikimania are deceiving: it appears that the median age of Wikipedians is right around 31 years old, give or take. So the expenditure on the harassment survey wasn't a total loss even if it failed at its intended mission (at least in part) due to bad software (leaving aside the very real question of sketchy survey design).

--Originally posted by User:Carrite at User talk:Jimbo Wales on the English Wikipedia at 19:50, 3 February 2016 (UTC), reposted here by Guy Macon
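
For anyone who wants to check the arithmetic in the copied post, here is a minimal Python sketch. The response counts and percentages are the ones quoted in the post. The function name `overall_male_share` and the 40% and 55% female shares among non-responders are hypothetical values chosen only for illustration; the survey does not report them.

```python
# Figures quoted in the copied post.
responses = 2495            # answers to the "experienced harassment?" question
harassed_share = 0.38       # share answering yes (pg. 15)
revenge_porn_share = 0.61   # share of those said to report "revenge porn" (pg. 17)

harassed = round(responses * harassed_share)
revenge_porn = round(harassed * revenge_porn_share)
print(harassed, revenge_porn)


def overall_male_share(male_among_stated, declined, female_among_declined):
    """Male share of all respondents.

    female_among_declined is an ASSUMPTION about people who gave no gender;
    the survey itself cannot tell us this number.
    """
    stated = 1.0 - declined
    male_stated = stated * male_among_stated
    male_declined = declined * (1.0 - female_among_declined)
    return male_stated + male_declined


# Hypothetical female shares among the 7% who declined to state a gender.
for assumed_female in (0.40, 0.55):
    male = overall_male_share(0.88, 0.07, assumed_female)
    print(f"assumed {assumed_female:.0%} women among decliners -> "
          f"{round(male * 100)}:{100 - round(male * 100)}")
```

With these inputs the sketch reproduces the post's figures of about 948 and 578 people, and the 86:14 and 85:15 ballparks correspond to assuming roughly 40% and 55% women among those who declined to respond.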

-------------------------------------------------------------------------------

I would like to discuss the comment "Once again: a failure to adequately beta-test software is evident" in the above.

The WMF has many roles, and one of those roles is "software developer". One of our ongoing problems is the low quality of the software we develop. I would note that this is almost certainly not the fault of the individual developers or the managers one or two levels above them, but rather an institutional problem that flows down from top management. I would also note that top management almost certainly do not realize that they are the root cause of our low quality software.

The Wikipedia community has some extremely skilled project managers and software developers, but we have no way of helping the WMF to address this problem. I have personally tried every avenue that anyone suggested to try to get a technical proposal considered (details available on request, but right now I am addressing the larger problem), but have been stonewalled. There should have been a way for me to get an answer, even if the answer was "no".

I would very much like to be able to report in a few months time that this has been solved and that the lines of communication are opening. Let's talk about how to make that happen. --~~~~

Qgil-WMF (talkcontribs)

@Guy Macon, testing software is a crucial activity, but I wonder which specific recommendations from this message we can extract for the product development process being discussed here. The survey mentioned was handled with Qualtrics, which is a product that we have not developed.

Guy Macon (talkcontribs)

Are you implying that using an external product relieves the WMF of the responsibility of doing some basic functionality testing of the software that handles a survey before releasing said software upon the Wikipedia community?

I would also note that the actual question I asked ("The Wikipedia community has some extremely skilled project managers and software developers, but we have no way of helping the WMF to address this problem. .. I would very much like to be able to report in a few months time that this has been solved and that the lines of communication are opening. Let's talk about how to make that happen.") has, as is our tradition, gone completely unanswered.

Whatamidoing (WMF) (talkcontribs)

I think that everyone agrees that QA testing is good. However:

  1. The software that was being used was not written by or supported by the WMF,
  2. the software itself was functioning correctly,
  3. the survey isn't a "product" and isn't being "developed", and
  4. the survey was not produced by or associated with any Product team.

Consequently, it's unclear to me why you think the Product department should address this, or how this error should affect the development of software products by the WMF Product department (=the subject of this page).

Or, to put it in terms that may be more familiar to Wikipedia editors: you have posted your comments at the {{wrong venue}}. (If you need help finding the right venue, then leave a note on my talk page.)
