Topic on Talk:ORES

What changes in probabilities are significant?

Sebastian Berlin (WMSE) (talkcontribs)

As part of our work with the community wish list on SVWP, we're going to develop a gadget that gives an editor feedback using the article quality assessment. The idea is to show the quality before and after the edit. A question that has arisen is which changes to show and with what precision. Is it reasonable to show the difference for all probabilities, regardless of how small that difference is? My worry is that small changes to the probabilities may not be significant and could be misleading. Could someone help me work out which changes are useful to show in a case like this?

EpochFail (talkcontribs)

I've developed a JavaScript gadget that does something very similar to what you are planning (see https://en.wikipedia.org/wiki/User:EpochFail/ArticleQuality). I wonder whether we could modify this tool to support what you are working on.

I've been using a "weighted sum" strategy to collapse the probabilities across classes into a single value. See this paper and the following code for an overview of how it works for English Wikipedia.

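// Weight for each English Wikipedia quality class, from lowest (Stub) to highest (FA).
var WEIGHTED_CLASSES = {FA: 6, GA: 5, B: 4, C: 3, Start: 2, Stub: 1};

// Collapse the per-class probabilities of a score into a single number.
var weightedSum = function(score){
  var sum = 0;
  for(var qualityClass in score.probability){
    // Skip inherited properties. The check has to be made on score.probability,
    // not on score, otherwise every class is skipped and the sum stays 0.
    if (!score.probability.hasOwnProperty(qualityClass)) continue;
    var proba = score.probability[qualityClass];
    sum += proba * WEIGHTED_CLASSES[qualityClass];
  }
  return sum;
};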

As long as the class probabilities sum to 1, this function returns a number between 1 and 6 that represents the model's prediction projected onto a continuous scale.

Now, how big of a change matters? That's a good question and it's a hard one to answer. I think we'll learn in practice quite quickly once we have the model for svwiki.
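To make the before/after idea concrete, here is a rough sketch of how a gadget might compare two scores using the weighted sum. The threshold MIN_VISIBLE_CHANGE, the helper describeChange, and the two example scores are all made up for illustration; nothing here says what threshold is actually meaningful, which would have to be learned in practice.

```javascript
// Weights from the post above (English Wikipedia scale).
var WEIGHTED_CLASSES = {FA: 6, GA: 5, B: 4, C: 3, Start: 2, Stub: 1};

// Collapse the per-class probabilities of a score into one number.
function weightedSum(score) {
  var sum = 0;
  for (var qualityClass in WEIGHTED_CLASSES) {
    // Treat a class missing from the score as probability 0.
    sum += (score.probability[qualityClass] || 0) * WEIGHTED_CLASSES[qualityClass];
  }
  return sum;
}

// HYPOTHETICAL threshold: changes smaller than this are not shown.
// The value 0.1 is an assumption, not a tested cut-off.
var MIN_VISIBLE_CHANGE = 0.1;

// Return a short label for the change between two scores.
function describeChange(before, after) {
  var delta = weightedSum(after) - weightedSum(before);
  if (Math.abs(delta) < MIN_VISIBLE_CHANGE) {
    return "no notable change";
  }
  // One decimal of precision; positive changes get an explicit "+".
  return (delta > 0 ? "+" : "") + delta.toFixed(1);
}

// Made-up scores for the revision before and after an edit.
var before = {probability: {FA: 0, GA: 0, B: 0.05, C: 0.1, Start: 0.25, Stub: 0.6}};
var after = {probability: {FA: 0, GA: 0, B: 0.1, C: 0.2, Start: 0.4, Stub: 0.3}};

console.log(describeChange(before, after));
```

Showing one rounded number instead of six raw probabilities hides small shifts between classes, which is the point of the threshold, but the right precision and cut-off are still open questions.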

Sebastian Berlin (WMSE) (talkcontribs)

That looks very interesting. I found (part of?) your script earlier, but I haven't had time to figure out exactly what's going on there. I'll have a look and see which bits are reusable for this. I'd guess that the backend stuff (API interaction, weighting etc.) should be fairly similar.

I like the idea of just having one number to present to the user, along with the quality. From what I've understood, the quality levels aren't as evenly spaced on SVWP as on ENWP; the scale goes directly from Stub to the equivalent of B. I don't know if and how this would impact the weighting algorithm, but maybe that will become apparent once it's in use.

EpochFail (talkcontribs)

We can have non-linear jumps in the scale. E.g. {Stub: 1, B: 4, GA: 5, FA: 6}
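As a sketch of how that could look in code, the weights can be passed in as a parameter so the same function serves both wikis. The names SVWIKI_WEIGHTS and weightedSumFor are hypothetical, and the class labels are simply copied from the example above; the labels an actual svwiki model would return may differ.

```javascript
// HYPOTHETICAL weights for a scale with a gap between Stub and B,
// copied from the example in the reply above.
var SVWIKI_WEIGHTS = {Stub: 1, B: 4, GA: 5, FA: 6};

// Weighted sum over whatever classes the given weights map defines.
function weightedSumFor(score, weights) {
  var sum = 0;
  for (var qualityClass in weights) {
    if (!Object.prototype.hasOwnProperty.call(weights, qualityClass)) continue;
    // Treat a class missing from the score as probability 0.
    sum += (score.probability[qualityClass] || 0) * weights[qualityClass];
  }
  return sum;
}

// Made-up score: the model is split evenly between Stub and B.
var example = {probability: {Stub: 0.5, B: 0.5, GA: 0, FA: 0}};

console.log(weightedSumFor(example, SVWIKI_WEIGHTS)); // 0.5 * 1 + 0.5 * 4 = 2.5
```

With a gap in the scale, an article the model is unsure about can land on a value such as 2.5 that has no class of its own, so the number is best read as a position between Stub and B rather than as a class.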

Claudiamuellerbirn (talkcontribs)

Dear all. I am not sure if this thread is still active; however, I have a student working on an interface for representing the quality of Wikidata items. I would be happy to meet and talk about it. We are based in Berlin :)
