Topic on Talk:ORES review tool

Cogaidh (talkcontribs)

This tool seems to have a bias against IP editors, and it often marks undamaging edits as damaging. Many vandalizing edits are also unmarked. Is anyone else seeing this problem?

He7d3r (talkcontribs)

Could you clarify in which wiki you tested? (ORES uses a different model for each wiki)

Cogaidh (talkcontribs)

I'm using enwiki. It only seems to apply to IP addresses with the occasional registered user. However, I've been doing recent changes patrol for a few weeks and have only seen a handful of bluelinked users.

Cogaidh (talkcontribs)

@He7d3r, it's on enwiki. It's mostly IPs and redlinked users. I've only seen a few actual devoted users and even then it's usually a mistake.

He7d3r (talkcontribs)

@Halfak: do you know anything about this?

Halfak (WMF) (talkcontribs)

Hey He7d3r and UNSC Luke 1021. We know that this is a problem. It turns out that newcomers and IPs are the only ones who regularly vandalize, so the prediction is strongly weighted in that way. We addressed a substantial part of the problem in some past work. See m:Research_talk:Automated_classification_of_edit_quality/Work_log/2016-04-14 and https://www.youtube.com/watch?v=rsFmqYxtt9w#t=28m10s for a relevant talk that I gave on the subject.

Essentially the status is this: We minimized the bias using a new modeling strategy, but really what we need to do now is find new sources of signal for the prediction. I have two promising new strategies that I'm working to get implemented (PCFGs and Feature hashing). We have some operational issues with getting those deployed.

If you can share some examples of false positives/negatives and what you think is unusual/unfair about them, I can look into those examples specifically to make sure it's a known issue and not something weird.

Cogaidh (talkcontribs)

Ok, I'll go through recent edits later and show you.

Cogaidh (talkcontribs)

Send some examples

Cogaidh (talkcontribs)

@Halfak (WMF): I have an example: check diff (it's the edit on the right that was marked as damaging)

Halfak (WMF) (talkcontribs)

It looks like this one is right on the threshold of "damaging", but it does look like it will be reverted.

https://ores.wikimedia.org/v2/scores/enwiki/?revids=766011830&models=damaging%7Creverted

      "damaging": {
        "scores": {
          "766011830": {
            "prediction": false,
            "probability": {
              "false": 0.5047296304050086,
              "true": 0.4952703695949914
            }
          }
        }
      },
      "reverted": {
        "scores": {
          "766011830": {
            "prediction": true,
            "probability": {
              "false": 0.2585393975879935,
              "true": 0.7414606024120065
            }
          }
        }
      }

OK. My hypothesis is that ORES is being weird about the excessively long comment. Let's experiment with changing that.

https://ores.wikimedia.org/v2/scores/enwiki/damaging/766011830?datasource.revision.comment=%22/*%20Crime%20*/%20Proper%20comma%20placing!%22

      "damaging": {
        "scores": {
          "766011830": {
            "prediction": false,
            "probability": {
              "false": 0.5047296304050086,
              "true": 0.4952703695949914
            }
          }
        }
      }

Hmm. That had no effect at all. OK let's check what would happen if this edit was saved by an editor who had registered a week ago.

https://ores.wikimedia.org/v2/scores/enwiki/damaging/766011830?feature.temporal.revision.user.seconds_since_registration=604800&feature.revision.user.is_anon=false

      "damaging": {
        "scores": {
          "766011830": {
            "prediction": false,
            "probability": {
              "false": 0.6003384342419845,
              "true": 0.3996615657580155
            }
          }
        }
      }

Yeah. That makes a big difference.
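To put a number on that difference, here is a minimal Python sketch that reads the "damaging" probability out of the two fragments quoted above and subtracts them. It assumes the responses have been parsed into dicts shaped exactly like those fragments; the helper name `damaging_probability` is made up for illustration and is not part of ORES.

```python
REV_ID = "766011830"

# Fragments copied from the two responses above (original edit vs. the
# same edit scored as if saved by a user registered one week ago).
baseline = {"damaging": {"scores": {REV_ID: {
    "prediction": False,
    "probability": {"false": 0.5047296304050086, "true": 0.4952703695949914},
}}}}
injected = {"damaging": {"scores": {REV_ID: {
    "prediction": False,
    "probability": {"false": 0.6003384342419845, "true": 0.3996615657580155},
}}}}


def damaging_probability(fragment, rev_id):
    """Hypothetical helper: return P(damaging) for one revision."""
    return fragment["damaging"]["scores"][rev_id]["probability"]["true"]


shift = damaging_probability(baseline, REV_ID) - damaging_probability(injected, REV_ID)
print(f"P(damaging) drops by {shift:.3f} when the editor is treated as registered")
```

That is a drop of roughly ten percentage points from changing only the two user features.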

OK so here's what I think is going on. ORES' setting for catching most of the vandalism and other damage flags edits "for review" because they *might* be damaging. It's designed to help reviewers look at as few edits as possible while still catching all of the damage. It seems that small edits (remove 2 ","s and add "as") performed by IP editors need review in order to make sure we catch all of the damage, even though they often are just fine. I think this is what ORES is signalling, and that it's not bad, though it definitely could be better. In order to be able to differentiate good edits like this from bad ones, we need some natural language signal for the model. I'm hopeful our work with PCFGs will reduce the need to review these types of edits.

I know this doesn't really solve the problem for you now, but I hope it helps you understand why these false positives happen, and hopefully it gave you a bit of insight into how ORES works and how you can experiment with its predictions. I often make use of this "feature injection" pattern to try to figure out the "reason" behind the problematic judgements that ORES makes.
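For anyone who wants to repeat the "feature injection" experiment with other values, the following Python sketch builds the same kind of URL as the ones used above. It only assembles the string and does not send a request. The function name `injection_url` is an illustrative assumption; the path layout and the two `feature.*` parameter names are copied from the URLs in this thread.

```python
from urllib.parse import urlencode

# Base path taken from the URLs quoted earlier in this thread.
ORES_V2 = "https://ores.wikimedia.org/v2/scores"


def injection_url(wiki, model, rev_id, overrides):
    """Hypothetical helper: build a scoring URL with overridden feature values."""
    return f"{ORES_V2}/{wiki}/{model}/{rev_id}?{urlencode(overrides)}"


# Pretend the edit was saved by a user who registered one week ago.
one_week = 7 * 24 * 60 * 60  # 604800 seconds
url = injection_url("enwiki", "damaging", 766011830, {
    "feature.temporal.revision.user.seconds_since_registration": one_week,
    "feature.revision.user.is_anon": "false",
})
print(url)
```

Changing one override at a time, as in the comment-length and registration experiments above, makes it easier to see which feature is moving the score.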

Reply to "IP Bias"