ORES/Newcomerquality

Purpose
ORES can already predict the quality of single edits and articles; this project aims to extend that capability to sessions of multiple related edits. Being able to predict session quality paves the way for potential future tools, such as automatically detecting promising new editors or edit wars on pages. This idea is not new: since 2014, Snuggle has been trying to detect new editors who may have been bitten by vandal-fighters, but its infrastructure relies on pre-ORES technology and is not easily generalizable. Continuing that stream of work with ORES, we plan to first research newcomer retention.

Labelling Campaigns

 * enwiki campaign w:en:Wikipedia:Labels/Newcomer_session_quality

Research Journal

 * November 16 2018 - Training with 30 features, including revert-related ideas from the qualitative research such as edit war detection. Using precision at k=300 as the metric. Results show that Logistic Regression and Gradient Boosting do well. However, performance is slightly diminished when I subset to non-singletons only. So I want to label 100 more non-singleton edits and then finally pick the model.
 * https://github.com/notconfusing/newcomerquality/commit/db6f8811ab8ed029a73faefb10e3fc06ff0355d8
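
For readers unfamiliar with the metric mentioned in the entry above, precision at k can be sketched as follows. This is a minimal illustration and not the code from the linked commit: the function name `precision_at_k`, the label encoding (1 for a good-faith session, 0 otherwise), and the toy data are all assumptions made for the example.

```python
from typing import Sequence


def precision_at_k(y_true: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Fraction of the k highest-scored items whose true label is 1.

    Hypothetical helper for illustration; labels are assumed to be 0/1.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    k = min(k, len(scores))  # cannot take more items than we have
    if k == 0:
        return 0.0
    # Rank item indices from highest to lowest model score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(y_true[i] for i in top_k) / k


# Toy data: six sessions with made-up labels and model scores.
labels = [1, 0, 1, 1, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(precision_at_k(labels, model_scores, k=3))
```

With k=300, the same idea asks how many of the 300 sessions the model ranks highest are actually labelled positive, which is a reasonable fit when only the top of the ranking would be surfaced to reviewers.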