User:Fajne Farita

From mediawiki.org

A freelance researcher and journalist. Interested in learning how language shapes thought and gives away lies.


Fajne's modest attempts to improve Wikipedia as part of the ORES team include:

  1. An investigation into mislabeled revisions in the ORES training set (Damaging/Badfaith). It turned out that 48 of 100 edits were mislabeled. Why?
  2. Active learning for ORES. Useful or nah?


Separately, Fajne worked with Wikipedia data while studying at the Information School, Berkeley. A few things from that dark period:

  1. An attempt to build a pipeline of ML classifiers that would predict whether a Wikipedia article will be edited in the next 24 hours. Gigabytes of data, tons of features, feature importance analysis, and 76% accuracy on a balanced set, with even worse results in real life. Here you go: Looking For a Needle in the Haystack.
  2. Can you tell a vandal edit from a regular one by how its grammar feels? Probabilistic context-free grammar (PCFG) features were supposed to help. As Fajne's work shows, grammar features rank from 12th to 72nd in feature importance in both the Damaging and Goodfaith models, which suggests they can play an important role in classification but need more training data to become truly useful.
  3. In 2006, the Wikipedia editor community underwent an accidental social experiment: the Wikimedia Foundation introduced its first automated anti-vandalism tools. The experiment had different outcomes for the English and Russian language communities. It is well known that after the new algorithm started flagging newly joined editors for vandalism, often mistakenly, the English Wikipedia saw a sizable decline in its editor population. It is not widely known that the Russian Wikipedia was the only one among the wikis that did not experience that problem. Do Russian pro bono editors differ from English-speaking ones? Is their motivation stronger? Does the Russian wiki community provide better support for newbies? Is the new editors' high survival rate caused by the social organizational structure of the Russian Wikipedia? Try to find the answers in What Does It Take to Be Pro Bono?
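
A side note on the first item in this list: one common reason a score measured on a balanced set looks worse in practice is class imbalance. The sketch below is only an illustration, not the analysis from the project itself. It assumes the 76% accuracy splits evenly between the two classes (true positive rate = true negative rate = 0.76), and the real-world prevalence values are hypothetical.

```python
def precision_at_prevalence(tpr: float, tnr: float, prevalence: float) -> float:
    """Share of positive predictions that are correct at a given base rate."""
    true_pos = tpr * prevalence                    # correctly flagged positives
    false_pos = (1.0 - tnr) * (1.0 - prevalence)   # negatives flagged by mistake
    return true_pos / (true_pos + false_pos)


# Assumption: the 76% balanced-set accuracy is spread evenly over both classes.
TPR = TNR = 0.76

# 0.5 corresponds to the balanced set; the other base rates are hypothetical.
for prevalence in (0.5, 0.2, 0.05):
    precision = precision_at_prevalence(TPR, TNR, prevalence)
    print(f"prevalence={prevalence:.2f}  precision={precision:.3f}")
```

Under these assumptions, precision equals 0.76 on the balanced set but falls to roughly one in seven when only 5% of articles are actually edited within the window, so most "will be edited" predictions would be wrong even though the classifier itself has not changed.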