User:Slaporte/Article quality visualization

What we Have

 * Reference Count
 * Intro paragraph
 * Paragraph count
 * Image count
 * Category count
 * reference section count
 * external link section count
 * external link count
 * article assessment
 * google web search results
 * google news search results
 * page visits per day
 * likelihood of vandalism (from Wikitrust)
 * incoming links
 * outgoing links
 * number of editors
 * recency from last edit

Areas

 * structure
 * trustworthy
 * complete
 * objective

Formula

 * reference count / paragraph count                   ENOUGH REFERENCES
 * paragraph count / google news search result   SIGNIFICANCE
 * paragraph count / google web search result      SIGNIFICANCE
 * image count / paragraph count                       ENOUGH IMAGES
 * category count                                              ENOUGH CATEGORIES
 * unique editor count                                       EDIT HISTORY
 * time since last revision                                  EDIT HISTORY
 * assessment                                                 ASSESSMENT
 * feedback                                                      FEEDBACK
 * incoming links                                              INTERCONNECTION
 * outgoing links                                               INTERCONNECTION
 * incoming links / outgoing links                      INTERCONNECTION
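
The ratio entries in the list above could be computed along these lines. This is only a sketch: the field names on the input object (referenceCount, paragraphCount and so on) and the function names are assumptions made for illustration, and returning null for a zero denominator is one possible choice, not something these notes decide.

```javascript
// Sketch only: all property and function names here are hypothetical.

// Divide safely; return null rather than Infinity/NaN when the
// denominator is zero or missing.
function ratio(numerator, denominator) {
  return denominator > 0 ? numerator / denominator : null;
}

// Compute the ratio-based signals from the Formula list.
function computeSignals(m) {
  return {
    enoughReferences: ratio(m.referenceCount, m.paragraphCount),
    significanceNews: ratio(m.paragraphCount, m.googleNewsResults),
    significanceWeb: ratio(m.paragraphCount, m.googleWebResults),
    enoughImages: ratio(m.imageCount, m.paragraphCount),
    interconnection: ratio(m.incomingLinks, m.outgoingLinks)
  };
}
```

The plain counts in the list (category count, unique editor count, time since last revision) need no arithmetic and could be passed through unchanged.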

to do
API dependency?
 * quality algorithm
 * visualization on page
 * batch processing, page history

brainstorming quality metrics
 * is the article at least a couple of paragraphs?
 * is the article as long as it is important? (e.g. is it proportional to the number of results on google for the article's subject?)
 * does the article have at least 1 picture for every n paragraphs?
 * is the article in a category?
 * does the article have at least 1 source for every n sentences?
 * are there a large number of unique editors?
 * are a good proportion of the editors users with long histories of editing articles?
 * has the article been featured?
 * are any of the paragraphs or sentences too long?
 * are there any grammar or spelling errors?
 * has the article been edited recently?
 * how many flags does the article have? (e.g. neutrality, citation needed, weasel words, etc.)
 * what are the user-created page ratings of the article?

Minimum requirements -- Y/N
//would also be good to highlight which calls to action you want to encourage
/* would apply to all non-stub article pages? */
 * One infobox: $('.infobox').length
 * One intro paragraph: $('.mw-content-ltr p').length
 * Three incoming links: API: http://en.wikipedia.org/w/api.php?action=query&format=json&list=backlinks&bltitle=Charizard&bllimit=100&blnamespace=0
 * n images: $('img').length
 * n categories: $('a[href*="/wiki/Category:"]').length
 * More than one editor: API: http://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&titles=Charizard&rvprop=user&rvlimit=500
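
The two API URLs above are hard-coded for "Charizard". A small sketch of building them for any title follows. The query parameters are copied from the URLs in these notes; the function names are hypothetical, and the response shape read by countBacklinks ({ query: { backlinks: [...] } }) is an assumption about the JSON the API returns for list=backlinks.

```javascript
// Sketch only. API_BASE and the parameters come from the notes;
// the function names are hypothetical.
var API_BASE = 'http://en.wikipedia.org/w/api.php';

// URL for the list of pages that link to `title` (main namespace only).
function backlinksUrl(title) {
  var params = new URLSearchParams({
    action: 'query',
    format: 'json',
    list: 'backlinks',
    bltitle: title,
    bllimit: '100',
    blnamespace: '0'
  });
  return API_BASE + '?' + params.toString();
}

// URL for the users behind the most recent revisions of `title`.
function revisionsUrl(title) {
  var params = new URLSearchParams({
    action: 'query',
    format: 'json',
    prop: 'revisions',
    titles: title,
    rvprop: 'user',
    rvlimit: '500'
  });
  return API_BASE + '?' + params.toString();
}

// Assumed response shape: { query: { backlinks: [ {title: ...}, ... ] } }.
function countBacklinks(response) {
  var list = response && response.query && response.query.backlinks;
  return Array.isArray(list) ? list.length : 0;
}
```

With bllimit=100 the count tops out at 100, which is enough for a "three incoming links" check but not for a full incoming-link total.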

not stub: if( !$('#siteSub').length ) return;
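
Once the counts are collected, the Y/N minimum requirements could be evaluated in one place, for example as below. This is a sketch under assumptions: the property names on counts and thresholds are invented for illustration, and the values of n for images and categories are left to the caller because these notes do not fix them.

```javascript
// Sketch only: `counts` and `thresholds` property names are hypothetical.
// counts:     { infoboxes, paragraphs, incomingLinks, images, categories, editors }
// thresholds: { images, categories }  (the "n" values, not fixed by the notes)
function checkMinimums(counts, thresholds) {
  return {
    infobox: counts.infoboxes >= 1,
    introParagraph: counts.paragraphs >= 1,
    incomingLinks: counts.incomingLinks >= 3,
    images: counts.images >= thresholds.images,
    categories: counts.categories >= thresholds.categories,
    multipleEditors: counts.editors > 1
  };
}

// Names of the requirements that failed, e.g. to drive calls to action.
function missingRequirements(result) {
  return Object.keys(result).filter(function (name) {
    return !result[name];
  });
}
```

In the browser the DOM-based counts would come from the jQuery expressions in these notes, for instance infoboxes from $('.infobox').length.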

Content
References: $('.reference').length
 * by density (blah per n paragraphs/words etc)

Does it have References and External links sections? $('#References').length $('#External_links').length
 * by diversity (does it cover all the bases)
 * proper structure, e.g. does it follow http://en.wikipedia.org/wiki/Wikipedia:Style

Edits
 * by frequency/rate of edits (# edits/day, days since last edit)
 * by "demographics" of editors (total number of editors; percentage of editors that are registered; uniqueness; editor's; quality of editors)
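
The edit-frequency and editor-demographics ideas above could be derived from a list of revisions roughly as follows. This is a sketch with labelled assumptions: it expects each revision to carry user and timestamp fields (so the revisions query would need rvprop=user|timestamp rather than rvprop=user alone), and it treats the presence of an anon key as the marker for an unregistered editor, which is an assumption about the API's JSON output. The function name is hypothetical.

```javascript
// Sketch only. Assumes revisions look like
//   { user: 'Name', timestamp: '2012-01-10T00:00:00Z' }
// with an extra `anon` key on edits by unregistered users (assumption).
function editorStats(revisions, nowMs) {
  var DAY_MS = 24 * 60 * 60 * 1000;
  var editors = new Set();
  var registered = new Set();
  var first = Infinity;
  var last = -Infinity;

  revisions.forEach(function (rev) {
    editors.add(rev.user);
    if (!('anon' in rev)) registered.add(rev.user);
    var t = Date.parse(rev.timestamp);
    if (t < first) first = t;
    if (t > last) last = t;
  });

  // Days between the oldest and newest revision in the sample.
  var spanDays = (last - first) / DAY_MS;

  return {
    editCount: revisions.length,
    uniqueEditors: editors.size,
    registeredShare: editors.size ? registered.size / editors.size : null,
    editsPerDay: spanDays > 0 ? revisions.length / spanDays : null,
    daysSinceLastEdit: revisions.length ? (nowMs - last) / DAY_MS : null
  };
}
```

The rates only describe the revisions actually fetched, so with rvlimit=500 a heavily edited article would be measured over its most recent 500 edits.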

Significance/External
 * http://stats.grok.se/json/en/200804/Main_page
 * sources/links out
 * comparison to google search (position in results, number of results:length of article)
 * Google News API (no key required): http://ajax.googleapis.com/ajax/services/search/news?v=1.0&q=SOPA
 * number of instances of the wiki page in other languages
 * is http://en.wikichecker.com/ useful? too slow?
 * http://en.wikipedia.org/w/api.php?action=query&list=articlefeedback&afpageid=9228&afuserrating=1 <-- feedback
 * how many hits

Quality Assessment - Sometimes available on the article's talk page
    /*
     * @author Outriggr - created the script and used to maintain it
     * @author Pyrospirit - currently maintains and updates the script
     */
    getRating: function getRating (text) {
        this.callHooks('getRating_before');
        var rating = 'none';
        if (text.match(/\|\s*(class|currentstatus)\s*=\s*fa\b/i)) rating = 'fa';
        else if (text.match(/\|\s*(class|currentstatus)\s*=\s*fl\b/i)) rating = 'fl';
        else if (text.match(/\|\s*class\s*=\s*a\b/i)) {
            if (text.match(/\|\s*class\s*=\s*ga\b|\|\s*currentstatus\s*=\s*(ffa\/)?ga\b/i)) rating = 'a/ga'; // A-class articles that are also GA's
            else rating = 'a';
        } else if (text.match(/\|\s*class\s*=\s*ga\b|\|\s*currentstatus\s*=\s*(ffa\/)?ga\b|\{\{\s*ga\s*\|/i)
                   && !text.match(/\|\s*currentstatus\s*=\s*dga\b/i)) rating = 'ga';
        else if (text.match(/\|\s*class\s*=\s*b\b/i)) rating = 'b';
        else if (text.match(/\|\s*class\s*=\s*bplus\b/i)) rating = 'bplus'; // used by WP Math
        else if (text.match(/\|\s*class\s*=\s*c\b/i)) rating = 'c';
        else if (text.match(/\|\s*class\s*=\s*start/i)) rating = 'start';
        else if (text.match(/\|\s*class\s*=\s*stub/i)) rating = 'stub';
        else if (text.match(/\|\s*class\s*=\s*list/i)) rating = 'list';
        else if (text.match(/\|\s*class\s*=\s*sl/i)) rating = 'sl'; // used by WP Plants
        else if (text.match(/\|\s*class\s*=\s*(dab|disambig)/i)) rating = 'dab';
        else if (text.match(/\|\s*class\s*=\s*cur(rent)?/i)) rating = 'cur';
        else if (text.match(/\|\s*class\s*=\s*future/i)) rating = 'future';
        this.callHooks('getRating_after');
        return rating;
    }
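
To feed the assessment into a quality score or a visualization, the rating string returned by getRating would need a numeric value. One possible mapping is sketched below. The ordering and the 0 to 1 scale are assumptions, not part of the quoted script or of these notes, and ratings that are not on the quality ladder (such as 'list', 'dab' or 'none') are deliberately left unscored.

```javascript
// Sketch only: this ordering and scale are hypothetical.
// Lowest to highest on the article quality ladder.
var RATING_ORDER = ['stub', 'start', 'c', 'b', 'bplus', 'ga', 'a', 'a/ga', 'fa'];

// Map a rating string to a value between 0 (stub) and 1 (fa).
// Returns null for ratings that are not on the ladder.
function ratingScore(rating) {
  var index = RATING_ORDER.indexOf(rating);
  return index === -1 ? null : index / (RATING_ORDER.length - 1);
}
```

Returning null rather than 0 keeps "not assessed" distinguishable from "assessed as a stub".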

Where the code goes
The eventual goal is to create a MediaWiki 'gadget' which users can enable at https://www.mediawiki.org/wiki/Special:Preferences#mw-prefsection-gadgets

$('.reference').length

JS Fiddle: http://jsfiddle.net/MqfAZ/

JS Fiddle for UI fiddlin': http://jsfiddle.net/eSEFq/

Template for citation: $(".ambox-Refimprove:contains('citation')").length
 * $('.ambox-Notability').length
 * $('.ambox:contains("importance")').length
 * $('.ambox:contains("advertisement")').length
 * $('.ambox:contains("cleanup")').length
 * $('.ambox:contains("confusing")').length
 * $('.ombox:contains("deletion")').length
 * $('.ambox:contains("quality standards")').length
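
The selector checks above could be gathered into a single "how many flags" count, sketched here. The selectors are copied from these notes; the flag names, the function name and the idea of passing in a counting callback are assumptions made so the logic can be tried outside the browser.

```javascript
// Sketch only: flag names and function name are hypothetical;
// the selectors are the ones listed in the notes.
var FLAG_SELECTORS = {
  citation: ".ambox-Refimprove:contains('citation')",
  notability: '.ambox-Notability',
  importance: '.ambox:contains("importance")',
  advertisement: '.ambox:contains("advertisement")',
  cleanup: '.ambox:contains("cleanup")',
  confusing: '.ambox:contains("confusing")',
  deletion: '.ombox:contains("deletion")',
  qualityStandards: '.ambox:contains("quality standards")'
};

// countFn takes a selector and returns how many elements match it.
function countFlags(countFn) {
  var found = {};
  var total = 0;
  Object.keys(FLAG_SELECTORS).forEach(function (name) {
    var n = countFn(FLAG_SELECTORS[name]);
    if (n > 0) {
      found[name] = n;
      total += n;
    }
  });
  return { total: total, found: found };
}
```

On an article page with jQuery loaded this would be called as countFlags(function (sel) { return $(sel).length; }). A single message box whose text matches several selectors is counted once per match, so the total is an upper bound on the number of boxes.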

$('.haudio').length