Manual:Pywikibot/cosmetic changes.py


This page was moved from MetaWiki.

cosmetic_changes.py makes small changes to one or more wiki pages so that the wikitext source looks cleaner. These changes are not supposed to alter how the rendered pages look.

Before running it on a specific wiki, review each sub-module and check whether it is helpful for that wiki.

The script may be used in two ways:

  1. standalone, as cosmetic_changes.py [options]
  2. for regular use, by putting cosmetic_changes = True into your user-config.py (the recommended way)
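
For the second way, the relevant part of user-config.py is a single assignment. The fragment below is a minimal sketch; the rest of the file (family, language, user names) is assumed to be configured already and is not shown.

```python
# user-config.py (fragment)
# Apply the cosmetic changes whenever a bot saves a page.
cosmetic_changes = True
```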

This script performs the following operations:[1]

  • fix self interwiki: interwiki links to the site itself are displayed like local links, removing their language code prefix.
  • standardize page footer: makes sure that categories, star templates for featured articles and interwiki links are put in the correct position. Templates and interwiki links are sorted into the right order.
  • clean up links: improves how links are presented by doing the following:
    • replaces underscores with spaces, including runs of multiple underscores.
    • removes unnecessary leading spaces from a title.
    • removes unnecessary trailing spaces from a title.
    • converts URL-encoded characters to Unicode.
    • removes unnecessary initial and final spaces from a label.
    • tries to capitalize the first letter of the title.
  • clean up section headers: for better readability of section header source code, puts a space between the equal signs and the title; for example, "==Section title==" becomes "== Section title ==".
  • put spaces in lists: for better readability of bullet list and enumeration wiki source code, puts a space between the * or # and the text.
  • translate and capitalize namespaces: makes sure that localized namespace names are used. Does not change "image" alias on en-wiki or fr-wiki.
  • resolve HTML entities
  • valid xhtml: tries to make a valid XHTML document, for example, replacing "<br>" with "<br />"
  • remove useless spaces
  • remove non breaking space before percent: newer MediaWiki versions automatically place a non-breaking space in front of a percent sign, so it is no longer required to place it manually.
  • fix syntax: corrects MediaWiki syntax for external links
  • fix HTML: translates some HTML-entities into the corresponding MediaWiki syntax; removes unneeded <ref /> tags.
  • fix style: convert prettytable to wikitable class (de-wiki and en-wiki only).
  • fix typo: changes ccm with a leading number to cm³; changes º to ° where it denotes degrees Celsius or Fahrenheit
  • hyphenate ISBN numbers: tries to hyphenate an ISBN
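
A few of the operations listed above are simple text substitutions. The following Python sketch illustrates the idea for link titles, section headers, lists, "<br>" and ccm. It is not the script's actual code: the function names and regular expressions are assumptions made for illustration, and the real sub-modules handle many more exceptions (nowiki sections, templates, per-wiki rules).

```python
import re
from urllib.parse import unquote


def clean_link_title(title):
    """Tidy a link title (hypothetical helper, not Pywikibot's API)."""
    title = unquote(title)             # URL-encoded characters -> Unicode
    title = re.sub(r'_+', ' ', title)  # one or more underscores -> one space
    title = title.strip()              # drop leading and trailing spaces
    # Assumes a wiki where the first letter of a title is case-insensitive.
    return title[:1].upper() + title[1:]


def clean_section_headers(text):
    """Turn '==Title==' into '== Title =='."""
    return re.sub(r'^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$', r'\1 \2 \1',
                  text, flags=re.MULTILINE)


def space_lists(text):
    """Put a space between the leading * or # and the item text."""
    # Redirect lines start with '#', so they are skipped explicitly.
    return re.sub(r'^(?!#redirect)([*#]+)([^\s*#:;])', r'\1 \2',
                  text, flags=re.MULTILINE | re.IGNORECASE)


def fix_br(text):
    """Replace '<br>' with the XHTML form '<br />'."""
    return re.sub(r'<br\s*>', '<br />', text, flags=re.IGNORECASE)


def fix_ccm(text):
    """Change 'ccm' after a number to 'cm³'."""
    return re.sub(r'(\d)\s*ccm\b', r'\1 cm³', text)
```

Each helper takes and returns a plain string, so they can be chained on the wikitext of a page; for example, clean_section_headers("==Section title==") gives "== Section title ==".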

Parameters

-cat              Work on all pages which are in a specific category.
                  Argument can also be given as "-cat:categoryname" or
                  as "-cat:categoryname|fromtitle" (using # instead of |
                  is also allowed in this one and the following)

-catr             Like -cat, but also recursively includes pages in
                  subcategories, sub-subcategories etc. of the
                  given category.
                  Argument can also be given as "-catr:categoryname" or
                  as "-catr:categoryname|fromtitle".

-subcats          Work on all subcategories of a specific category.
                  Argument can also be given as "-subcats:categoryname" or
                  as "-subcats:categoryname|fromtitle".

-subcatsr         Like -subcats, but also includes sub-subcategories etc. of
                  the given category.
                  Argument can also be given as "-subcatsr:categoryname" or
                  as "-subcatsr:categoryname|fromtitle".

-uncat            Work on all pages which are not categorised.

-uncatcat         Work on all categories which are not categorised.

-uncatfiles       Work on all files which are not categorised.

-file             Read a list of pages to treat from the named text file.
                  Page titles in the file must be enclosed with [[brackets]].
                  Argument can also be given as "-file:filename".

-filelinks        Work on all pages that use a certain image/media file.
                  Argument can also be given as "-filelinks:filename".

-search           Work on all pages that are found in a MediaWiki search
                  across all namespaces.

-namespaces       Filter the page generator to only yield pages in the
-namespace        specified namespaces. Separate multiple namespace
-ns               numbers with commas. Example "-ns:0,2,4"
                  If used with -newpages, -namespace/ns must be provided
                  before -newpages.
                  If used with -recentchanges, efficiency is improved if
                  -namespace/ns is provided before -recentchanges.

-interwiki        Work on the given page and all equivalent pages in other
                  languages. This can, for example, be used to fight
                  multi-site spamming.
                  Attention: this will cause the bot to modify
                  pages on several wiki sites, this is not well tested,
                  so check your edits!

-limit:n          When used with any other argument that specifies a set
                  of pages, work on no more than n pages in total

-links            Work on all pages that are linked from a certain page.
                  Argument can also be given as "-links:linkingpagetitle".

-imagesused       Work on all images that are contained on a certain page.
                  Argument can also be given as "-imagesused:linkingpagetitle".

-newimages        Work on the 100 newest images. If given as -newimages:x,
                  will work on the x newest images.

-newpages         Work on the most recent new pages. If given as -newpages:x,
                  will work on the x newest pages.

-recentchanges    Work on the pages with the most recent changes. If
                  given as -recentchanges:x, will work on the x most recently
                  changed pages.

-ref              Work on all pages that link to a certain page.
                  Argument can also be given as "-ref:referredpagetitle".

-start            Specifies that the robot should go alphabetically through
                  all pages on the home wiki, starting at the named page.
                  Argument can also be given as "-start:pagetitle".

                  You can also include a namespace. For example,
                  "-start:Template:!" will make the bot work on all pages
                  in the template namespace.

-prefixindex      Work on pages commencing with a common prefix.

-step:n           When used with any other argument that specifies a set
                  of pages, only retrieve n pages at a time from the wiki
                  server

-titleregex       Work on titles that match the given regular expression.

-transcludes      Work on all pages that use a certain template.
                  Argument can also be given as "-transcludes:Title".

-unusedfiles      Work on all description pages of images/media files that are
                  not used anywhere.
                  Argument can be given as "-unusedfiles:n" where
                  n is the maximum number of articles to work on.

-lonelypages      Work on all articles that are not linked from any other article.
                  Argument can be given as "-lonelypages:n" where
                  n is the maximum number of articles to work on.

-unwatched        Work on all articles that are not watched by anyone.
                  Argument can be given as "-unwatched:n" where
                  n is the maximum number of articles to work on.

-usercontribs     Work on all articles that were edited by a certain user.
                  Example: -usercontribs:DumZiBoT

-weblink          Work on all articles that contain an external link to
                  a given URL; may be given as "-weblink:url"

-withoutinterwiki Work on all pages that don't have interlanguage links.
                  Argument can be given as "-withoutinterwiki:n" where
                  n is the total number of pages to fetch.

-mysqlquery       Takes a MySQL query string like
                  "SELECT page_namespace, page_title FROM page
                  WHERE page_namespace = 0" and works on the resulting pages.

-wikidataquery    Takes a WikidataQuery query string like claim[31:12280]
                  and works on the resulting pages.

-random           Work on random pages returned by [[Special:Random]].
                  Can also be given as "-random:n" where n is the number
                  of pages to be returned, otherwise the default is 10 pages.

-randomredirect   Work on random redirect pages returned by
                  [[Special:RandomRedirect]]. Can also be given as
                  "-randomredirect:n" where n is the number of pages to be
                  returned, else 10 pages are returned.

-untagged         Work on image pages that don't have any license template on a
                  site given in the format "<language>.<project>.org", e.g.
                  "ja.wikipedia.org" or "commons.wikimedia.org".
                  Uses an external Toolserver tool.

-google           Work on all pages that are found in a Google search.
                  You need a Google Web API license key. Note that Google
                  doesn't give out license keys anymore. See google_key in
                  config.py for instructions.
                  Argument can also be given as "-google:searchstring".

-yahoo            Work on all pages that are found in a Yahoo search.
                  Depends on python module pYsearch.  See yahoo_appid in
                  config.py for instructions.

-page             Work on a single page. Argument can also be given as
                  "-page:pagetitle".

-grep             A regular expression that must match the article text;
                  otherwise the page is not returned.
                  Multiple -grep:regexpr can be provided, and the page is
                  returned if its content is matched by any of the regexpr
                  provided.
                  The regular expressions are case insensitive, and the
                  dot matches any character, including a newline.

-always           Don't prompt you for each replacement. The warning (see
                  below) does not have to be confirmed. ATTENTION: Use this
                  with care!

-async            Put page on queue to be saved to wiki asynchronously.

-summary:XYZ      Set the summary message text for the edit to XYZ, bypassing
                  the predefined message texts with original and replacements
                  inserted.

-ignore:          If an error occurred, ignores it and skips either the whole
                  page or only the failing method. It can be set to 'page'
                  or 'method'.
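
The matching rule described for -grep (several patterns, case insensitive, dot matches newlines, any match is enough) can be sketched in Python as follows. This is only an illustration of the semantics stated above, not the page generator's real code; matches_grep and the sample pages are hypothetical.

```python
import re


def matches_grep(page_text, patterns):
    """Return True if any of the patterns matches somewhere in page_text."""
    # Case insensitive, and '.' also matches a newline.
    flags = re.IGNORECASE | re.DOTALL
    return any(re.search(pattern, page_text, flags) for pattern in patterns)


# Hypothetical page contents keyed by title.
pages = {
    "Ada": "{{Infobox\n| type = Person}}",
    "Foo": "A short STUB article.",
    "Bar": "Nothing relevant here.",
}
patterns = [r"infobox.*person", r"\bstub\b"]

# Keep only the titles whose text matches at least one pattern.
selected = [title for title, text in pages.items()
            if matches_grep(text, patterns)]
```

Here "Ada" is selected because the dot spans the line break between "Infobox" and "Person", "Foo" is selected despite the different letter case, and "Bar" is dropped.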

References

  1. source code, function "change"

See also