Manual:Pywikibot/listpages.py

Print a list of pages, as defined by page generator parameters.

Optionally, it also prints page content to STDOUT or saves it to a file in the current directory.

These parameters are supported to specify which page titles to print:

-format          Defines the output format. Can be a custom string according to Python string.format notation, or can be selected by a number from the following list (1 is the default format):

1 - u'{num:4d} {page.title}' --> 10 PageTitle

2 - u'{num:4d} [[{page.title}]]' --> 10 [[PageTitle]]

3 - u'{page.title}' --> PageTitle

4 - u'[[{page.title}]]' --> [[PageTitle]]

5 - u'{num:4d} \03{{lightred}}{page.loc_title:<40}\03{{default}}' --> 10 PageTitle (colorised in lightred)

6 - u'{num:4d} {page.loc_title:<40} {page.can_title:<40}' --> 10 localised_Namespace:PageTitle canonical_Namespace:PageTitle

7 - u'{num:4d} {page.loc_title:<40} {page.trs_title:<40}' --> 10 localised_Namespace:PageTitle outputlang_Namespace:PageTitle

(*) requires "outputlang:lang" to be set.

num is the sequential number of the listed page.

-outputlang      Language for translation of namespaces.

-notitle         Page title is not printed.

-get             Page content is printed.

-save            Save page content to a file named as page.title(as_filename=True). The directory can be set with -save:dir_name. If no directory is specified, the current directory will be used.

-encode          File encoding can be specified with '-encode:name' (name must be a valid Python encoding: utf-8, etc.). If not specified, it defaults to config.textfile_encoding.

-put:            Save the list to the defined page of the wiki. By default it does not overwrite an existing page.

-overwrite       Overwrite the page if it exists. Can only be applied with -put.

-summary:        The summary text when the page is written. If it is a single word containing only letters, dashes and underscores, it is used as a translation key.

Custom format can be applied to the following items extrapolated from a page object:

site: obtained from page._link._site.

title: obtained from page._link._title.

loc_title: obtained from page._link.canonical_title.

can_title: obtained from page._link.ns_title, based either on the canonical namespace name or on the namespace name in the language specified by the -trans param; a default value '******' will be used if no namespace is found.

onsite: obtained from pywikibot.Site(outputlang, self.site.family).

trs_title: obtained from page._link.ns_title(onsite=onsite). If selected format requires trs_title, outputlang must be set.
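The format strings above use ordinary Python str.format field syntax, so their effect can be tried without a wiki. The following is a minimal sketch, assuming a stand-in object with made-up attribute values in place of the real page object; it only illustrates how the width and alignment specifiers behave.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the object exposed as "page" in a format string;
# the attribute values are invented for illustration.
page = SimpleNamespace(
    title='PageTitle',
    loc_title='Vorlage:PageTitle',
    can_title='Template:PageTitle',
)

# Predefined format 1: sequence number right-aligned in 4 columns, then the title.
fmt1 = '{num:4d} {page.title}'
line1 = fmt1.format(num=10, page=page)
print(line1)

# Predefined format 6: both titles left-aligned and padded to 40 columns each.
fmt6 = '{num:4d} {page.loc_title:<40} {page.can_title:<40}'
line6 = fmt6.format(num=10, page=page)
print(line6)
```

Because `:<40` pads on the right, format 6 produces fixed-width columns that line up across rows, at the cost of trailing spaces on every line.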

Parameters
-catfilter       Filter the page generator to only yield pages in the specified category. See -cat for argument format.

-cat             Work on all pages which are in a specific category. Argument can also be given as "-cat:categoryname" or as "-cat:categoryname|fromtitle" (using # instead of | is also allowed in this one and the following).

-catr            Like -cat, but also recursively includes pages in subcategories, sub-subcategories etc. of the given category. Argument can also be given as "-catr:categoryname" or as "-catr:categoryname|fromtitle".

-subcats         Work on all subcategories of a specific category. Argument can also be given as "-subcats:categoryname" or as "-subcats:categoryname|fromtitle".

-subcatsr        Like -subcats, but also includes sub-subcategories etc. of the given category. Argument can also be given as "-subcatsr:categoryname" or as "-subcatsr:categoryname|fromtitle".
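The "categoryname|fromtitle" argument form shared by the category options can be illustrated with a small sketch. The helper name below is hypothetical and this is not Pywikibot's own parsing code; it simply assumes the argument is split at the first | or # character.

```python
import re


def split_category_argument(value):
    """Split 'categoryname|fromtitle' or 'categoryname#fromtitle'.

    Hypothetical helper: returns (categoryname, fromtitle), where
    fromtitle is None when no start title was given.
    """
    parts = re.split(r'[|#]', value, maxsplit=1)
    name = parts[0]
    fromtitle = parts[1] if len(parts) > 1 else None
    return name, fromtitle


print(split_category_argument('Physics'))
print(split_category_argument('Physics|M'))
print(split_category_argument('Physics#M'))
```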

-uncat           Work on all pages which are not categorised.

-uncatcat        Work on all categories which are not categorised.

-uncatfiles      Work on all files which are not categorised.

-file            Read a list of pages to treat from the named text file. Page titles in the file may be either enclosed with brackets, or be separated by new lines. Argument can also be given as "-file:filename".

-filelinks       Work on all pages that use a certain image/media file. Argument can also be given as "-filelinks:filename".

-search          Work on all pages that are found in a MediaWiki search across all namespaces.

-logevents       Work on articles that were on a specified Special:Log. The value may be a comma separated list of three values:

logevent,username,total

To use the default value for a field, leave it as an empty string. The logevent value selects the type of log and can be one of the following:

block, protect, rights, delete, upload, move, import, patrol, merge, suppress, review, stable, gblblock, renameuser, globalauth, gblrights, abusefilter, newusers

By default, 10 pages are returned.

Examples:

-logevents:move gives pages from the move log (usually redirects)

-logevents:delete,,20 gives 20 pages from the deletion log

-logevents:protect,Usr gives pages from the protect log by user Usr

-logevents:patrol,Usr,20 gives 20 patrolled pages by user Usr

In some cases it must be written as -logevents:"patrol,Usr,20"
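How the three comma-separated fields and their defaults interact can be sketched as follows. The function name is hypothetical and the code is only an illustration of the rules described above (empty or missing fields fall back to defaults, with a default total of 10); it is not the parser Pywikibot uses.

```python
def parse_logevents(value, default_total=10):
    """Split a 'logevent,username,total' value into its three parts.

    Hypothetical helper: an empty or missing username becomes None,
    an empty or missing total becomes default_total.
    """
    # Pad with empty strings so missing trailing fields act like empty ones.
    fields = value.split(',') + ['', '']
    logevent, username, total = fields[:3]
    return logevent, username or None, int(total) if total else default_total


for example in ('move', 'delete,,20', 'protect,Usr', 'patrol,Usr,20'):
    print(example, '->', parse_logevents(example))
```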

-namespaces, -namespace, -ns   Filter the page generator to only yield pages in the specified namespaces. Separate multiple namespace numbers or names with commas. Examples:

-ns:0,2,4

-ns:Help,MediaWiki

If used with -newpages/-random/-randomredirect, -namespace/ns must be provided before -newpages/-random/-randomredirect. If used with -recentchanges, efficiency is improved if -namespace/ns is provided before -recentchanges.

If used with -start, -namespace/ns shall contain only one value.
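The comma-separated namespace value accepts both numbers and names. A minimal sketch of splitting such a value is shown below; the helper name is hypothetical, and resolving names such as "Help" to namespace numbers is left to the wiki and is not attempted here.

```python
def parse_namespaces(value):
    """Split '0,2,4' or 'Help,MediaWiki' into a list of ints and names.

    Hypothetical helper for illustration only.
    """
    result = []
    for item in value.split(','):
        item = item.strip()
        # Purely numeric entries become ints; anything else is kept as a name.
        result.append(int(item) if item.isdigit() else item)
    return result


print(parse_namespaces('0,2,4'))
print(parse_namespaces('Help,MediaWiki'))
```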

-interwiki       Work on the given page and all equivalent pages in other languages. This can, for example, be used to fight multi-site spamming. Attention: this will cause the bot to modify pages on several wiki sites. This is not well tested, so check your edits!

-limit:n         When used with any other argument that specifies a set of pages, work on no more than n pages in total.
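The effect of -limit on a page source can be pictured as truncating a lazy sequence. The sketch below assumes a plain Python generator of made-up titles in place of a real page generator.

```python
from itertools import islice


def limited(pages, n):
    """Yield at most n items from any iterable of pages."""
    return islice(pages, n)


# Made-up stand-in for a large page source.
titles = (f'Page {i}' for i in range(1, 1000))
first_three = list(limited(titles, 3))
print(first_three)
```

Because islice stops consuming its input after n items, the remaining 996 titles in this example are never generated.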

-links           Work on all pages that are linked from a certain page. Argument can also be given as "-links:linkingpagetitle".

-liverecentchanges Work on pages from the live recent changes feed. If used as -liverecentchanges:x, work on x recent changes.

-imagesused      Work on all images that are contained on a certain page. Argument can also be given as "-imagesused:linkingpagetitle".

-newimages       If given as -newimages:x, it will work on the x newest images. Otherwise, it asks for the number of images to work on.

-newpages        Work on the most recent new pages. If given as -newpages:x, will work on the x newest pages.

-recentchanges   Work on the pages with the most recent changes. If given as -recentchanges:x, will work on the x most recently changed pages.

-unconnectedpages Work on the most recent pages that are not connected to the Wikibase repository. If given as -unconnectedpages:x, will work on the x most recent unconnected pages.

-ref             Work on all pages that link to a certain page. Argument can also be given as "-ref:referredpagetitle".

-start           Specifies that the robot should go alphabetically through all pages on the home wiki, starting at the named page. Argument can also be given as "-start:pagetitle".

You can also include a namespace. For example, "-start:Template:!" will make the bot work on all pages in the template namespace.

The default value is "-start:!".

-prefixindex     Work on pages commencing with a common prefix.

-step:n          When used with any other argument that specifies a set of pages, only retrieve n pages at a time from the wiki server.

-subpage:n       Filters pages to only those that have depth n, i.e. a depth of 0 filters out all pages that are subpages, and a depth of 1 filters out all pages that are subpages of subpages.
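The depth rule can be sketched on plain title strings. This illustration assumes that depth is the number of "/" separators in the title and that pages up to and including depth n are kept, which matches the two examples given above; the helper names are hypothetical.

```python
def subpage_depth(title):
    """Assumed definition: depth is the number of '/' separators."""
    return title.count('/')


def filter_by_depth(titles, max_depth):
    """Keep only titles whose depth does not exceed max_depth."""
    return [t for t in titles if subpage_depth(t) <= max_depth]


titles = ['Help:A', 'Help:A/B', 'Help:A/B/C']
print(filter_by_depth(titles, 0))
print(filter_by_depth(titles, 1))
```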

-titleregex      A regular expression that needs to match the article title, otherwise the page won't be returned. Multiple -titleregex:regexpr can be provided, and the page will be returned if the title is matched by any of the regexpr provided. Case insensitive regular expressions will be used and dot matches any character.
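The matching rules for -titleregex (case insensitive, dot matches any character, any one pattern is enough) correspond to standard flags of Python's re module. The sketch below assumes search semantics, so a pattern may match anywhere in the title unless it is anchored; the function name is hypothetical.

```python
import re


def title_matches(title, patterns):
    """Return True if any pattern matches the title.

    Uses case-insensitive matching, with '.' also matching newlines.
    """
    flags = re.IGNORECASE | re.DOTALL
    return any(re.search(p, title, flags) for p in patterns)


print(title_matches('Main Page', ['^main', 'sandbox']))
print(title_matches('Help:Contents', ['^main', 'sandbox']))
```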

-transcludes     Work on all pages that use a certain template. Argument can also be given as "-transcludes:Title".

-unusedfiles     Work on all description pages of images/media files that are not used anywhere. Argument can be given as "-unusedfiles:n" where n is the maximum number of articles to work on.

-lonelypages     Work on all articles that are not linked from any other article. Argument can be given as "-lonelypages:n" where n is the maximum number of articles to work on.

-unwatched       Work on all articles that are not watched by anyone. Argument can be given as "-unwatched:n" where n is the maximum number of articles to work on.

-usercontribs    Work on all articles that were edited by a certain user. (Example: -usercontribs:DumZiBoT)

-weblink         Work on all articles that contain an external link to a given URL; may be given as "-weblink:url".

-withoutinterwiki Work on all pages that don't have interlanguage links. Argument can be given as "-withoutinterwiki:n" where n is the total to fetch.

-mysqlquery      Takes a MySQL query string like "SELECT page_namespace, page_title FROM page WHERE page_namespace = 0" and works on the resulting pages.

-wikidataquery   Takes a WikidataQuery query string like claim[31:12280] and works on the resulting pages.

-searchitem      Takes a search string and works on Wikibase pages that contain it. Argument can be given as "-searchitem:text", where text is the string to look for, or "-searchitem:lang:text", where lang is the language to search items in.

-random          Work on random pages returned by Special:Random. Can also be given as "-random:n" where n is the number of pages to be returned, otherwise the default is 10 pages.

-randomredirect  Work on random redirect pages returned by Special:RandomRedirect. Can also be given as "-randomredirect:n" where n is the number of pages to be returned, else 10 pages are returned.

-untagged        Work on image pages that don't have any license template on a site given in the format "<language>.<project>.org", e.g. "ja.wikipedia.org" or "commons.wikimedia.org". This uses an external Toolserver tool.

-google          Work on all pages that are found in a Google search. You need a Google Web API license key. Note that Google doesn't give out license keys anymore. See google_key in config.py for instructions. Argument can also be given as "-google:searchstring".

-yahoo           Work on all pages that are found in a Yahoo search. Depends on python module pYsearch. See yahoo_appid in config.py for instructions.

-page            Work on a single page. Argument can also be given as "-page:pagetitle", and supplied multiple times for multiple pages.

-grep            A regular expression that needs to match the article text, otherwise the page won't be returned. Multiple -grep:regexpr can be provided, and the page will be returned if the content is matched by any of the regexpr provided. Case insensitive regular expressions will be used and dot matches any character, including a newline.

-ql              Filter pages based on page quality. This is only applicable if contentmodel equals 'proofread-page', otherwise has no effects. Valid values are in range 0-4. Multiple values can be comma-separated.

-onlyif          A claim the page needs to contain, otherwise the item won't be returned. The format is property=value,qualifier=value. Multiple (or none) qualifiers can be passed, separated by commas. Examples: P1=Q2 (property P1 must contain value Q2), P3=Q4,P5=Q6,P6=Q7 (property P3 with value Q4 and qualifiers: P5 with value Q6 and P6 with value Q7). The value can be a page ID, a coordinate in the format latitude,longitude[,precision] (all values are in decimal degrees), a year, or a plain string. The argument can be provided multiple times, and the item page will be returned only if all of the claims are present. The argument can also be given as "-onlyif:expression".

-onlyifnot       A claim the page must not contain, otherwise the item won't be returned. For usage and examples, see -onlyif above.
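The property=value,qualifier=value layout used by -onlyif and -onlyifnot can be illustrated with a simplified sketch. The function name is hypothetical, and the code deliberately handles only the simple examples given above: it does not cope with coordinate values, which themselves contain commas.

```python
def parse_claim_expression(expression):
    """Split 'P3=Q4,P5=Q6,P6=Q7' into (property, value, qualifiers).

    Hypothetical, simplified helper: the first pair is the claim,
    all following pairs are treated as qualifiers.
    """
    pairs = [item.split('=', 1) for item in expression.split(',')]
    prop, value = pairs[0]
    qualifiers = {key: val for key, val in pairs[1:]}
    return prop, value, qualifiers


print(parse_claim_expression('P1=Q2'))
print(parse_claim_expression('P3=Q4,P5=Q6,P6=Q7'))
```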

-intersect       Work on the intersection of all the provided generators.
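What "intersection of all the provided generators" means can be shown on plain lists of titles. This is a minimal sketch with a hypothetical function name; it reads every source eagerly, whereas a real implementation may process generators lazily.

```python
def intersect(*sources):
    """Return the items present in every source, in order of the first one."""
    if not sources:
        return []
    first, *rest = sources
    other_sets = [set(source) for source in rest]
    seen = set()
    result = []
    for title in first:
        # Keep a title once, and only if all other sources contain it too.
        if title not in seen and all(title in s for s in other_sets):
            seen.add(title)
            result.append(title)
    return result


in_category = ['A', 'B', 'C']
links_to_page = ['C', 'A']
recently_changed = ['A', 'X', 'C']
print(intersect(in_category, links_to_page, recently_changed))
```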