Manual:Pywikibot/pagegenerators.py

From MediaWiki.org

pagegenerators.py is a Pywikibot module that generates lists of pages for other scripts.

This module offers a wide variety of page generators. A page generator is an iterable object (see PEP 255, https://www.python.org/dev/peps/pep-0255/) that yields page objects, which other scripts can then process.
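
To make the idea concrete, the following minimal sketch shows one generator that yields page-like objects and a second one that wraps it and filters the results. `StubPage`, `stub_page_generator`, and `titles_with_prefix` are hypothetical names used only for illustration and are not part of Pywikibot; a real script would receive Pywikibot page objects instead.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class StubPage:
    """Hypothetical stand-in for a page object (not a Pywikibot class)."""
    name: str

    def title(self) -> str:
        return self.name


def stub_page_generator(names: Iterable[str]) -> Iterator[StubPage]:
    """Yield one page-like object per name, lazily."""
    for name in names:
        yield StubPage(name)


def titles_with_prefix(pages: Iterable[StubPage],
                       prefix: str) -> Iterator[StubPage]:
    """Wrap another generator and pass through only matching pages."""
    for page in pages:
        if page.title().startswith(prefix):
            yield page


gen = titles_with_prefix(
    stub_page_generator(['Help:Contents', 'Main Page', 'Help:Editing']),
    'Help:',
)
matching_titles = [page.title() for page in gen]
```

Because each generator only produces a page when the consumer asks for the next one, generators can be chained like this without building the full list in memory first.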

Command line usage

pagegenerators.py cannot be executed directly. Instead, use the listpages.py script (see Manual:Pywikibot/listpages.py).

Example:

python pwb.py listpages -search:'foobar'

This prints to standard output a list of all pages containing "foobar", as returned by MediaWiki's search engine.

See Manual:Pywikibot/listpages.py for more details.

Calls from another script

Category crawler:

import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site()
cat = pywikibot.Category(site, 'Category:Example')
pages = cat.articles()
# Preload page contents in groups of 100 instead of one request per page
for page in pagegenerators.PreloadingGenerator(pages, 100):
    # some treatment of generated pages
    print(page.title())
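
The second argument to PreloadingGenerator, 100, is the size of the groups in which pages are fetched. The sketch below illustrates only the grouping idea in plain Python. `in_batches` is a hypothetical helper, not Pywikibot's implementation, and the numbers are arbitrary.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar('T')


def in_batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError('size must be at least 1')
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return  # source exhausted
        yield batch


# 250 stand-in items split into groups of 100
batches = list(in_batches(range(250), 100))
batch_sizes = [len(batch) for batch in batches]
```

With 250 items and a group size of 100, the last group is simply shorter than the others.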

Subcategories explorer:

gen = pagegenerators.CategorizedPageGenerator(cat, recurse=True)
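
As a rough model of what recurse=True means, the sketch below walks a toy category tree and yields the articles of a category and, optionally, of its subcategories. `CATEGORY_TREE` and `walk_category` are hypothetical and exist only for this illustration; Pywikibot obtains the same information from the wiki and may order or deduplicate results differently.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

# Hypothetical toy data: category -> (articles, subcategories)
CATEGORY_TREE: Dict[str, Tuple[List[str], List[str]]] = {
    'Category:Example': (['A', 'B'], ['Category:Sub']),
    'Category:Sub': (['C'], ['Category:Example']),  # cycle back to the parent
}


def walk_category(name: str, recurse: bool = False,
                  seen: Optional[Set[str]] = None) -> Iterator[str]:
    """Yield articles in `name`, descending into subcategories if `recurse`."""
    seen = set() if seen is None else seen
    if name in seen:
        return  # already visited; avoids looping on cyclic categories
    seen.add(name)
    articles, subcategories = CATEGORY_TREE.get(name, ([], []))
    yield from articles
    if recurse:
        for sub in subcategories:
            yield from walk_category(sub, recurse=True, seen=seen)


flat = list(walk_category('Category:Example'))
deep = list(walk_category('Category:Example', recurse=True))
```

Tracking visited categories matters in this model because wiki category graphs can contain cycles.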

MySQL requests (see Manual:Pywikibot/MySQL):

gen = pagegenerators.MySQLPageGenerator(query)
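
The `query` variable is not defined in the snippet above. As a hedged sketch, assuming the standard MediaWiki `page` table and assuming the generator expects rows of page namespace and page title, a query selecting main-namespace pages could be written as follows. The limit of 10 is arbitrary, and the expected columns should be checked against the documentation of the Pywikibot version in use.

```python
# Assumption: rows are (page_namespace, page_title) pairs taken from the
# MediaWiki `page` table; verify this against your Pywikibot version.
query = """
SELECT page_namespace, page_title
FROM page
WHERE page_namespace = 0
LIMIT 10
"""

# The string would then be passed on as in the snippet above:
# gen = pagegenerators.MySQLPageGenerator(query)
```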


Unicode recommendation

The following code raises KeyError: 'query' because of the special character:

gen = pagegenerators.SearchPageGenerator(u'´', namespaces=[0])

When searching the User and MediaWiki namespaces (2 and 8) instead, the call would look like this:

gen = pagegenerators.SearchPageGenerator(u'´', namespaces=[2, 8])

Consequently, an encoding conversion is needed:

gen = pagegenerators.SearchPageGenerator("´", namespaces=[0])
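
The character ´ (U+00B4, acute accent) lies outside ASCII, so it has to be encoded before it can be sent in a request. The sketch below only illustrates what such an encoding looks like using Python 3's standard library. Whether this is the step that goes wrong in the failing call is an assumption, and the code does not show Pywikibot internals.

```python
from urllib.parse import quote, unquote

char = '\u00b4'  # ACUTE ACCENT, the character used in the search above

utf8_bytes = char.encode('utf-8')   # two bytes in UTF-8
url_encoded = quote(char)           # percent-encoded form of those bytes
round_trip = unquote(url_encoded)   # decoding restores the character
```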

See also