Manual:Pywikibot/pagegenerators.py

From MediaWiki.org

pagegenerators.py is a Pywikibot module used to generate lists of pages for other scripts.

This module offers a wide variety of page generators. A page generator is an iterable object (see PEP 255, https://www.python.org/dev/peps/pep-0255/) that yields page objects, which other scripts can then use.
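To make the idea of an iterable that yields page objects concrete, here is a minimal sketch in plain Python. It does not use Pywikibot at all: FakePage, all_pages and main_namespace_only are hypothetical names invented for this illustration, and real page objects carry far more than a title and a namespace number.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakePage:
    """Hypothetical stand-in for a page object (not a Pywikibot class)."""
    title: str
    namespace: int = 0


def all_pages() -> Iterator[FakePage]:
    """A generator function: each `yield` hands one page to the caller."""
    yield FakePage("Main Page")
    yield FakePage("User:Example", namespace=2)
    yield FakePage("Sandbox")


def main_namespace_only(pages: Iterable[FakePage]) -> Iterator[FakePage]:
    """Wrap another generator and pass through only namespace-0 pages."""
    for page in pages:
        if page.namespace == 0:
            yield page


# Generators compose: the filter pulls pages from the source one at a time.
titles = [page.title for page in main_namespace_only(all_pages())]
print(titles)
```

Because each generator only produces the next page when asked, generators can be chained (source, then filter, then consumer) without building the full list in memory first.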

Command-line usage

pagegenerators.py cannot be executed directly. Instead, use the listpages.py script.

Example:

python pwb.py listpages -search:'foobar'

This prints to standard output a list of all pages containing "foobar", as returned by MediaWiki's search engine.

See listpages.py for more details.

Calls from another script

Category crawler:

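import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site()
cat = pywikibot.Category(site, 'Category:Example')
pages = cat.articles()
# Preload page contents in groups of 100 to reduce the number of requests
for page in pagegenerators.PreloadingGenerator(pages, 100):
    # some treatment of generated pages, e.g. print each title
    print(page.title())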
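In the crawler above, the value 100 passed to PreloadingGenerator is understood here to be a group size, that is, how many pages are fetched together. The sketch below illustrates only the general batching idea in plain Python; in_batches is a hypothetical helper written for this example and is not how Pywikibot itself is implemented.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def in_batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items from `items`."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:  # source exhausted
            return
        yield batch


# Hypothetical page titles standing in for page objects.
titles = (f"Page {n}" for n in range(1, 8))
batches = list(in_batches(titles, 3))
print([len(batch) for batch in batches])
```

Grouping requests this way trades a little latency before the first page arrives for far fewer round trips overall, which is presumably why preloading is recommended when every page's content will be read.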

Subcategories explorer:

gen = pagegenerators.CategorizedPageGenerator(cat, recurse=True)
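What recurse=True means can be sketched with an in-memory category tree. Everything here is an assumption made for illustration: TREE and pages_in are hypothetical, no wiki is contacted, and Pywikibot's own traversal may differ in order and in how it handles duplicates.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

# Hypothetical category tree: name -> (member pages, subcategories).
TREE: Dict[str, Tuple[List[str], List[str]]] = {
    "Category:Example": (["A", "B"], ["Category:Sub"]),
    "Category:Sub": (["C"], ["Category:Deep"]),
    "Category:Deep": (["D", "A"], []),
}


def pages_in(category: str, recurse: bool = False,
             seen: Optional[Set[str]] = None) -> Iterator[str]:
    """Yield page titles in `category`, optionally descending into subcategories."""
    seen = set() if seen is None else seen
    if category in seen:  # guard against category loops
        return
    seen.add(category)
    pages, subcats = TREE[category]
    yield from pages
    if recurse:
        for sub in subcats:
            yield from pages_in(sub, recurse=True, seen=seen)


flat = list(pages_in("Category:Example"))
deep = list(pages_in("Category:Example", recurse=True))
print(flat, deep)
```

In this sketch a page that belongs to several categories (here "A") is yielded once per category, so a caller that needs unique pages would have to deduplicate the results itself.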

MySQL queries (see Manual:Pywikibot/MySQL):

gen = pagegenerators.MySQLPageGenerator(query)


Unicode recommendation

The following call raises KeyError: 'query' because of the special character:

gen = pagegenerators.SearchPageGenerator(u'´', namespaces=[0])

When searching in the User and MediaWiki namespaces, the call would look like this:

gen = pagegenerators.SearchPageGenerator(u'´', namespaces=[2, 8])

Consequently, an encoding conversion is needed:

gen = pagegenerators.SearchPageGenerator("´", namespaces=[0])

See also