Manual:Pywikibot/Cookbook/Working with namespaces

Walking the namespaces
Does your wiki have an article about Wikidata? A Category:Wikidata? A template named Wikidata or a Lua module? This piece of code answers the question: In Page.text section we got know the properties that work without parentheses;  is of the same kind. Creating page object with the namespace number is familiar from Creating a Page object from title section. This loop goes over the namespaces available in your wiki beginning from the negative indices marking pseudo namespaces such as Media and Special.

Our documentation says that  is a kind of dictionary, however it is hard to discover. The above loop went over the numerical indices of namespaces. We may use them to discover the values by means of a little trick. Only the Python prompt shows that they have a different string representation when used with  or without it –   may not be omitted in a script. This loop: will write the namespace indices and the canonical names. The latters are usually English names, but for example the #100 namespace in Hungarian Wikipedia has a Hungarian canonical name because the English Wikipedia does not have the counterpart any more. Namespaces may vary from wiki to wiki; for Wikipedia WMF sysadmins set them in the config files, but for your private wiki you may set them yourself following the documentation. If you run the above loop, you may notice that File and Category appears as  and , respectively, so this code gives you a name that is ready to link, and will appear as normal link rather than displaying an image or inserting the page into a category.

Now we have to dig into the code to see that namespace objects have absoutely different  and   methods inside. While  writes , the object name at the prompt without   uses the   method. We have to use it explicitely:

The first few lines of the result from Hungarian Wikipedia are: Namespace(id=-2, custom_name='Média', canonical_name='Media', aliases=[], case='first-letter', content=False, nonincludable=False, subpages=False) Namespace(id=-1, custom_name='Speciális', canonical_name='Special', aliases=[], case='first-letter', content=False, nonincludable=False, subpages=False) Namespace(id=0, custom_name=, canonical_name=, aliases=[], case='first-letter', content=True, nonincludable=False, subpages=False) Namespace(id=1, custom_name='Vita', canonical_name='Talk', aliases=[], case='first-letter', content=False, nonincludable=False, subpages=True) Namespace(id=2, custom_name='Szerkesztő', canonical_name='User', aliases=[], case='first-letter', content=False, nonincludable=False, subpages=True) Namespace(id=3, custom_name='Szerkesztővita', canonical_name='User talk', aliases=['User vita'], case='first-letter', content=False, nonincludable=False, subpages=True) Namespace(id=4, custom_name='Wikipédia', canonical_name='Project', aliases=['WP', 'Wikipedia'], case='first-letter', content=False, nonincludable=False, subpages=True) So custom names mean the localized names in your language, while aliases are usually abbreviations such as WP for Wikipedia or old names for backward compatibility. Now we know what we are looking for. But how to get it properly? On top level documentation suggests to use  to get the local names. But if we try to give the canonical name to the function, it will raise errors at the main namespace. After some experiments the following form works: This will write the indices, canonical (English) names and localized names side by side. There is another way that gives nicer result, but we have to guess it from the code of namespace objects. This keeps the colons:

Determine the namespace of a page
Leaning on the above results we can determine the namespace of a page in any form. We investigate an article and a user talk page. Although the namespace object we get is told to be a dictionary in the documentation, it is quite unique, and its behaviour and even the apparent type depends on what we ask. It can be equal to an integer and to several strings at the same time. This reason of this strange personality is that the default methods are overwritten. If you want to deeply understand what happens here, open  (the underscore marks that it is not intended for public use) and look for ,   and   methods. The starred command will give  on any non-Hungarian site, but the value will be    again, if you write user talk in your language.

Any of the values may be got with the dotted syntax as shown in the last three lines.

It is common to get unknown pages from a pagegenerator or a similar iterator and it may be important to know what kind of page we got. Int this example we walk through all the pages that link to an article. will show the canonical (mostly English) names,  the local names and   (without parentheses!) the numerical index of the namespace. To improve the example we think that main, file, template, category and portal are in direct scope of readers, while the others are only important for Wikipedians, and we mark this difference.

Content pages and talk pages
Let's have a rest with a much easier exercise! Another frequent task is to switch from a content page to its talk page or vice versa. We have a method to toggle and another to decide if it is in a talk namespace: Note that pseudo namespaces (such as  and , with negative index) cannot be toggled. will always return a  object except when original page is in these namespaces. In this case it will return. So if there is any chance your page may be in a pseudo namespace, be prepared to handle errors.

The next example shows how to work with content and talk pages together. Many wikis place a template on the talk pages of living persons' biographies. This template collects the talk pages into a category. We wonder if there are talk pages without the content page having a "Category:Living persons". These pages need attention from users. The first experience is that separate listing of blue pages (articles), green pages (redirects) and red (missing) pages is useful as they need a different approach.

We walk the category (see ), get the articles and search them for the living persons' categories by means of a regex (it is specific for Hungarian Wikipedia, not important here). As the purpose is to separate pages by colour, we decide to use the old approach of getting the content (see ). Note that running on errors on purpose and preferring them to s is part of the philosophy of Python.