Manual:Pywikibot/Cookbook/Working with your watchlist

We have a watchlist.py among scripts which deals with the watchlist of the bot. This does not sound too exciting. But if you have several thousand pages on your watchlist, handling it by bot may sound well. To do this you have to run the bot with your own username rather than that of the bot. Either you overwrite it in user-config.py or save the commands in a script and run python pwb.py -user:MyUserAccount myscript

Loading your watchlist
The first tool we need is. This is a pagegenerator, so you may process pages with a for loop or change it to a list. When you edit your watchlist on Wikipedia, talk pages will not appear. This is not the case for ! It will double the number by listing content pages as well as talk pages. You may want to cut it.

Print the number of watched pages: This may take a while as it goes over all the pages. For me it is 18235. Hmm, it's odd, in both meaning. :-) How can a doubled number be odd? This reveals a technical error: decades ago I watched  which was a redirect to a project page, but technically a page in article  namespace back then, having the talk page  . Meanwhile   was turned into an alias for Wikipedia namespace, getting a new talk page, and the old one remained there stuck and abandoned, causing this oddity.

If you don't have such a problem, then will do both tasks: turn the generator into a real list and throw away talk pages as dealing with them separately is senseless. Note that  takes a   argument if you want only the first , but we want to process the whole watchlist now. You should get half of the previous number if everything goes well. For me it raises an error because of the above stuck talk page. (See T331005.) I show it because it is rare and interesting: pywikibot.exceptions.InvalidTitleError: The (non-)talk page of 'Vita:WP:AZ' is a valid title in another namespace. So I have to write a loop for the same purpose: Well, the previous one looked nicer. For the number I get 9109 which is not exactly  to the previous one, but I won't bother it for now.

A basic statistics
In one way or other, we have at last a list with our watched pages. The first task is to create a statistics. I wonder how many pages are on my list by namespace. I wish I had the data in sqlite, but I don't. So a possible solution:

There is an other way if we steal the technics from  (discussed in Useful methods section). We yust generate the namespace numbers, and create of them a collections.Counter object: This is a subclass of dictionaries so may be used as a dict. The difference compared to the previous is that a  sorts items by the decreasing number automatically.

Selecting anon pages and unwatch according to a pattern
The above statistics shows that almost the half of my watchlist consists of user pages because I patrol recent changes, welcome people and warn if neccessary. And it is neccessary often. Now I focus on anons: I could use the own method of a User instance to determine if they are anons without importing, but for that I would have to convert pages to Users: Anyway,  shows that they are over 2000.

IPv4 addresses starting with  belong to schools in Hungary. Earlier most of them was static, but nowadays they are dynamic, and may belong to other school every time, so there is no point in keeping them. For unwatching I will use : With  in last line I will also see a   for each successful unwatching. Without it only unwatches. This loop will be slower than the previous. For repeated run it will write these s again because watchlist is cached. To avoid this and refresh watchlist use  which will always reload it.

Watching and unwatching a list of pages
By this time we delt with pages one by one with  which is a method of the   object. But if we look into the code, we may discover that this method uses a method of : Even more exciting, this method can handle complete lists at once, and even better the list items may be strings – this means you don't have to create Page objects of them, just provide titles. Furthermore it supports other sequence types like a generator function, so pagegenerators may be used directly.
 * https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.site.html#pywikibot.site._apisite.APISite.watch

To watch a lot of pages if you have the titles, just do this: To unwatch a lot of pages if you already have Page objects: For use of pagegenerators see the second example under.

Further ideas for existing pages
With Pywikibot you may watch or unwatch any quantity of pages easily if you can create a list or generator for them. Let your brain storm! Some patterns: API:Watch shows that MediaWiki API may have further parameters such as expiry and builtin pagegenerators. At the time of writing this article Pywikibot does not support them yet. Please hold on.
 * Pages in a category
 * Subpages of a page
 * Pages following a title pattern
 * Pages got from logs
 * Pages created by a user
 * Pages from a list page
 * Pages often visited by a returning vandal whose known socks are in a category
 * Pages based on Wikidata queries

Red pages
Non-existing pages differ from existing in we have to know the exact titles in advance to watch them.

Watch the yearly death articles in English Wikipedia for next decade so that you see when they are created:

hu:Wikipédia:Érdekességek has "Did you know?" subpages by the two hundreds. It has other subpages, and you want to watch all these tables until 10000, half of what is blue and half red. So follow the name pattern:

While English Wikipedia tends to list existing articles, in other Wikipedias list articles are to show all the relevant titles either blue or red. So the example is from Hungarian Wikipedia. Let's suppose you are interested in history of Umayyad rulers. hu:Omajjád uralkodók listája lists them but the articles of Córdoba branch are not yet written. You want to watch all of them and know when a new article is created. You notice that portals are linked from the page, but you want to watch only the articles, so you use a wrapper generator to filter the links.

List of ancient Greek rulers differs from the previous: many year numbers are linked which are not to be watched. You exclude them by title pattern. Or just to watch the red pages in the list:

In the first two examples we used standalone pages in a loop, then a pagegenerator, then lists. They all work.

Summary

 * For walking your watchlist use  generator function. Don't forget to use the   global parameter if the   contains your bot's name.
 * For watching and unwatching a single page use  and.
 * For watching and unwatching several pages at once, given them as list of titles, list of page objects or a pagegenerator use.