Manual:Pywikibot/Cookbook/Introduction

From mediawiki.org

Creating a script

Encoding and environment
All Python 3 source files MUST[1] be UTF-8 without a BOM. It is therefore a good idea to forget the plain Windows Notepad for good, because it has a habit of polluting files with a BOM. The minimum suggested editor is Notepad++, which was developed for programming purposes. It has an Encoding menu that shows what this is about, and it lets you set UTF-8 without BOM as the default encoding. Any real programming IDE will do the job properly; Visual Studio Code, for example, is quite popular nowadays. Python also comes with an integrated editor called IDLE, which uses the proper encoding and can show line numbers.
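A quick way to check whether an editor has left a BOM in a file is to look at its first three bytes. The following is a minimal sketch using only the standard library; the helper names has_utf8_bom and strip_utf8_bom are made up for this illustration and are not part of Pywikibot.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, 'rb') as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file without a leading BOM; return True if one was removed."""
    data = Path(path).read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    Path(path).write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling has_utf8_bom('myscript.py') on a suspicious file (the file name is only an example) tells you whether the editor needs to be reconfigured.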
Where to put your scripts
The scripts/userscripts directory is designed to host your scripts. This is a good arrangement, because the directory is left untouched when you update Pywikibot, and you can easily back up your own work by saving just this directory.
You may also create your own directory structure. If you would like to use something other than the default, search for user_script_paths in user-config.py, and you will find the solution there.
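As a sketch, such an entry in user-config.py might look like the line below. The folder name scripts.myscripts is only an example, taken from the sample comment in Pywikibot's own config.py; check the comments in your version for the exact rules (as far as I know, paths are relative to the base directory and subfolders are delimited by dots).

```python
# In user-config.py: extra places where pwb.py looks for your scripts.
# 'scripts.myscripts' is an example name, not a required one.
user_script_paths = ['scripts.myscripts']
```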
See also [wrapper script].

Running a script

There are basically two ways. The recommended one is to call your script through pwb.py. Your prompt should be in the Pywikibot root directory, where pwb.py is located; then use:

python pwb.py <global options> <name_of_script> <options>

However, if you don't need the features of pwb.py, especially if you don't use global options and don't want pwb.py to handle command line arguments, you are free to run the script directly from the userscripts directory.

Coding style

Of course, we have PEP 8, Python Coding conventions and Pywikibot/Development/Guidelines. But sometimes we feel like just hacking together a small piece of code for ourselves without bothering about style.

Often a small piece of temporary code grows beyond our initial expectations, and then we have to clean it up.

Do what you want, but if you take my advice: in my experience it is always worth coding for yourself as if you were coding for the world.

On the other hand, when you use Pywikibot interactively (see below), it is normal to be lazy and use abbreviations and aliases. For example:

>>> import pywikibot as py
>>> import pywikibot.pagegenerators as pg

Note that the py alias cannot be used in the second import. It will be useful later, e.g. for py.Site().
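The reason is that an import statement looks up real module names, not the names defined in your session. The same behaviour can be observed with a standard library module, used here as a stand-in so that the sketch runs without Pywikibot:

```python
import os as o          # 'o' is just a local name bound to the os module

try:
    import o.path       # the import system searches for a module called 'o'
except ModuleNotFoundError as error:
    message = str(error)
    print(message)

import os.path as p     # the full module name is required here
print(p is o.path)      # both names refer to the same module object
```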

However, for better readability we won't use these abbreviations in this cookbook.

Beginning and ending

In most cases you see something like this in the very first line of Pywikibot scripts:

#!/usr/bin/python

or

#!/usr/bin/env python3

This is a shebang. If you use a Unix-like system, you know what it is for. If you run your scripts on Windows, you may just omit this line, as it is not needed there. But it can be a good idea to include it anyway, in case others want to use your script someday.

The very last two lines of the scripts also follow a pattern. They usually look like this:

if __name__ == '__main__':
    main()

This is good practice in Python. When you run the script directly from the command line (that's what we call directory mode), the condition is true, and the main() function is called. That's where you handle arguments and start the process. On the other hand, if you import the script (that is the library mode), the condition evaluates to false, and main() is not called (only the lines at the top level of your script are executed). Thus you may directly call the function or method you need.
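A minimal sketch of this layout follows. The function name greeting and the file name hello.py are invented for the illustration; only the guard around main() comes from the pattern above.

```python
#!/usr/bin/env python3
"""Hypothetical script, saved for example as hello.py."""
import sys


def greeting(name):
    """Reusable part: can be called after 'import hello'."""
    return f'Hello, {name}!'


def main(args=None):
    """Handle the command line arguments and start the process."""
    if args is None:
        args = sys.argv[1:]
    name = args[0] if args else 'world'
    print(greeting(name))


if __name__ == '__main__':
    main()
```

Run from the command line, the script prints a greeting. After import hello, nothing is printed, and hello.greeting() can be called on its own.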

You may see a practical example in the Follow your bot section.

Scripting vs interactive use

For proper work we use scripts. But there is an interesting way of creating a sandbox. Just go to your Pywikibot root directory (where pwb.py is), type:

pwb.py shell

This is a short way to invoke the Python shell and import pywikibot at once. Now you can continue with Pywikibot like this:

>>> site = pywikibot.Site()

Now you are in the world of Pywikibot (if user-config.py is properly set). This is great for trying things out, experimenting, and even for small and quick tasks. For example, to change several occurrences of Pywikipedia to Pywikibot on an outdated community page, just type:

>>> page = pywikibot.Page(site, 'titlecomeshere')
>>> page.text = page.text.replace('Pywikipedia', 'Pywikibot')
>>> page.save('Pywikibot forever!')

Throughout this document the >>> prompt indicates that we are in the interactive shell. You are encouraged to play with this toy. Where this prompt is not present, the code lines have to be saved into a Python source file. Of course, when you use save(), the change goes live on your wiki, so be careful. You may also set the test wiki as your site to avoid problems.

A big advantage of the shell is that you may omit the print() function. In most cases

page.title()

is equivalent to

print(page.title())

The Walking the namespaces section shows a rare exception where these are not equivalent, and we can take advantage of the difference to understand what happens.
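The background is that the shell echoes the repr() of an expression's value, while print() uses str(). For many objects the two look much the same, but not for all. A sketch with a made-up class (Demo is not a Pywikibot class) shows the mechanism:

```python
class Demo:
    """Hypothetical class whose two text forms differ on purpose."""

    def __repr__(self):
        return "Demo('for the shell')"

    def __str__(self):
        return 'for print()'


d = Demo()
shell_echo = repr(d)   # what typing 'd' at the >>> prompt would display
printed = str(d)       # what print(d) would display
print(shell_echo)
print(printed)
```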

Documentation and help

We have three levels of documentation. As you go forward into understanding Pywikibot, you will become more and more familiar with these levels.

  1. Manual:Pywikibot – written by humans for humans. This is recommended for beginners. It also has a "Get help" box.
  2. pywikibot – mostly autogenerated technical documentation with all the fine details you are looking for. Click on stable if you use the latest deployed stable version of Pywikibot (this is recommended unless you want to develop the framework itself), and on master if you use the current version that is still under development. Differences are usually small.
  3. The code itself. It is useful if you don't find something in the documentation or you want to find working solutions and good practices. You may reach it from the above docs (most classes and methods have a source link) or from your computer.
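
To reach the code on your own computer, Python's inspect module can tell you where a module or function is defined. The sketch below uses the standard library's json module as a stand-in so that it runs anywhere; with Pywikibot installed, the same calls should work on pywikibot or on one of its classes.

```python
import inspect
import json  # stand-in; replace with pywikibot in a configured environment

# Path of the source file that defines the module
source_file = inspect.getsourcefile(json)
print(source_file)

# First line of the source code of one function
first_line = inspect.getsource(json.dumps).splitlines()[0]
print(first_line)
```

In the interactive shell, help(json.dumps) prints the docstring of the same function.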

Notes

  1. The fully capitalised MUST has a special meaning in specifications and programming style guides; see RFC 2119.