Manual:Pywikibot/Workshop Materials/How to host a bot on Toolforge (self-study)

This self-study guide will walk you through the process of getting access to Toolforge, setting it up for Pywikibot, and running a bot on Toolforge manually and automatically. You will learn all of this using existing materials available on different wikis.

How to use this guide?
The best way to learn using this guide is to follow it as described here:

 1. Read the entire guide without clicking any links or trying to understand all of the material. This should give you a high-level overview of what is necessary to run your bots on Toolforge.
 2. Read the guide again, section by section. For each section:
    * Read the entire section first, without clicking any links.
    * Read the section again, this time opening linked pages in new tabs and going through their content. Focus on what is relevant in the context of the section of this guide you arrived from. Follow the instructions outlined on the linked pages.
    * If you need extra information about the subject covered in the section, check the list of additional resources at the end of the section.
    * If any part of this guide is unclear, leave a comment on the discussion page (TODO) so that we can improve it, or edit this guide yourself.

If you want to teach others what you learned in this guide, or need a more extensive description of the steps covered here, use the materials for workshop organizers available on User:KBach-WMF/Sandbox/SWT Workshop Materials/How to host a bot on Toolforge.

Prerequisites
Before following this self-study guide, make sure that you understand how to run Pywikibot scripts locally, on your own machine. This is covered in other Small Wiki Toolkits workshops, in Manual:Pywikibot, and in other self-study guides (TODO).

It will also help if you have a basic understanding of the Linux terminal, Bash, and SSH. The list below provides useful resources if you want to learn more about any of these subjects.


 * wikibooks:Non-Programmer's_Tutorial_for_Python_3 (less relevant to this guide, but useful background for other guides)
 * Python (less relevant to this guide, but useful background for other guides)
 * Linux_Guide/Using_the_shell
 * Bash_Shell_Scripting
 * Internet_Technologies/SSH

What is Toolforge?


Toolforge is a shared hosting platform supported by Wikimedia Foundation staff and volunteers. It provides users with a Linux machine they can use, for example, to run bots or host a website. Toolforge is extensively documented on wikitech:Portal:Toolforge. Use this portal if you have any questions or want to learn more than this guide covers.

To use Toolforge, you must agree to its terms and conditions. For details, see wikitech:Portal:Toolforge/Quickstart.

Creating a developer account and setting up access to Toolforge

To use Toolforge for your bot, you need the following:


 * Wikimedia developer account (this is different from your Wikipedia or SUL account)
 * Toolforge membership
 * Tool account on Toolforge
 * SSH key you will use to log in to Toolforge

To fulfill these requirements, follow the Getting started with Toolforge - Quickstart guide.

If you need additional clarification on any of the steps in this process, check the list of additional resources below.

When creating the account, pay special attention to the UNIX shell username. That is the username you will use when logging in to Toolforge.

Resources:


 * Generating a new SSH key
 * Instead of adding the SSH key to your developer account on Wikitech, you can do that directly in the admin console: https://toolsadmin.wikimedia.org/profile/settings/ssh-keys/

Logging in to Toolforge


To log in to Toolforge on Linux or macOS, use the ssh command in your terminal, for example ssh <shell-username>@login.toolforge.org, replacing <shell-username> with your UNIX shell username. On Windows, you can try the same command in the command line, PowerShell, or Git Bash (you might already have SSH installed), or follow the instructions on wikitech:Help:Access_to_Toolforge_instances_with_PuTTY_and_WinSCP.

It is not possible to log in to Toolforge using your developer account login and password. You can only log in to Toolforge using your UNIX shell username and SSH key. The only password you will need is the password for that key.

Resources:


 * Internet_Technologies/SSH

Setting up Pywikibot
To set up your tool account to run Pywikibot, follow the instructions on wikitech:Help:Toolforge/Pywikibot. Note that if you intend to use Toolforge to run a script that comes with Pywikibot, you should follow the instructions for installing Pywikibot from git. If you plan to run Pywikibot with a custom script you wrote yourself, you can install Pywikibot from PyPI.

Accessing your code on Toolforge
If you intend to run one of the built-in Pywikibot scripts, you can skip this section.

If you have a bot that you wrote yourself, you will want to make it available to the tool's account, for example in its home directory. There are multiple ways to do this.


 * You can open a new file in a terminal text editor such as nano or vim (in insert mode), and copy the contents of your script file directly into the editor in your terminal. You can usually do this using standard copy and paste key combinations (Ctrl+C and Ctrl+V on Windows and Linux, and Command+C and Command+V on macOS), or combinations supported by your terminal application (for example Ctrl+Shift+C and Ctrl+Shift+V or Command+Shift+C and Command+Shift+V). Some terminals also allow you to right-click and choose Copy or Paste from the context menu. After pasting your code, remember to save the file in the text editor.
 * You can use the scp command to copy a file through SSH. See Internet_Technologies/SSH for information on how to do this. You can also use FileZilla as described in wikitech:Help:Access_to_Toolforge_instances_with_PuTTY_and_WinSCP.
 * You can commit your code to a Git repository, for example on GitLab, and then run git clone after switching to your tool account (using the become command).

Running your bot
To run Pywikibot on Toolforge, use the jobs framework. This method ensures that your bot runs in a dedicated environment with sufficient resources, instead of running directly on the Toolforge login machine you accessed using SSH. This is especially important when your bot performs many operations on multiple pages. Running such a bot directly in the shell could slow down the login.toolforge.org machine and make it unusable for others.

To run your bot, either immediately in the background or automatically according to a schedule, follow the instructions in wikitech:Help:Toolforge/Pywikibot.

When running the bot automatically using a schedule, note that the --schedule option uses the crontab format to specify when the bot should run. Crontab is commonly used to schedule tasks on UNIX-based operating systems, and many online resources describe it. One such resource is https://crontab.guru, but typing "crontab run every 5 minutes" in a search engine should also return plenty of useful results.
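As a rough illustration of the crontab format, the sketch below lists a few schedules, each made of five space-separated fields. The variable names and the count_fields helper are invented for this example. The commented-out command at the end assumes a job named my-bot, a script named ./mybot.sh, and an image named python3.11; none of these come from this guide, so check wikitech:Help:Toolforge/Jobs_framework for the values that apply to your tool.

```shell
# A crontab schedule has five space-separated fields:
#   minute  hour  day-of-month  month  day-of-week
every_5_minutes='*/5 * * * *'     # every 5 minutes
daily_at_0330='30 3 * * *'        # every day at 03:30
mondays_at_noon='0 12 * * 1'      # every Monday at 12:00

# Hypothetical helper: print how many fields a schedule contains.
count_fields() {
    # Use a subshell with globbing disabled so that * is not expanded to file names.
    ( set -f; set -- $1; echo "$#" )
}

count_fields "$every_5_minutes"   # prints 5

# On Toolforge, the schedule string would be passed to the jobs framework, for example
# (job name, command, and image are assumptions):
#   toolforge jobs run my-bot --command ./mybot.sh --image python3.11 --schedule "*/5 * * * *"
```

Quoting the schedule matters: without the quotes, the shell would expand each * into the names of the files in the current directory.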

Note that the command you run using the jobs framework does not need to call python or pwb directly. If your bot is more complicated, for example if it requires several preparatory commands to run in sequence, you can wrap all the commands in a shell script and run that script instead.
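A wrapper script of this kind might look like the sketch below, which writes the script to a file from the terminal, makes it executable, and checks its syntax without running it. The file name run_bot.sh, the $HOME/mybot directory, the virtual environment path, and the two Python file names are all hypothetical; adjust them to match how you set up your tool account.

```shell
# Write a hypothetical wrapper script to the current directory.
# The quoted 'EOF' keeps $HOME from being expanded until the script actually runs.
cat > run_bot.sh <<'EOF'
#!/bin/bash
# Stop at the first failing command, including failures inside pipelines.
set -eo pipefail

# Hypothetical location of the bot's code and its Python virtual environment.
cd "$HOME/mybot"
source "$HOME/mybot/venv/bin/activate"

# Preparatory step first, then the bot itself (both file names are assumptions).
python3 prepare_data.py
python3 mybot.py
EOF

# Make the script executable so that it can be used as the job's command.
chmod +x run_bot.sh

# Check the script for syntax errors without executing it.
bash -n run_bot.sh && echo "syntax ok"
```

Because the script stops at the first failing command, the bot does not start if the preparatory step fails.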

Resources:


 * wikitech:Help:Toolforge/Jobs_framework
 * wikipedia:Cron