PAWS

PAWS (PAWS: A Web Shell) is a web-based interactive programming and publishing environment. It is an implementation of the popular Jupyter notebook environment for Python, shell scripts, and other programming tasks.

Why?

Literacy was not widespread in Mesopotamia. The scribes ... had to undergo training, and having completed their training and become entitled to call themselves dubsar, "scribe," they were members of a privileged elite who might look with contempt on their fellow citizens.

- C. B. F. Walker, Cuneiform

Programming is an essentially complex activity: writing code is not easy, in the same way that writing a book is not easy. However, there is a lot of accidental complexity around writing code that can and should be eliminated. Classic examples include:

  1. Installing things (dependency hell! You need a different compiler! Wrong versions!)
  2. Terminal text editors (you mean this works completely differently from everything else I have used?)
  3. ssh (try it on Windows! What is screen?)
  4. Publishing your work (so there is git, and you might hear of GitHub, but you really should use Gerrit/Gogs/Phabricator/GitLab)
  5. Deployment (I need to do what to let my friends use this?)

All of these are also particularly unfriendly to people in the Wikimedia movement who are new to programming. We're locking out a ton of really smart & resourceful people by making them jump through the equivalent of having to learn to pilot an aircraft carrier before being allowed to turn on a shower.

PAWS is an attempt to make simple programming tasks simple, and see what wonderful uses people make of it.

It follows in the philosophical footsteps of Quarry, which allowed easy access to a very specific form of programming (SQL queries). The number and kinds of people using Quarry - and the fact that most of them would not have used Tool Labs's sqlcommand[citation needed] - suggest the hypothesis that reducing barriers to entry & giving people tools to solve their own problems results in wonderful things. PAWS is an attempt to verify this hypothesis with a more general programming environment than Quarry.

What?

PAWS provides a few core features on top of which people can build.

Notebooks

PAWS provides Jupyter notebooks (previously known as IPython Notebooks).

Web-based Terminal

Useful libraries

Accessing Database Replicas With Pandas and SQLAlchemy

Pandas is a lovely high-level library for in-memory data manipulation. To get the result of an SQL query as a pandas DataFrame, use:

import os

import pandas as pd
from sqlalchemy import create_engine

# Read the credentials and host from environment variables.
constr = 'mysql+pymysql://{user}:{pwd}@{host}'.format(user=os.environ['MYSQL_USERNAME'],
                                                      pwd=os.environ['MYSQL_PASSWORD'],
                                                      host=os.environ['MYSQL_HOST'])
con = create_engine(constr)

# Run the query and load the result into a DataFrame.
df = pd.read_sql('select * from plwiki_p.logging limit 10', con)
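
One detail worth keeping in mind with a connection string like the one above: if the password contains characters such as @, / or :, pasting it directly into the URL can make the URL parse incorrectly. The following is a minimal sketch of a more defensive way to build the string, using only the standard library. The helper name build_connection_url and the sample values are made up for illustration; the environment variable names are the ones used above.

```python
from urllib.parse import quote_plus


def build_connection_url(env):
    """Build a SQLAlchemy-style MySQL URL from a mapping of settings.

    `env` is any mapping with the MYSQL_USERNAME, MYSQL_PASSWORD and
    MYSQL_HOST keys, for example os.environ.
    """
    user = quote_plus(env['MYSQL_USERNAME'])
    # Percent-encode the password so characters like '@' or '/' cannot
    # be mistaken for URL delimiters.
    pwd = quote_plus(env['MYSQL_PASSWORD'])
    host = env['MYSQL_HOST']
    return 'mysql+pymysql://{user}:{pwd}@{host}'.format(user=user, pwd=pwd, host=host)


# Hypothetical values, for illustration only.
sample = {
    'MYSQL_USERNAME': 'u1234',
    'MYSQL_PASSWORD': 'p@ss/word',
    'MYSQL_HOST': 'db.example.org',
}
url = build_connection_url(sample)
print(url)
```

In a notebook, the result could be passed to create_engine in the same way as before, for example create_engine(build_connection_url(os.environ)).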

Storage space

Publishing space

A notebook can be made public by sharing a link to it. This works because the notebook is served in read-only mode at that link. An example might be …revisions-sql.ipynb?kernel_name=python3. It is wise to add the kernel name to the link, even though it is not always necessary.

If you want to run the copy yourself, or make interactive changes, you must download the notebook and re-upload it to your own account. To download the raw format of the previous example, add format=raw to the link: …revisions-sql.ipynb?format=raw. This download-and-re-upload process is somewhat awkward.

Note that every notebook is effectively public, because its link can be guessed, so don't put any private information in it.
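
The kernel_name and format=raw link variants described above can be derived from a notebook link with a few lines of standard-library Python. This is only a sketch: the helper name with_query_param is invented here, and the notebook URL is a hypothetical placeholder that should be replaced with the real link to your notebook.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_param(url, key, value):
    """Return `url` with the query parameter `key` set to `value`."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any previous value for `key`.
    params = [(k, v)
              for k, v in parse_qsl(parts.query, keep_blank_values=True)
              if k != key]
    params.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical placeholder; substitute the real link to your notebook.
notebook = 'https://example.org/path/to/example.ipynb'

view_link = with_query_param(notebook, 'kernel_name', 'python3')
raw_link = with_query_param(notebook, 'format', 'raw')
print(view_link)
print(raw_link)
```

Replacing an existing parameter rather than appending a duplicate means the helper can be applied to a link that already carries a kernel name.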

Database access

Examples

Dumps access

Example Uses

  • Running a pywikibot bot script easily
  • Writing your own bot
  • Running an analysis of a small editathon you did
  • Doing cool things with...
    • Wikimedia MySQL database replicas
    • Wikimedia XML / JSON Dumps
    • Wikidata Query Service
    • MediaWiki API
    • Interactive charts & maps
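
As one illustration of the MediaWiki API item in the list above, here is a minimal sketch that builds an Action API request for the most recent changes on mediawiki.org. The function names, the default limit and the User-Agent string are assumptions made for this example, and the fetch function needs network access, so it is defined but not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Action API endpoint of mediawiki.org.
API_URL = 'https://www.mediawiki.org/w/api.php'


def recent_changes_url(limit=5):
    """Build an Action API URL listing the most recent changes."""
    params = {
        'action': 'query',
        'list': 'recentchanges',
        'rclimit': limit,
        'format': 'json',
    }
    return API_URL + '?' + urlencode(params)


def fetch_recent_changes(limit=5):
    """Fetch recent changes; needs network access, so it is not called here."""
    # The User-Agent value is a placeholder; describe your own script instead.
    request = Request(recent_changes_url(limit),
                      headers={'User-Agent': 'paws-example/0.1'})
    with urlopen(request) as response:
        data = json.load(response)
    return data['query']['recentchanges']


print(recent_changes_url())
```

In a notebook, calling fetch_recent_changes() should return a list of dictionaries, one per change, which can then be explored interactively.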

See also