API:Article ideas generator

From MediaWiki.org

Overview

This tutorial walks you through a demo of an article ideas generator app. The app suggests, from various categories, articles that don't yet exist on English Wikipedia.

Download the code from GitHub. Browse the app on Toolforge.

This tutorial will teach you how to do this using Python 3, the Flask framework, and the MediaWiki Action API's parse and links modules.

A step-by-step process for building this application:

Step 1: Set up Python and Flask development environment

To set up the Python development environment for a Flask application, you will need to install Python, create a virtual environment and install Flask.

Note: This application uses Python 3, the recommended version for new Python projects. Learn more about the differences between Python 2 and Python 3 here. To install Python 3 on your local machine, follow the step-by-step instructions in these installation guides.

Here is how to set up the development environment for building the application:

$ mkdir article-ideas-generator
$ cd article-ideas-generator/
This will create a new directory and change into it.
$ python3 --version  # prints the installed version, for example Python 3.6.5
This command checks your Python version.
$ python3 -m venv venv
This command will create a virtual environment named 'venv'
$ source venv/bin/activate
This will activate the virtual environment
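$ pip install Flask requests
This command will install the Flask package with all its dependencies, along with the requests library that the API calls in later steps use through SESSION.get().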
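
If you are unsure whether the virtual environment is actually active before you install packages, a quick check along these lines may help. This is a minimal sketch that uses only the standard library; the function name in_virtualenv is an assumption made for illustration and is not part of the tutorial's code.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Virtual environment active:", in_virtualenv())
```

Run it with python3 after the source venv/bin/activate step; if it reports that no environment is active, activate the environment again before running pip.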

Step 2: Create a simple Flask application

Render a simple static page

Place the following code in $HOME/article-ideas-generator/articles.py

#!/usr/bin/python3

"""
    articles.py

    MediaWiki Action API Code Samples

    Article ideas generator app: suggests articles from various categories
    that don't yet exist on English Wikipedia. The app uses action=parse 
    module and prop=links module as a generator.

    MIT license
"""

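from flask import Flask, render_template

# Named APP to match the @APP.route decorators used in later steps
APP = Flask(__name__)

@APP.route('/')
def index():
    """ Displays the index page accessible at '/'
    """
    # Render templates/articles.html, the template created in this step
    return render_template('articles.html')

if __name__ == '__main__':
    APP.run()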

Put this one line of code, <h1>Article ideas generator</h1>, in an HTML file inside the templates folder: $HOME/article-ideas-generator/templates/articles.html

Note: This simple application uses the render_template method, which renders the template named articles.html from the templates directory.

Next, run your Flask app with the command python articles.py and open http://127.0.0.1:5000/ to view it in the browser. You should see "Article ideas generator" in your browser window.

Style your app

Let's do some app styling. To do so, add link tags to load external and internal stylesheets. The external stylesheets, in this case, are the URLs of CSS files for two Google Fonts, Amatic SC and Josefin Sans.

Replace the existing code in $HOME/article-ideas-generator/templates/articles.html with the following:

<link rel="stylesheet" href="//fonts.googleapis.com/css?family=Amatic+SC:700">
<link rel="stylesheet" href="//fonts.googleapis.com/css?family=Josefin+Sans">
<link rel="stylesheet" href="/static/static.css">

<h1>Article ideas generator</h1>
<p>Some ideas for topics to write articles on:</p>

Place the following code in $HOME/article-ideas-generator/static/static.css

h1 {
    color: black;
    font-family: 'Amatic SC', cursive;
    font-size: 4.5em;
    font-weight: normal;
}

p {
    font-family: 'Josefin Sans', sans-serif;
    font-size: 1.4em;
}

Screenshot: Article ideas generator demo app

Application layout

$HOME/article-ideas-generator
├── templates/
│   └── articles.html
├── static/
│   └── static.css
├── articles.py
└── venv/

Step 3: Fetch page sections from Wikipedia:Requested_articles

Let's write some code in a get_page_sections() function in $HOME/article-ideas-generator/articles.py to fetch page sections from Wikipedia:Requested_articles. This function takes a page name as an argument and makes a GET request to the Action API to parse the sections of the page. The API call consists of an endpoint, https://en.wikipedia.org/w/api.php, and query string parameters. Some of the key parameters are:

  • action=parse: the module that parses content on a page
  • page=page: the title of the page to parse
  • prop=sections: which piece of information to retrieve; in this example, the page's sections

Note: For more information on the parse module visit API:Parse.
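
To see how the endpoint and the parameters above combine into a single request, the following sketch assembles the full GET URL with the standard library. It makes no network call. The helper name build_parse_url is an assumption made for illustration; the endpoint and parameter values come from this step.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_parse_url(page):
    """Return the GET URL that asks action=parse for a page's sections."""
    params = {
        "action": "parse",
        "page": page,
        "prop": "sections",
        "format": "json",
    }
    # urlencode percent-encodes reserved characters such as ':' in the title
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_parse_url("Wikipedia:Requested_articles"))
```

Pasting the printed URL into a browser is a quick way to inspect the JSON that the function in this step will receive.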

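# Requires the requests library (pip install requests)
import requests

SESSION = requests.Session()
API_ENDPOINT = 'https://en.wikipedia.org/w/api.php'

def get_page_sections(page):
    """ Get page sections
    """
    params = {
        "action": "parse",
        "page": page,
        "prop": "sections",
        "format": "json"
    }

    res = SESSION.get(url=API_ENDPOINT, params=params)
    data = res.json()

    # The API reports problems (for example, a missing page) under 'error'
    if 'error' in data:
        return []

    # Fall back to an empty list if the expected keys are absent
    parsed_sections = data.get('parse', {}).get('sections', [])
    sections = []

    # Keep only top-level sections; these serve as the categories
    for section in parsed_sections:
        if section['toclevel'] == 1:
            sections.append(section['line'])

    return sections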
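
The filtering step can be tried without a network connection. The sketch below applies the same toclevel test to a hand-written response. SAMPLE_RESPONSE is invented for illustration and trimmed to the two fields the function reads (a real response carries more fields per section), and top_level_sections is a hypothetical helper name.

```python
# Invented sample data shaped like an action=parse&prop=sections response
SAMPLE_RESPONSE = {
    "parse": {
        "title": "Wikipedia:Requested articles",
        "sections": [
            {"toclevel": 1, "line": "Arts and entertainment"},
            {"toclevel": 2, "line": "Music"},
            {"toclevel": 1, "line": "Natural sciences"},
        ],
    }
}


def top_level_sections(data):
    """Return the headings of all sections at table-of-contents level 1."""
    sections = data.get("parse", {}).get("sections", [])
    return [s["line"] for s in sections if s.get("toclevel") == 1]


if __name__ == "__main__":
    print(top_level_sections(SAMPLE_RESPONSE))
    # An error payload has no 'parse' key, so the result is an empty list
    print(top_level_sections({"error": {"code": "missingtitle"}}))
```

Nested headings such as "Music" are skipped at this stage because they have a toclevel of 2.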

Next, extend the Python Flask route / in $HOME/article-ideas-generator/articles.py to call the function defined above and also pass the results returned by the function to render_template.

@APP.route('/')
def index():
    """ Displays the index page accessible at '/'
    """
    global PAGE
    results = []

    PAGE = {'name': 'Wikipedia:Requested_articles', 'type': 'category'}
    results = get_page_sections(PAGE['name'])

    return render_template(
        "articles.html",
        results=results,
        pagetype=PAGE['type'])

Place the following Jinja template code in $HOME/article-ideas-generator/templates/articles.html. It dynamically renders a set of buttons, one for each category, using the page sections obtained via the API above.

{% if results %}
<p>Choose a {{ pagetype }}</p>
<form method="POST">
{% for pagename in results %}
<button name="{{ pagetype }}" class="{{ pagetype }}" value="{{ pagename }}">{{ pagename }}</button>
{% endfor %}
</form>
{% else %}
<p>Ooooops! We couldn't find any results.</p>
<button onclick="location.href='/'">Start over</button>
{% endif %}

Place the following code in $HOME/article-ideas-generator/static/static.css for button styling.

div {
    left: 10%;
    position: absolute;
    right: 10%;
    text-align: center;
    top: 5%;
}

button {
    background-color: #06b6c9;
    border: none;
    border-radius: 5px;
    color: white;
    font-size: 1.2em;
    margin: 5px;
    padding: 20px;
}

Screenshot: Choose a category page in the demo app

Step 4: Get more sections based on user selection

Based on the category or section the user chooses in the previous step, fetch subsections from Wikipedia:Requested_articles. Extend the Python Flask route / in $HOME/article-ideas-generator/articles.py to handle POST requests by adding both GET and POST to the methods argument list in the route decorator. You can then obtain the category selection, available in dictionary format from the request object, and pass it to the get_page_sections() function for further processing.

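# Import the request object alongside the existing Flask imports
from flask import Flask, render_template, request

# Modify the APP route to support both GET and POST requests
@APP.route('/', methods=['GET', 'POST'])

# Add these lines in the index() function, before get_page_sections() is called
if request.method == 'POST':
    # The button's name attribute ('category') is the form field key
    PAGE['name'] = PAGE['name'] + '/' + \
        request.form.to_dict()['category']
    PAGE['type'] = 'subcategory'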

Screenshot: Choose a subcategory page in the demo app, showing subcategories for the category Natural Sciences (see screenshot above)

Step 5: Collect and display articles with missing links

Let's write some code in a get_red_links() function in $HOME/article-ideas-generator/articles.py to fetch up to 20 links from a page and keep the ones that are missing. This function takes a page name as an argument, makes a GET request to the Action API, and receives the links embedded on the provided page. From these, it extracts the links whose target pages are missing, that is, pages that don't yet exist on English Wikipedia. The API call consists of an endpoint, https://en.wikipedia.org/w/api.php, and query string parameters. Some of the key parameters are:

  • action=query: the module that queries information
  • titles=title: the title of the page to collect links from
  • generator=links: the query module's links submodule, used as a generator to get the set of links embedded on a page
  • gpllimit=20: the maximum number of links to fetch

Note: For more information on the links module visit API:Links.

def get_red_links(title):
    """ Get missing links on a page
    """
    params = {
        "action": "query",
        "titles": title,
        "generator": "links",
        "gpllimit": 20,
        "format": "json"
    }

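    res = SESSION.get(url=API_ENDPOINT, params=params)
    data = res.json()

    # 'query' is absent when the page has no links or the request failed
    pages = data.get('query', {}).get('pages', {})
    links = []

    # Pages that do not exist are flagged with a 'missing' key
    for page in pages.values():
        if 'missing' in page:
            links.append(page['title'])

    return links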
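
To illustrate how the 'missing' check separates red links from existing pages, here is a small offline sketch. SAMPLE_RESPONSE is invented for illustration (the titles and the page ID are made up) and mirrors the shape of the default JSON format, in which missing pages are keyed by negative numbers and carry an empty 'missing' value. The helper name missing_titles is hypothetical.

```python
# Invented sample data shaped like an action=query&generator=links response
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "-1": {"ns": 0, "title": "Example missing article", "missing": ""},
            "12345": {"pageid": 12345, "ns": 0, "title": "Example existing page"},
            "-2": {"ns": 0, "title": "Another missing article", "missing": ""},
        }
    }
}


def missing_titles(data):
    """Return titles of linked pages that are flagged as missing."""
    pages = data.get("query", {}).get("pages", {})
    # Test for the key itself: its value is an empty string, which is falsy
    return [page["title"] for page in pages.values() if "missing" in page]


if __name__ == "__main__":
    print(missing_titles(SAMPLE_RESPONSE))
```

Because gpllimit caps the links fetched rather than the missing ones, the resulting list can contain fewer than 20 titles.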

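Next, extend the if block concerning the POST method in the / route in $HOME/article-ideas-generator/articles.py to call the get_red_links() function if the page from which the request is obtained is of type subcategory. Because the subcategory request builds on the page name stored by the previous category request, the initial PAGE assignment should run only for GET requests, for example in an else branch.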

if request.method == 'POST':
    if 'category' in request.form:
        PAGE['name'] = PAGE['name'] + '/' + request.form.to_dict()['category']
        PAGE['type'] = 'subcategory'
        results = get_page_sections(PAGE['name'])
    elif 'subcategory' in request.form:
        PAGE['name'] = PAGE['name'] + '#' + request.form.to_dict()['subcategory']
        PAGE['type'] = 'links'
        results = get_red_links(PAGE['name'])
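
The branching above updates the module-level PAGE dictionary in place. As an alternative way to reason about the three states (category, subcategory, links), the sketch below expresses the same transitions as a function that returns a new dictionary instead of mutating a global. The names ROOT_PAGE and next_page are assumptions made for illustration, and form stands for a plain dictionary such as the one returned by request.form.to_dict().

```python
ROOT_PAGE = {"name": "Wikipedia:Requested_articles", "type": "category"}


def next_page(page, form):
    """Return the page state that follows a button click.

    page is the current {'name', 'type'} dictionary; form maps the
    clicked button's name to its value.
    """
    if "category" in form:
        # Categories live on subpages, so join with '/'
        return {"name": page["name"] + "/" + form["category"],
                "type": "subcategory"}
    if "subcategory" in form:
        # Subcategories are sections of that subpage, so join with '#'
        return {"name": page["name"] + "#" + form["subcategory"],
                "type": "links"}
    # No selection submitted: start over from the root page
    return dict(ROOT_PAGE)


if __name__ == "__main__":
    step1 = next_page(ROOT_PAGE, {"category": "Natural sciences"})
    step2 = next_page(step1, {"subcategory": "Physics"})
    print(step1)
    print(step2)
```

A function like this is easy to test in isolation, and the route can then choose get_page_sections() or get_red_links() based on the returned type.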

Place the following Jinja template code in $HOME/article-ideas-generator/templates/articles.html. It dynamically renders a list of links using the data obtained via the API above.

{% if 'links' in pagetype %}
    <p>Some ideas for topics to write articles on:</p>
    {% for link in results %}
      <a href="//en.wikipedia.org/w/index.php?title={{ link|urlencode }}&action=edit&redlink=1">{{ link }}</a><br>
    {% endfor %}
    <button onclick="location.href='/'">Take me to the homepage</button>
{% endif %}

Screenshot: Missing links page in the demo app
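
Article titles can contain spaces, ampersands and other characters that have a special meaning in URLs. If you would rather build the edit links in Python than in the template, a sketch such as the following encodes the title safely. The helper name red_link_url is an assumption made for illustration; the query parameters match the ones used in the template.

```python
from urllib.parse import urlencode

INDEX_URL = "https://en.wikipedia.org/w/index.php"


def red_link_url(title):
    """Return the URL that opens the edit form for a page that does not exist."""
    # urlencode escapes characters such as '&' that would otherwise
    # be read as the start of a new query parameter
    query = urlencode({"title": title, "action": "edit", "redlink": 1})
    return INDEX_URL + "?" + query


if __name__ == "__main__":
    print(red_link_url("Foo & Bar"))
```

Without the encoding step, a title such as "Foo & Bar" would be cut off at the ampersand when the browser sends the request.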

View complete Python, CSS and HTML code.


Next steps

See also