Topic on Talk:Quarry

Articles written by a single editor: How to improve query speed?

Syced (talk | contribs)

In my quest to find all articles that have only been edited by one human editor, I wrote this query:

USE enwiki_p;
SELECT page_title FROM (
  SELECT p.page_title, r.rev_actor, a.actor_name
  FROM (
    SELECT page_title, page_len, page_id
    FROM page
    WHERE page_namespace = 0 # Mainspace
      AND NOT page_is_redirect
  ) AS p # All mainspace pages
  LEFT JOIN revision_userindex r ON r.rev_page = p.page_id
  LEFT JOIN actor a ON r.rev_actor = a.actor_id
  WHERE NOT IS_IPV4(a.actor_name) # Ignore IP editors and bots
    AND NOT IS_IPV6(a.actor_name)
    AND LOWER(a.actor_name) NOT LIKE '%bot%'
    AND LOWER(a.actor_name) NOT LIKE '%script%'
) AS pra
GROUP BY page_title
HAVING COUNT(DISTINCT rev_actor) < 2; # Only 1 distinct editor (not just 1 revision)

Problem: It times out.
Question: How to make it run faster?

For instance, to improve speed I have thought about skipping pages that have more than 50 revisions, but I am not sure how to implement it.
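
One possible way to express that 50-revision cap, as a minimal sketch rather than a tested Quarry query: group the revisions by page once and put both conditions in HAVING. The example runs against a toy SQLite database from Python so it can be checked locally. The table and column names mirror the ones used in the query above, but the sample rows and the helper names build_toy_db and single_editor_pages are hypothetical. It also uses actor_user IS NOT NULL in place of IS_IPV4/IS_IPV6, on the assumption that IP actors have no user ID.

```python
import sqlite3

# Toy stand-ins for the replica tables; only the columns the query needs.
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_user INTEGER,
                    actor_name TEXT);
CREATE TABLE revision_userindex (rev_id INTEGER PRIMARY KEY,
                                 rev_page INTEGER, rev_actor INTEGER);
"""

# One aggregation pass: cap total revisions at 50 and require exactly one
# distinct human actor (IPs and bot/script-like names are ignored in the count).
QUERY = """
SELECT p.page_title
FROM page p
JOIN revision_userindex r ON r.rev_page = p.page_id
JOIN actor a ON a.actor_id = r.rev_actor
WHERE p.page_namespace = 0
  AND p.page_is_redirect = 0
GROUP BY p.page_id, p.page_title
HAVING COUNT(*) <= 50
   AND COUNT(DISTINCT CASE
         WHEN a.actor_user IS NOT NULL   -- assumption: NULL means IP editor
          AND LOWER(a.actor_name) NOT LIKE '%bot%'
          AND LOWER(a.actor_name) NOT LIKE '%script%'
         THEN r.rev_actor END) = 1
ORDER BY p.page_title
"""


def build_toy_db():
    """Create an in-memory database with a few made-up pages and revisions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", [
        (1, 0, "Solo", 0),           # one human editor, two revisions
        (2, 0, "Shared", 0),         # two human editors
        (3, 0, "Busy", 0),           # one human editor, but 51 revisions
        (4, 0, "Solo_plus_bot", 0),  # one human, one bot, one IP
        (5, 1, "Talk_page", 0),      # not mainspace
        (6, 0, "Redir", 1),          # redirect
        (7, 0, "Ip_only", 0),        # no human editor at all
    ])
    conn.executemany("INSERT INTO actor VALUES (?, ?, ?)", [
        (1, 101, "Alice"),
        (2, 102, "Bob"),
        (3, 103, "ExampleBot"),
        (4, None, "192.0.2.1"),
    ])
    revisions = (
        [(1, 1), (1, 1), (2, 1), (2, 2)]
        + [(3, 1)] * 51
        + [(4, 2), (4, 3), (4, 4), (5, 1), (6, 1), (7, 4)]
    )
    conn.executemany(
        "INSERT INTO revision_userindex (rev_page, rev_actor) VALUES (?, ?)",
        revisions,
    )
    return conn


def single_editor_pages(conn):
    """Return titles of mainspace pages with exactly one human editor."""
    return [row[0] for row in conn.execute(QUERY)]


if __name__ == "__main__":
    print(single_editor_pages(build_toy_db()))  # → ['Solo', 'Solo_plus_bot']
```

Because HAVING is evaluated after grouping, the database still has to count each page's revisions before it can discard the page, so any speed-up on the real replicas would need to be measured rather than assumed.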

Saeidpourbabak (talk | contribs)

Syced (talk | contribs)

Thanks, yours is indeed faster :-)