Topic on Talk:Quarry

Jump to navigation Jump to search
Vexations (talkcontribs)

Is there an efficient way to find articles that do not have a talk page (on enwiki)? I tried

SELECT article.page_title

  FROM page article LEFT OUTER JOIN page talkpage ON article.page_title = talkpage.page_title

  WHERE talkpage.page_id IS null

LIMIT 10 

but that didn't work.

Achim55 (talkcontribs)

The page_id has the value 0 in this case. Try for example SELECT page_title FROM page art WHERE art.page_namespace = 0 AND art.page_is_redirect = 0 AND NOT EXISTS (SELECT 1 FROM page tp WHERE tp.page_title = art.page_title AND tp.page_namespace = 1) LIMIT 100 which takes just half a second of time.

Vexations (talkcontribs)

Fantastic, thanks.

Matěj Suchánek (talkcontribs)

In your original query, the problem was that the JOIN clause didn't use indexed search because you omitted the namespace in the ON clause. In other words, if you are searching for a page by name, you must always include the namespace, too. The query:

SELECT article.page_namespace, article.page_title
FROM page article LEFT OUTER JOIN page talkpage ON article.page_title = talkpage.page_title AND article.page_namespace + 1 = talkpage.page_namespace
WHERE talkpage.page_id IS NULL

should perform equally.

Vexations (talkcontribs)

Of course, that makes sense. Thank you!