Talk:Quarry

Jump to navigation Jump to search

About this board

Discussion area for discussion about Quarry itself and help about individual queries.

Automating table of Overlong Stub Categories on WP:Stub Sorting

5
Furicorn (talkcontribs)

Hi, I'm looking for help to build 2 queries that can generate tables that can meet the criteria of the following two tables https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Stub_sorting/To_do#Oversized_stub_categories

  1. The first table lists all stub cats with 6 or more listings pages (with # of pages)
  2. The second table lists all stub cats with 5 listings pages (with # of pages)

I hope this could be a useful tool for directing stub diffusion work. Eventually, I'd like to take the generated queries and try to build a script that can regularly run the query (monthly or weekly).

HenriqueCrang (talkcontribs)
Furicorn (talkcontribs)

Thanks, I really appreciate this work! I think I worded my request a little confusingly - the query column called "cat_pages" definitely fulfills the original table's " Articles" column. However, I was trying to say something slightly different with "number of pages," but I don't have a better word, sorry. Hopefully I can do a better job communicating this, and furthermore it is something doable.

By number of pages, I'm referring to how a category with 400 articles might have 2 pages (only word I know to describe this, sorry!) in the category, but a category with 500 might have 3 pages (~200 articles/entries per page). I don't know if this information is easily retrievable, but I'm hoping it is.

So the final query would ideally have 3 columns

  1. Category (cat_title, but output w/o underscores if this is easy)
  2. Pages (# of pages in the sense described in paragraph 2, and a net new column)
  3. Articles (currently titled cat_pages)|

Thanks again for any help.

Saeidpourbabak (talkcontribs)

Hi,

If I understood you correctly, replacing the first line with "select cat_title Category, ceil(cat_pages/200) Pages, cat_pages Article" should do the job.

Furicorn (talkcontribs)
Reply to "Automating table of Overlong Stub Categories on WP:Stub Sorting"
Elisardojm (talkcontribs)

Hi, I'm newbie at Quarry and I wanted to make a query for gl.wiki but I don't know how make it, could somebody help me?

At gl.wiki we wanted to get the number of female writers of category of writers of Galicia, https://gl.wikipedia.org/wiki/Categor%C3%ADa:Escritores_de_Galicia, and subcategories. Note that an article could appear at one or more subcategories, that's a the problem.

Matěj Suchánek (talkcontribs)
Elisardojm (talkcontribs)

Thanks!

Disambiguation pages with incoming links

4
Edoderoo (talkcontribs)

I once had a Quarry that would list all pages with the Template:Dp template on it, ordered by number of incoming links. Unfortunately I lost the code/query. Right now I create it with a looped python script, but that runs for 15-16 hours to complete, where quarry would take about a minute. Can someone recreate it for me, as my tries failed all together until now.

Saeidpourbabak (talkcontribs)
Matěj Suchánek (talkcontribs)
SELECT page_title, COUNT(pl_from) AS backlinks
FROM templatelinks
JOIN page ON page_id = tl_from
JOIN pagelinks ON pl_title = page_title AND pl_namespace = page_namespace
WHERE tl_title = 'Dp' AND tl_namespace = 10
GROUP BY tl_from
ORDER BY backlinks DESC;
Edoderoo (talkcontribs)

The first one seems to give decent results, the second one has quite some false positives at first sight. Anyhow, thanks to both of you, I'm gonna work that out after my holidays!

Reply to "Disambiguation pages with incoming links"
Jarekt (talkcontribs)
Matěj Suchánek (talkcontribs)

Append page_namespace = 6 to it. This will make your query use the unique index and will be a lot faster.

Reply to "What is wrong with my Query"
Jarekt (talkcontribs)

Is page URL stored separately somewhere or is it derived from page_title somehow? I am trying to write query that returns URL's instead of page_titles.

Edgars2007 (talkcontribs)

No, I don't think so (that there are page URLs)

Reply to "page URL?"

LOWER/UPPER function not working on page_title

3
Saeidpourbabak (talkcontribs)

Hi,

Can somebody tell me why lower/upper function works properly on a given string (like 'ABC' or 'abc') but not on page_title?

Thanks

Zhuyifei1999 (talkcontribs)
Saeidpourbabak (talkcontribs)

Thanks for the quick and helpful reply. It works perfectly.

Invisigoth67 (talkcontribs)

Hi, I am looking for a substitute for the talk page tool by DrTrigon (not working since a couple of month).It should list the names of all talk pages of the corresponding Wikipedia articles (e.g. in dewiki) that belong to a specified article category (including a depth of subcategorys that can also be scepicified, e.g. 10) and where the talk pages have been changend in the last e.g. 14 days. Is this possible with a query?

Matěj Suchánek (talkcontribs)

Deep category search with SQL is very difficult unless the depth of the category is one or two.

Invisigoth67 (talkcontribs)

So if I use Petscan to give me a list of all sub categorys and the list is below the 65K query length limit and I paste it into the query (e.g. "...where category in ('American 3D films','Australian 3D films','Brazilian 3D films‎'...", will there be a way?

Matěj Suchánek (talkcontribs)

Probably yes:

USE enwiki_p;
SELECT rc_title FROM recentchanges JOIN categorylinks ON cl_from = rc_cur_id
WHERE rc_namespace = 1
AND cl_to IN ('American_3D_films', 'Australian_3D_films', 'Brazilian_3D_films‎', ...)
AND rc_timestamp >= DATE_ADD(NOW(), INTERVAL -14 DAY)
ORDER BY rc_timestamp;
Invisigoth67 (talkcontribs)

Thanks for the query, but I get null results where there should be some. Query 27673, dewiki, article "S-Bahn Wien" in category "Bahnstrecke in Wien" had talk page contributions one week ago.

Matěj Suchánek (talkcontribs)

I forgot to highlight (and wanted to do so) that you need to include underscores (_) instead of spaces in category names.

Invisigoth67 (talkcontribs)

I tried it both with underscores and spaces, but still no results, please see Query 27673.

Matěj Suchánek (talkcontribs)

Uh, I misunderstood your wish and assumed that the talk pages should be in the category, not the articles. This query should work:

USE dewiki_p;
SELECT rc_title, rc_timestamp FROM recentchanges
JOIN page ON rc_namespace = 1 AND rc_title = page_title AND page_namespace = 0
JOIN categorylinks ON cl_from = page_id
WHERE cl_to IN ('Bahnstrecke_in_Wien', 'Musiker_(Wien)')
AND rc_timestamp >= DATE_ADD(NOW(), INTERVAL -14 DAY)
ORDER BY rc_timestamp;
Invisigoth67 (talkcontribs)

Just tested the new query, this is exactly what I was looking for. Thank you very, very much!

Certes (talkcontribs)

https://quarry.wmflabs.org/query/24254 claims to be still running after several days. Should we do something to get rid of it, or was it actually killed after 30 minutes?

Also, please can anyone suggest how to improve the performance? It looks as if it should be efficient, but it doesn't seem to access the page table by the obvious index of namespace+title.

Zhuyifei1999 (talkcontribs)

The query could not store the results because of phab:T170464. The query might not be running slow.

Reply to "Query still running?"
Estopedist1 (talkcontribs)
Edgars2007 (talkcontribs)

INNER JOIN page doesn't have ON statement (the same with templatelinks).

Estopedist1 (talkcontribs)
Edgars2007 (talkcontribs)
use commonswiki_p;
select CONCAT ("# [[:File:",gil_to,"]] > [[:et:",gil_page_title,"]]")  
from globalimagelinks gil
INNER JOIN page 
ON gil_to=page_title
INNER JOIN templatelinks tl
ON tl_from = page_id
where tl_title = 'Uncategorized' AND gil_wiki = 'etwiki' 
limit 100
Estopedist1 (talkcontribs)

This code is identical to my example. This quarry doesn't work. Error:

You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ' ' at line 9
Edgars2007 (talkcontribs)
Estopedist1 (talkcontribs)

Yeah, maybe the space at the end was the problem. Now it is working. Big thanks to Latvia. By the way I live in Estonia (Igaunija) :)

Edgars2007 (talkcontribs)

Yes, I understood that from query. We Baltics have to stick together :)

XXN (talkcontribs)

Is it possible to find in Quarry pages in a wiki containing HTML comments? As table with page text is unreachable, I suppose this is not possible...

YuviPanda (talkcontribs)

Yes, there is no way to search through text with quarry, unfortunately.

Edgars2007 (talkcontribs)

You can do a wikisearch: insource:/\<!--/i Will find those pages, which has the opening tag.

XXN (talkcontribs)

is an alternative. Thank you.