So, why has the URL been changed?
Jump to navigation Jump to search
Reply to "New URL"
Reply to "User Interface for Queries"
Reply to "New way to enter a database name"
Reply to "Query using LAG() takes a long time"
Reply to "Deleting queries"
Reply to "Searching unreviewed pages"
You can probably optimize using
Reply to "Searching all public quarries"
Reply to "New People in Wiki"
About this board
Previous discussion was archived at on 2015-04-17. Discussion area for discussion about Quarry itself and help about individual queries.
We consider wmflabs.org hostnames to be "legacy" these days. Wmcloud.org hostnames are the common replacement. The Cloud VPS instances running Quarry were recently replaced with newer versions and updating the hostname for the service was a convenient side effect. As you noticed, the legacy hostname and URLs still work, but they will be redirected to the new hostname. There are more updates for Quarry planned for the coming months, see https://techblog.wikimedia.org/2021/09/02/digging-deeper-into-quarry/.
Maybe to reflect the fact that Quarry is "Hosted on Wikimedia Cloud VPS"?
User Interface for Queries
At the Wikimania Hackathon there was a discussion about Quarry and PAWS and how they could be improved. One suggestion was as I understand the Etherpad https://etherpad.wikimedia.org/p/Wikimania-2021-PAWS-and-Quarry-discussion to create something like VizQuery for querying Quarry. Has someone of you expierence with creating an user interface for an specific query. So that it is possible to enter values into a form instead of editing the SQL-Text of the Query directly. Such an Interface should include possible values that can be selected through a drop down menue or with a search function. I think a drop down menue is enough. From my point of view this is helpful if it exists.
I too think that having autocomplete for tables and columns, as well as explanations about what each of them are, is probably already a very significant improvement. Can probably be built quite easily with CodeMirror. Not visual, but it's a step forward.
New way to enter a database name
USE databasename; and hello typing
databasename in the box above the word "SQL".
So, if you have any old queries, then before running them again, you should remove the "USE" statement at the top and instead type in the database name in the box above the word "SQL".
Is the box going to have a text next to it saying that it's for the database name? Also, nothing currently happens when I click 'Submit'.
Let me say that this is absolute bullsh** now, no matter which DB Name I enter ...nothing happens..query never gets executed.
There are reports of some problems, please be patient or share yours there.
This is OK I guess, although I don't see any real benefit. Maybe if database prompt was a drop down list.
Such a drop down list would be around 900 items long...
I don't have a USE databasemame; in my query, which used to work before and now is broken without explanation.
Where can I find a list of these 900 darabases?
What is the name of the English Wikipedia database?
That's "databases". How can I edit my post to correct a typo?
I see it worked with "enwiki". You could make that the default to give ppl a clue
I had an old query that cross referenced commonswiki_p and enwiki_p to find redirects on enwiki hiding commons files. Is there a new way to reference multiple databases, or have we just lost this ability?
It's lost (forever).
Query using LAG() takes a long time
I want to look at the intervals between successive revisions. I tried (against enwiki_p):
select rev_id, rev_timestamp, lag(rev_timestamp) over (order by rev_id) from revision limit 5;
but that's running for a long time; I assume it'll time out eventually. It's as if it were doing a full table scan. What am I doing wrong?
The explain I got by using https://sql-optimizer.toolforge.org/ looks like index would be used, but your query has no WHERE clause to limit it's scope. The LIMIT clause truncates the returned results, but having an ORDER BY involved requires the database engine to produce and order the data set before truncating to the 5 desired results. The sheer size of the enwiki revision table (913,325,481 rows) makes operations on it without WHERE filtering very time and memory expensive, especially when ORDER BY is also involved.
|1.1||SIMPLE||revision||index||page_timestamp||20||913325481||Using index; Using temporary|
OK, that makes sense. I assumed that since rev_timestamp was indexed, it would be smart enough to do the right thing, but I guess not.
Is there a way to delete one of my queries? Or if I don't want an endless collection of broken/useless/duplicate/test queries, is the only solution to continuously re-write one sandbox query?
AFAIK there is no way to delete them. But you can blank the query, as it's done by many users.
phab:T135908 is the open task for adding deletion.
wikidatawiki_p.* not working
wikidatawiki_p is a database name. This query is attempting a "cross wiki join" which is no longer possible due to technical changes that we have been forced to make on the backing wiki replica database servers. See wikitech:News/Wiki Replicas 2020 Redesign and phab:T260389 for more detailed information.
Thanks for your info. Could someone rewrite the quarry, so it can be queried.
No, it isn't possible to do this kind of query using Quarry anymore.
Thank you for response.
Queries not being executed for several hours now
Anyone working on this?
Searching unreviewed pages
Hi, I need to retrieve all unreviewed pages (like https://uk.wikipedia.org/wiki/Special:UnreviewedPages ) and can't find the way how to do that. Does anyone know?
Just realized - https://quarry.wmflabs.org/query/53308 need to search for "fp_reviewed is null".
But if someone knows a faster query - please tell me.
You can probably optimize using
Searching all public quarries
Hi, is it possible to search for all public quarries? Right now we can only browse the most recent ones. Is it not available by design or for some other reasons? TY.
oh, nvm, I found it here https://www.google.com/search?q=site%3Aquarry.wmflabs.org via https://meta.wikimedia.org/wiki/Research:Quarry
New People in Wiki
I wonder if my research is possible.
I'm just trying to get a list of the names of all the 'new' persons that have been entered in wiki in the last two months (so I should find there Stella Morris, George Floyd, Derek Chauvin etc... and a lot of people who were unknown 2 months ago).
I suppose I should adapt this query but I don't know how.
Thanks for your help if you can
SELECT rev_id AS first_edit FROM revision WHERE rev_timestamp BETWEEN "20200501" AND "20200624" ORDER BY RAND() LIMIT 5000;
User account creation log is stored on logging table. Try this query!