Talk:Quarry

Jump to navigation Jump to search

About this board

Previous discussion was archived at Talk:Quarry/Archive 1 on 2015-04-17. Discussion area for discussion about Quarry itself and help about individual queries.

Query using LAG() takes a long time

3
RoySmith (talkcontribs)

I want to look at the intervals between successive revisions. I tried (against enwiki_p):

select rev_id, rev_timestamp, lag(rev_timestamp) over (order by rev_id)
from revision
limit 5;

but that's running for a long time; I assume it'll time out eventually. It's as if it were doing a full table scan. What am I doing wrong?

BDavis (WMF) (talkcontribs)

The explain I got by using https://sql-optimizer.toolforge.org/ looks like index would be used, but your query has no WHERE clause to limit it's scope. The LIMIT clause truncates the returned results, but having an ORDER BY involved requires the database engine to produce and order the data set before truncating to the 5 desired results. The sheer size of the enwiki revision table (913,325,481 rows) makes operations on it without WHERE filtering very time and memory expensive, especially when ORDER BY is also involved.

Results

id select_type table type possible_keys key key_len ref rows Extra
1.1 SIMPLE revision index page_timestamp 20 913325481 Using index; Using temporary
RoySmith (talkcontribs)

OK, that makes sense. I assumed that since rev_timestamp was indexed, it would be smart enough to do the right thing, but I guess not.

Reply to "Query using LAG() takes a long time"

New way to enter a database name

9
GeoffreyT2000 (talkcontribs)

Goodbye USE databasename; and hello typing databasename in the box above the word "SQL".

So, if you have any old queries, then before running them again, you should remove the "USE" statement at the top and instead type in the database name in the box above the word "SQL".

Bdijkstra (talkcontribs)

Is the box going to have a text next to it saying that it's for the database name? Also, nothing currently happens when I click 'Submit'.

CommanderWaterford (talkcontribs)

Let me say that this is absolute bullsh** now, no matter which DB Name I enter ...nothing happens..query never gets executed.

Matěj Suchánek (talkcontribs)

There are reports of some problems, please be patient or share yours there.

GregorB (talkcontribs)

This is OK I guess, although I don't see any real benefit. Maybe if database prompt was a drop down list.

Matěj Suchánek (talkcontribs)

Such a drop down list would be around 900 items long...

Wbm1058 (talkcontribs)

I don't have a USE databasemame; in my query, which used to work before and now is broken without explanation.

Where can I find a list of these 900 darabases?

What is the name of the English Wikipedia database?

Wbm1058 (talkcontribs)

That's "databases". How can I edit my post to correct a typo?

Wbm1058 (talkcontribs)

I see it worked with "enwiki". You could make that the default to give ppl a clue

Reply to "New way to enter a database name"
Whatamidoing (WMF) (talkcontribs)

Is there a way to delete one of my queries? Or if I don't want an endless collection of broken/useless/duplicate/test queries, is the only solution to continuously re-write one sandbox query?

Achim55 (talkcontribs)

AFAIK there is no way to delete them. But you can blank the query, as it's done by many users.

BDavis (WMF) (talkcontribs)
Reply to "Deleting queries"

wikidatawiki_p.* not working

5
Summary by BDavis (WMF)

"cross wiki" joins are no longer possible due to technical changes that we have been forced to make on the backing wiki replica database servers. See wikitech:News/Wiki Replicas 2020 Redesign and phab:T260389 for more detailed information.

ZI Jony (talkcontribs)
BDavis (WMF) (talkcontribs)

wikidatawiki_p is a database name. This query is attempting a "cross wiki join" which is no longer possible due to technical changes that we have been forced to make on the backing wiki replica database servers. See wikitech:News/Wiki Replicas 2020 Redesign and phab:T260389 for more detailed information.

ZI Jony (talkcontribs)

Thanks for your info. Could someone rewrite the quarry, so it can be queried.

Matěj Suchánek (talkcontribs)

No, it isn't possible to do this kind of query using Quarry anymore.

ZI Jony (talkcontribs)

Thank you for response.

Queries not being executed for several hours now

1
Summary by CommanderWaterford

Solved

CommanderWaterford (talkcontribs)

Anyone working on this?

Kanzat (talkcontribs)
Kanzat (talkcontribs)
Kanzat (talkcontribs)

But if someone knows a faster query - please tell me.

Matěj Suchánek (talkcontribs)

You can probably optimize using actor_revision view.

Reply to "Searching unreviewed pages"

Searching all public quarries

2
Bennylin (talkcontribs)

Hi, is it possible to search for all public quarries? Right now we can only browse the most recent ones. Is it not available by design or for some other reasons? TY.

Bennylin (talkcontribs)
Reply to "Searching all public quarries"
91.160.58.193 (talkcontribs)

Hello,

I wonder if my research is possible.

I'm just trying to get a list of the names of all the 'new' persons that have been entered in wiki in the last two months (so I should find there Stella Morris, George Floyd, Derek Chauvin etc... and a lot of people who were unknown 2 months ago).

I suppose I should adapt this query but I don't know how.

Thanks for your help if you can

USE wikidatawiki_p;

SELECT rev_id AS first_edit FROM revision WHERE rev_timestamp BETWEEN "20200501" AND "20200624" ORDER BY RAND() LIMIT 5000;

Quarry User (talkcontribs)

User account creation log is stored on logging table. Try this query!

quarry.wmflabs.org/query/53090

Reply to "New People in Wiki"

comment_id column in archive table

3
Chaduvari (talkcontribs)

Where can I see the comment_id of the deleted edits of a page? At the time deletion, when the rows are moved from Revision table to Archive table, I thought, the rev_comment_id is moved to ar_comment_id. But the comment_id of the Archive table (in tewiki_p) is showing null for all the rows. Can somebody help me where I can find the comment_id

Matěj Suchánek (talkcontribs)

Isn't that on purpose, so that deleted stuff is hidden from ordinary users?

Chaduvari (talkcontribs)

Got it. Thank you

Query returning very small # of rows

3
Novem Linguae (talkcontribs)

Hey there. I'm trying to run a query to grab all usernames with 10,000+ edits on en-wiki. But it's only returning random small numbers of users (like 50-200 each time). Any tips? Thanks.


SELECT user_name, user_editcount

FROM user

WHERE user_editcount >= 10000

ORDER BY user_editcount DESC;

Matěj Suchánek (talkcontribs)

In 94 seconds, it returns 10,002 results for me. Maybe you are not querying enwiki (USE enwiki_p)?

Novem Linguae (talkcontribs)

User:Matěj Suchánek, thanks for the reply. The query worked for me about 15 minutes after I posted this, without modifications. It appears to have been a random glitch in the system. Thanks for your help.