Talk:Quarry


About this board

Previous discussion was archived at Talk:Quarry/Archive 1 on 2015-04-17. This is the discussion area for Quarry itself and for help with individual queries.

comment_id column in archive table

Chaduvari (talkcontribs)

Where can I see the comment_id of the deleted edits of a page? I thought that at the time of deletion, when the rows are moved from the revision table to the archive table, rev_comment_id is moved to ar_comment_id. But the comment_id column of the archive table (in tewiki_p) shows null for all rows. Can somebody help me find the comment_id?

Matěj Suchánek (talkcontribs)

Isn't that on purpose, so that deleted stuff is hidden from ordinary users?
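For reference, the redaction described above can be seen directly. This is a sketch only; the table and column names come from the MediaWiki schema, and the page title is a hypothetical placeholder:

```python
# Sketch: the kind of query that shows ar_comment_id coming back NULL
# on the public replica (comment data for deleted revisions is redacted).
CHECK_SQL = """
SELECT ar_rev_id, ar_comment_id
FROM archive
WHERE ar_title = 'Some_page'   -- hypothetical page title
LIMIT 10;
"""
```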

Chaduvari (talkcontribs)

Got it. Thank you


Something wrong again with Excel XLSX download format

Jarekt (talkcontribs)

A while ago there was an issue with the Excel XLSX download format, which was fixed by cleaning some temp files. It seems to be happening again: an attempt to open a generated spreadsheet gives an "invalid file format or extension" error.

This post was hidden by Chaduvari (history)
Framawiki (talkcontribs)

Hello Jarekt, can you confirm that the problem has been solved in the meantime? Thanks!

Jarekt (talkcontribs)

Yes, it is solved.

Jarekt (talkcontribs)
Chaduvari (talkcontribs)

Facing the same problem since yesterday


Is there a way to query and get results programmatically?

חגי1234 (talkcontribs)

Hi!

Can the URL of a query also return results programmatically?


After a query is executed, it lives at a query path:

https://quarry.wmflabs.org/query/XXXX

There is also a run path to download the results as JSON etc.:

https://quarry.wmflabs.org/run/YYYYYY/output/0/json-lines


But the query path and the run path are not related...

Can I find out the run number of my query (without HTML scraping)?

Or, is there a way to run the query and download the results from code (without HTML scraping)?


Thanks!

Hagay
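One possible answer, assuming Quarry exposes a per-query "latest result" download path of the form `/query/<id>/result/latest/<resultset>/<format>` (this URL pattern is an assumption based on how the site serves downloads; verify it against your own query's download links):

```python
# Sketch: fetch the latest results of a Quarry query without HTML scraping.
# ASSUMPTION: Quarry serves the most recent run of a query at
#   /query/<id>/result/latest/<resultset>/<format>
# Check the download links on your own query page to confirm.
from urllib.request import urlopen
import json

def latest_result_url(query_id, resultset=0, fmt="json"):
    """Build the download URL for the latest run of a query."""
    return (f"https://quarry.wmflabs.org/query/{query_id}"
            f"/result/latest/{resultset}/{fmt}")

def fetch_latest(query_id):
    """Download and parse the latest result set (requires network)."""
    with urlopen(latest_result_url(query_id)) as resp:
        return json.load(resp)

# Example (requires network):
# data = fetch_latest(12345)
```

This sidesteps the run number entirely: the query id stays stable across runs, so code can always pull the newest output.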


cookbook for "classes" of queries

2601:1C1:867F:EA40:D896:7EC1:1E1F:AA9A (talkcontribs)

Not sure if I am missing some docs somewhere, but I am looking to perform a set of queries whereby I can select the title based upon the content available in the infobox (vcard, looking at the HTML).


It is unclear to me how to do this based upon the schema. No doubt others would like to do this as well.


I am very familiar with MySQL, but most of the queries I have seen appear to be very complex for data that seems to be straightforward.


For example, I want to find all companies that are listed in the S&P_500: not the page listing them, but a list based upon a query that lists the URL in the infobox.


Is this hard? Easy? Can someone point me to an SQL query that can do this?


thanks

Matěj Suchánek (talkcontribs)

Infoboxes cannot be queried using SQL (in fact, not by any means on the replicas). You can query structured data on Wikidata using the SPARQL endpoint.
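For the S&P 500 example above, a SPARQL request against the Wikidata Query Service might be sketched as follows. The item id Q242345 (S&P 500) and property P361 ("part of") are assumptions to verify on Wikidata before relying on the result:

```python
# Sketch: ask Wikidata's SPARQL endpoint for S&P 500 constituents.
# ASSUMPTIONS to verify: Q242345 = "S&P 500", P361 = "part of".
from urllib.parse import urlencode
from urllib.request import Request, urlopen
import json

SPARQL = """
SELECT ?company ?companyLabel WHERE {
  ?company wdt:P361 wd:Q242345 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

def build_request(query):
    """Prepare a GET request for the Wikidata Query Service."""
    url = "https://query.wikidata.org/sparql?" + urlencode(
        {"query": query, "format": "json"})
    # WDQS asks clients to send a descriptive User-Agent.
    return Request(url, headers={"User-Agent": "quarry-talk-example/0.1"})

# Example (requires network):
# with urlopen(build_request(SPARQL)) as resp:
#     rows = json.load(resp)["results"]["bindings"]
```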

New People in Wiki

91.160.58.193 (talkcontribs)

Hello,

I wonder if my research is possible.

I'm just trying to get a list of the names of all the "new" people that have been entered in the wiki in the last two months (so I should find Stella Morris, George Floyd, Derek Chauvin, etc., and a lot of people who were unknown two months ago).

I suppose I should adapt this query but I don't know how.

Thanks for your help if you can

USE wikidatawiki_p;

SELECT rev_id AS first_edit FROM revision WHERE rev_timestamp BETWEEN "20200501" AND "20200624" ORDER BY RAND() LIMIT 5000;
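One way the query above might be adapted: page creations are revisions with no parent, so joining `revision` to `page` and filtering on `rev_parent_id = 0` in the date window approximates "new pages". This is a hypothetical, untested sketch against the standard MediaWiki replica schema; it would need to run on enwiki_p (not wikidatawiki_p) for English Wikipedia articles, and narrowing "new pages" down to biographies would additionally require category or Wikidata filtering:

```python
# Hedged sketch of an adapted query: articles created in a date window.
# Assumes the standard MediaWiki replica schema (run against enwiki_p).
NEW_PAGES_SQL = """
SELECT page_title
FROM revision
JOIN page ON rev_page = page_id
WHERE rev_parent_id = 0          -- first revision = page creation
  AND page_namespace = 0         -- articles only
  AND rev_timestamp BETWEEN '20200501000000' AND '20200624235959'
LIMIT 5000;
"""
```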


need help / quarry does not start at all

ISO 3166 Bot (talkcontribs)

Hi

https://quarry.wmflabs.org/query/50304 has not started at all for a few days. Before that (I made some changes) it ran and produced sound results without any timeout. See the similar https://quarry.wmflabs.org/query/50070, which ran and whose result is still in the cache.

There are two SELECTs; running only the first will start and finish. Running only the second will not start: without any hint or error message, it just fails silently.

Can somebody help to find / fix the problem? best


Try https://quarry.wmflabs.org/query/50253 (queries that are not executed are not saved, so https://quarry.wmflabs.org/query/50304 is empty). It was empty; I have now played around a little, and it seems that there is a size limitation for the SQL script / for each SELECT statement.

I got the following messages in the console:

/api/query/run:1 Failed to load resource: the server responded with a status of 500 ()
50304:1 Refused to load the image 'data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==' because it violates the following Content Security Policy directive: "img-src 'self'".
/api/query/run:1 Failed to load resource: the server responded with a status of 500 ()
jquery.js:8623 POST https://quarry.wmflabs.org/api/query/run 500
send @ jquery.js:8623
ajax @ jquery.js:8152
n.<computed> @ jquery.js:8298
(anonymous) @ view.js:80
dispatch @ jquery.js:4409
r.handle @ jquery.js:4095


(It's Chrome.)

The message for status 500 is:

500 Internal Error

Sadly Quarry is currently experiencing issues and is unavailable. Please try again later.

But I have to dig into the browser debugger to see this. No message on the UI.

URL mysterious subdirectory

Brainbout (talkcontribs)
BDavis (WMF) (talkcontribs)

Those path components are not directly stored in the database. They are instead derived data resulting from the use of $wgHashedUploadDirectory = true; in the settings for Wikimedia wikis. These values are taken from the md5 hash of the filename. See Manual:$wgHashedUploadDirectory for more details.
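The derivation described above is easy to reproduce: the path components are the first hex digit and the first two hex digits of the MD5 of the stored filename (with spaces stored as underscores). A minimal sketch:

```python
# Reproduce the hashed upload subdirectories described above: with
# $wgHashedUploadDirectory = true, a file is stored under
#   <first hex digit of md5>/<first two hex digits of md5>/<filename>
import hashlib

def upload_path(filename):
    """Return the hashed subdirectory path for a stored filename."""
    name = filename.replace(" ", "_")  # MediaWiki stores spaces as underscores
    h = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"{h[0]}/{h[:2]}/{name}"

# e.g. upload_path("Example.jpg") -> "<d>/<dd>/Example.jpg"
```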

Brainbout (talkcontribs)

Thank you BDavis. I'll have a look at that man page.

INDIA360

183.83.138.3 (talkcontribs)

I would like to create a list of articles/pages related to India, with page title and link (https://en.wikipedia.org).

There are two lists for this:

1 - a list where the title contains "India" (where page_title like '%India%')

2 - a list where the string "India" is present or mentioned in the page.

Can anyone help out here with the query?

2 columns & 2 lists
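For list 1, a sketch of the title-match query against the standard MediaWiki replica schema (list 2, full-text "mentions", is not something SQL on the replicas does well; the search API is better suited). Hypothetical and untested:

```python
# Sketch of list 1: article titles containing "India", with a URL column.
# Assumes the standard MediaWiki replica schema (enwiki_p).
TITLES_SQL = """
SELECT page_title,
       CONCAT('https://en.wikipedia.org/wiki/', page_title) AS url
FROM page
WHERE page_namespace = 0
  AND page_title LIKE '%India%';
"""
```

Note that a leading-wildcard LIKE cannot use the title index, so this scan may be slow on a large wiki.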


Finding duplicated files which are available both locally and on Commons

Summary by Kizule

@Edgars2007 has provided an answer and this question is resolved. :)

Kizule (talkcontribs)

I would like to find files on the Serbian Wikipedia which are available both locally and on Commons (duplicates). Which query will do this job?

Edgars2007 (talkcontribs)
Kizule (talkcontribs)
AntiCompositeNumber (talkcontribs)
Edgars2007 (talkcontribs)

Ahh :( I forgot about those. I thought that those redesign changes wouldn't affect me :D


How to get ORES article quality predictions for an article?

Joe Roe (talkcontribs)

I'm having trouble understanding how ORES predictions are represented in the ores_classification table.

For example, using the ORES API to get "articlequality" classifications for enwiki revision 982883941 (https://ores.wikimedia.org/scores/enwiki/?models=articlequality&revids=982883941) returns six scores and a prediction.

But running what should be an identical query (as far as I understand) using Quarry returns only one row, with none of the values seeming to match up: https://quarry.wmflabs.org/query/49243.

Can anyone help? What I would ultimately like to do is get the predicted article quality for a set of revision IDs.
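For the "set of revision IDs" goal, the ORES API itself may be the most direct route: it accepts multiple revision ids separated by `|`. This sketch only builds the request URL, using the endpoint and parameters quoted in the question above:

```python
# Sketch: build an ORES scores URL for a set of revision IDs.
# Endpoint and parameter names are taken from the URL quoted above.
from urllib.parse import urlencode

def ores_scores_url(wiki, rev_ids, model="articlequality"):
    """Build an ORES scores URL for one or more revision IDs."""
    revids = "|".join(str(r) for r in rev_ids)
    return (f"https://ores.wikimedia.org/scores/{wiki}/?"
            + urlencode({"models": model, "revids": revids}))

# ores_scores_url("enwiki", [982883941])
```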
