Talk:Quarry


About this board

Discussion area for Quarry itself and for help with individual queries.

Extract the list of pages linked to commons

وهراني (talkcontribs)

I want to extract the list of pages linked to Commons, but I don't know which database to use (enwiki_p or commonswiki_p) or how.

Thanks in advance!

Halfak (WMF) (talkcontribs)

It looks like you can use the iwlinks table from enwiki, e.g. https://quarry.wmflabs.org/query/24176 I think that links to Commons show up with an "iwl_prefix" of "c", though I've not been able to confirm this yet.
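A minimal sketch of such a query (assuming the "c" prefix really does mark Commons links, which is unconfirmed above):

```sql
-- On enwiki_p: pages that carry an interwiki link with prefix "c"
-- (assumed to be Commons), joined to page for readable titles.
SELECT page_title, iwl_title
FROM iwlinks
JOIN page ON page_id = iwl_from
WHERE iwl_prefix = 'c'
LIMIT 100;
```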

Reply to "Extract the list of pages linked to commons"
132.73.198.144 (talkcontribs)

Hey guys,

Is there a way to know the number of entries/rows of a table straight away, without running a count(*) query?

Right now I'm trying to understand the size of templatelinks, but this could be relevant to any other table.

Thanks in advance!

Bsdla (talkcontribs)

Can anybody please help? Any clue? I'm sure the table size is mentioned somewhere.

Thanks

Milimetric (WMF) (talkcontribs)

The database engine keeps track of estimated table sizes but you can't access those through Quarry because the information is in the system database. Giving access to the system database is not a great idea, so select count(*) is about the only way you can see the size of the table.

Halfak (WMF) (talkcontribs)

You can do something to get the count more quickly. If you run a query like this: https://quarry.wmflabs.org/query/23874

SELECT COUNT(*) FROM revision;

and then click on "Explain" while the query executes, you'll see that the query engine predicts it will need to scan 716,678,936 rows, which is a good approximation of how many rows the table has.

Bsdla (talkcontribs)

Thank you both.

Halfak, I'll try that.

Reply to "Number of entries"
BD2412 (talkcontribs)

I have over 350,000 en.Wikipedia edits, so Xtools won't show me how many articles I have created. Can I get a count (or a tool to get a count) here?

Milimetric (WMF) (talkcontribs)

Yes, if you know your user_id, you can query like this:

select count(*)
  from revision
 where rev_user = <<your-user-id>>
   and rev_parent_id = 0

When rev_parent_id is 0, the revision is the first one on the page, so it marks the creation of the page. The rev_user field identifies the editor responsible for the revision.

BD2412 (talkcontribs)

I'll give it a try, thanks.

Reply to "How many articles have I written?"
Nickshel81 (talkcontribs)

Hi Everyone,

I was wondering if anyone can help me with a query I am trying to run on Quarry. I would like to get a dataset of all oil & gas companies with their name, location, total assets, total revenues, and total reserves. Eventually I would also like to get a list of oil and gas service providers with the same parameters/columns.

Can anyone help me build this query or guide me on how to create it? I am new to SQL and I don't understand the database schema for Wikipedia very well.

Thanks again for your help.

Matěj Suchánek (talkcontribs)

Quarry deals with technical information, like links between pages. Structured data is provided by Wikidata via Wikidata Query Service.

Nickshel81 (talkcontribs)

Thanks, Matěj. I have actually tried using Wikidata Query Service and the results were not consistent. I will post my question there, along with the query I executed, and see if anyone can help me refine it.

Thanks again for your help.

Reply to "Help with Query"
Balligho (talkcontribs)

Hi,

I installed MediaWiki and imported the ENTIRE database, but a lot of articles are missing. It shows a message like this:

***************************************************

Medicine

From Wikipedia, the free encyclopedia

The revision #804651416 of the page named "Medicine" does not exist.

This is usually caused by following an outdated history link to a page that has been deleted. Details can be found in the deletion log.

***************************************************

any help please?

Thanks..

Framawiki (talkcontribs)

Hello, it looks like it is out of scope of Talk:Quarry. Perhaps you can find more information at Support Desk.

Balligho (talkcontribs)

Hi,

I need to collect all medicine-related articles in medical categories and put them in a separate website containing a medical encyclopedia. Any clue how to collect such data with an SQL query?

Thanks..

Balligho (talkcontribs)

Any reply, please? I have MediaWiki installed; I need the database only.

Thanks..

Balligho (talkcontribs)

I can pay money if required.

Halfak (WMF) (talkcontribs)

This query will get you all the titles of medicine-related articles: https://quarry.wmflabs.org/query/23504 From this, you can export/import using Special:Export. For example: en:Special:Export/Warren_Fales_Draper
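The linked query isn't reproduced here, but a category-based version might look roughly like this (the category name is an assumption, and a real solution would need to walk subcategories):

```sql
-- On enwiki_p: titles of articles directly in Category:Medicine.
-- Medical coverage spans a deep category tree, so subcategories
-- would need to be walked recursively (e.g. with PetScan).
SELECT page_title
FROM page
JOIN categorylinks ON cl_from = page_id
WHERE cl_to = 'Medicine'    -- category name is an assumption
  AND page_namespace = 0;   -- articles only
```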

If you do copy content over to another site, make sure to carry over the attribution from Wikipedia, as required by CC BY-SA (https://creativecommons.org/licenses/by-sa/4.0/legalcode), or Wikimedia's lawyers might come and shut you down.

Balligho (talkcontribs)

I'm sorry, I didn't understand the legal thing. I'm going to keep everything as it is, and people will know it's Wikipedia.

Halfak (WMF) (talkcontribs)

I believe that if you clearly link directly back to the article you copy over, that will work just fine.  :)

(I am not a lawyer. This is not legal advice.)

Reply to "Medical Encyclopedia.."
Bsdla (talkcontribs)

Hi guys!

I want to fetch the "user" table Wikipedia has. I know Wikipedia limits the fields we can view, but the ones that are open are just fine for me.

I understand that such large queries will not work in Quarry, and when looking through the dumps I could not find any SQL or XML dump of that table.

Any help you might have will be appreciated!

Halfak (WMF) (talkcontribs)

Indeed, I don't believe there is a dump of this information, which seems to me to be an oversight. So I've created a task: T181720

Halfak (WMF) (talkcontribs)

Looks like this was already requested a long time ago. I've merged my ticket with the old one: T51132

Bsdla (talkcontribs)

Thank you so much!

Do you think this is something that would be done anytime soon? As part of my research, I'm trying to gain some insights into Wikipedia users, which would benefit the community, and it's a bit urgent.

Halfak (WMF) (talkcontribs)

I might be able to help you get a one time dump of the current user table. Could you tell me more about the study?

If you're planning to share the results with the community, I recommend starting up a project page at m:R:New project.  :)

Bsdla (talkcontribs)

Sure! Thanks for that.

I'll try to write you an email soon :)

Reply to "User Table"
Bsdla (talkcontribs)

Hi guys!

One more important question I forgot to ask:

I am also looking for the "revision" table. Again, the same problem with large queries. :)

So I looked in the dumps and found the following:

When searching for "revision", I found two dumps that I think might help me: "Recombine first-pass for page XML data dumps" and "First-pass for page XML data dumps".

What is the difference between them? They both state at the beginning: "These files contain no page text, only revision metadata."

Also, when looking into these sections, the files there are divided into three: stub-meta-history, stub-meta-current, and stub-articles.

What is the difference between them in terms of which fields they include, which sections of the site they cover, and what time frame?

My goal is to fetch, for every user, their whole edit history, including which pages they edited, how large each edit was, and the date of each edit.

Thanks for any help that may come! :)

Edgars2007 (talkcontribs)

I would advise asking this specific question at enwiki's w:WP:VPT if nobody answers here.

Bsdla (talkcontribs)

Thanks,

I'll make sure to ask there now as well.

Halfak (WMF) (talkcontribs)
stub-meta-history
contains metadata about revisions to all pages.
stub-meta-current
contains metadata for only the most recent revision of all pages.
stub-articles
contains metadata for only the most recent revision of "content" pages (articles, templates, etc.).
Bsdla (talkcontribs)

Thanks!

So I understand that stub-meta-history contains recent and old revisions together, right?

Halfak (WMF) (talkcontribs)

That's right. :)

Reply to "Revision Table"
African Hope (talkcontribs)

Hello,

Do you know how to get the list of pages that reuse a Commons image? A kind of "What links here". I would like to know which Wikipedia (or any other wiki project) uses images from a certain category. Any clue?

Achim55 (talkcontribs)

USE commonswiki_p;

DESCRIBE globalimagelinks;

Unfortunately there is no detailed description at Manual:Database layout.
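A sketch of how globalimagelinks could then be used (column names as reported by DESCRIBE; the file and category names are placeholders):

```sql
-- On commonswiki_p: which wikis and pages use a given Commons file.
SELECT gil_wiki, gil_page_namespace, gil_page_title
FROM globalimagelinks
WHERE gil_to = 'Example.jpg';  -- placeholder file name

-- To count uses per wiki for files in a certain Commons category,
-- join through the file's page record and categorylinks:
SELECT gil_wiki, COUNT(*) AS uses
FROM globalimagelinks
JOIN page ON page_namespace = 6 AND page_title = gil_to
JOIN categorylinks ON cl_from = page_id
WHERE cl_to = 'Images_from_Wiki_Loves_Africa'  -- placeholder category name
GROUP BY gil_wiki;
```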

African Hope (talkcontribs)

Thanks very much, @Achim55. Implemented here to get the number of images reused by Côte d'Ivoire during Wiki Loves Africa. A few adjustments are left, but it looks good so far.

Reply to "A kind of What Links here"
Jon Harald Søby (WMNO) (talkcontribs)

Hi! We're planning on running a survey on the Northern Sami Wikipedia, and I think the most efficient way to get a list of who to get in touch with would be to run a query on Quarry. But I don't know how to construct such a query (never used SQL…), so I'm wondering if someone could help me. What I want is basically just a list of all registered users who have made more than 10 edits ever. Can someone help me with that?

Edgars2007 (talkcontribs)

https://quarry.wmflabs.org/query/23113
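The linked query is essentially along these lines (a sketch; the database name for the Northern Sami Wikipedia is assumed to be sewiki_p):

```sql
-- On sewiki_p (database name is an assumption): registered users with
-- more than 10 edits, using the cached user_editcount field, which is
-- only approximate (see the note below).
SELECT user_name, user_editcount
FROM user
WHERE user_editcount > 10
ORDER BY user_editcount DESC;
```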

Edgars2007 (talkcontribs)

Note that user_editcount isn't very precise; it can be off from the real numbers by some edits. But for this purpose it will probably be fine.

Jon Harald Søby (WMNO) (talkcontribs)

Thank you very much, Edgars2007!

Reply to "List of users with at least 10 edits"