Topic on Talk:Quarry

Queries for orphaned pages don't work

30 comments • 20:06, 26 September 2017 6 years ago

30

Mess (talkcontribs)

I can't perform queries for orphaned pages on it.wiki since this week, neither with Quarry (seehttps://quarry.wmflabs.org/query/18648) or with Lists (see http://tools.wmflabs.org/lists/itwiki/Voci/Voci_orfane_per_dimensione). The syntax of the queries always worked until the coming up of this issue, but I don't get what's the reason why it doesn't complete the task now.

Reply 15:27, 24 June 2017 6 years ago

Achim55 (talkcontribs)

You are right, my queries are killed as well. It's not the first time Quarry has slown down and - surprise!! - there are again a lot of queries that are "running" or queued since some weeks. Needs a clean restart, @Yuvipanda or someone else: please reset it, thanks.

Reply 16:10, 24 June 2017 6 years ago

Milimetric (WMF) (talkcontribs)

Hm, @Yuvipanda isn't maintaining this anymore sadly, but I'm watching these pages. I'll go try and figure out how to reset it.

Reply 13:31, 26 June 2017 6 years ago

Achim55 (talkcontribs)

User:Milimetric (WMF), thank you!. It's an old problem, see Topic:T8ip24a3iragdqrh, Topic:T6sxf1vjyj5shf9e and several more. See also here:

Reply 13:43, 26 June 2017 6 years ago

Mess (talkcontribs)

User:Milimetric (WMF), may I suggest you to implement a maximum time limit of (just say) 1 hour for queued/hanged queries in order to prevent this overload? I can't figure out why some queries can avoid the regular limit of 30 minutes and can't be killed like any other query when the time period is over.

Reply 13:59, 26 June 2017 6 years ago

Milimetric (WMF) (talkcontribs)

Interesting, yeah, I thought there was a solid time limit in place. I'll double check that code if I can but sadly this is very low on my priority list. So far I found out I don't have rights to the Quarry box so I pinged some people to get that. I'll try and help if I can.

Reply 14:03, 26 June 2017 6 years ago

Mess (talkcontribs)

OK, thank you for your interest, @Milimetric (WMF). Meanwhile I've read the 2016 Community Wishlist Survey request for the Quarry maintenance (linked by @Achim55) and I've found out this old Phabricator task (https://phabricator.wikimedia.org/T139162), where Yuvipanda had published the SQL code to reset the queue.

Reply Edited 14:25, 26 June 2017 6 years ago

Milimetric (WMF) (talkcontribs)

Ok, I got access to the project and restarted the celery workers (the things that run the queries). Let me know if things don't improve and I can try restarting more stuff.

Reply 18:31, 26 June 2017 6 years ago

Achim55 (talkcontribs)

User:Milimetric (WMF), it isn't better than before. Even my roughly shortened quarry:query/7250 is killed. Btw. I hate this idiotic editor.

Reply Edited 19:37, 26 June 2017 6 years ago

Mess (talkcontribs)

@Milimetric (WMF), maybe it should be better a whole cleaning: queries like , or are still running!

Reply 20:54, 26 June 2017 6 years ago

Milimetric (WMF) (talkcontribs)

I will now reboot the boxes, maybe that will help. Which editor? The quarry code editor?

Reply 16:16, 27 June 2017 6 years ago

Achim55 (talkcontribs)

Oh no, sorry, the editing of this page I had in mind. It differs a lot from wikis' and adding links is not an optimal solution as it isn't made for intuitive use. Dan, thank you for your efforts here!

Reply 16:24, 27 June 2017 6 years ago

Milimetric (WMF) (talkcontribs)

Ok, I rebooted all the boxes, and verified that some queries are running ok. I still see the status of the queries that @Mess pointed out as "Running" but that can't be because the box was rebooted. So that must just be invalid state in the db, those aren't running anymore.

Reply 16:25, 27 June 2017 6 years ago

Milimetric (WMF) (talkcontribs)

@Achim55, yeah, Flow is a work in progress. For me, I'm a newcomer to wikitext so it's much much easier to work with Flow. If you're interested in working on it, I know the people on the team and they're super nice and some of the smartest people I know. Here's some links if you're interested:

browse the code: https://github.com/wikimedia/mediawiki-extensions-Flow

contribute: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/Flow

The team hangs out in IRC under #wikimedia-collaboration

Reply 16:34, 27 June 2017 6 years ago

Mess (talkcontribs)

@Milimetric (WMF), I tried to re-run several times [my query], but nothing happened - and it's the same also for the [other one] re-executed by @Achim55. I'm starting to doubt about the codes of our queries, but they can't be faulty, as they worked regularly in the past.

Reply 22:12, 29 June 2017 6 years ago

Achim55 (talkcontribs)

Confirmed. Doesn't work yet.

Reply 20:35, 1 July 2017 6 years ago

Mess (talkcontribs)

I'm trying also to use the "LIMIT" clause to reduce the output in [this query test], but it's useless. It's like in some particular cases statements as "NOT IN" (or maybe what follows inside the parentheses) influence negatively the whole query (if I remove that, the query test runs without problems). Anyway, I don't know where to bang my head and solve this crazy puzzle.

Reply 22:00, 1 July 2017 6 years ago

Milimetric (WMF) (talkcontribs)

@Mess & @Achim55, I should have looked at your queries earlier. Yes, those NOT IN clauses are very intense to compute over categorylinks and pagelinks tables, those tables have a huge number of records. I would suggest re-writing. Sorry I don't have much time now to help. What I would try to do is re-write the NOT INs as JOINs. This might give the optimizer a better chance to make a plan that runs faster. If that doesn't work, do me a favor and explain in plain English what you're looking for and I can think about it when I have some more time.

Reply 22:16, 7 July 2017 6 years ago

Mess (talkcontribs)

@Milimetric (WMF), you're absolutely right. I've recently created two separated queries ( 20065and 20068) as a temporary solution for my task, even if I've already realized there was something "strange" in the query codes for their excessive time-consuming extraction (that we didn't care in the past simply because they didn't stopped themselves like now). I'll work again on the original code to find an ultimate solution, but probably I can carry on also with this alternative.

Reply 05:37, 8 July 2017 6 years ago

Mess (talkcontribs)

@Milimetric (WMF): HOORAY! I DID IT! I've found what was bad in my query: in practice, the part where it searched for orphaned redirects was the real "point of failure", as it was badly implemented, so I just fixed it with a better code and now the query 18648 not only works again, but also accomplishes its task in only 3 minutes (a great improvement if compared with the 15 minutes it took previously). So, @Achim55, follow the advices above and try to find the part of code that takes too much time to run (I suggest you to split up the query and execute every single piece separately).

Reply 12:52, 9 July 2017 6 years ago

Achim55 (talkcontribs)

Pardon?? "Write better code so you will no longer recognise the malfunction of Quarry." Fine. I think one should fasten the brake a bit more so that some action will become necessary...

Reply 19:14, 10 July 2017 6 years ago

Milimetric (WMF) (talkcontribs)

Well, no matter how amazing Quarry is, a statement like select * from revision will always break it on large enough wikis. No tool is substitute for writing good code. Think of writing bad code like throwing lots of kilowatts of energy in the trash. You wouldn't leave a 100W bulb on all day for no reason, so don't write inefficient SQL :)

Reply 13:06, 20 July 2017 6 years ago

Achim55 (talkcontribs)

Sorry for having been misunderstood: As I pointed out many months ago queries like "SELECT * FROM page" are regularly not killed but stay "running". That's the point. The "killing mechanism" is broken. In the near future we can celebrate an anniversary, I found a query that "runs" for nearly one year now. I don't know what happens in the background, but it's our experience that quarry gets slower the more queries are "running" for more than one hour instead of being killed.

Reply 20:30, 20 July 2017 6 years ago

Milimetric (WMF) (talkcontribs)

Oh, as far as I can tell those queries are killed, the interface just doesn't show that status as "Killed". So there are some bugs but they're not affecting the system performance as far as I can tell. There's a discussion going on about replacing Quarry with a tool developed independent of WMF, so we can take advantage of regular updates. Here's that thread: https://phabricator.wikimedia.org/T169452

Reply 15:06, 26 July 2017 6 years ago

Zhuyifei1999 (talkcontribs)

Queries are killed, but the status updating system may break due to various reasons. See phab:project/board/800/ column "Bugs (Endless Query Run)". Unfortunately many of the old logs are lost so I'm unable to debug them. Please file a ticket if you see any such behaviour.

Reply 23:57, 1 August 2017 6 years ago

Achim55 (talkcontribs)

@Zhuyifei1999, that's fine so far. So let me ask in a different way: Half a year ago needed 10 ... 12 minutes for one run. Today it needs 25...30 minutes. Why is quarry working extremely slowly now?

Reply Edited 20:09, 8 September 2017 6 years ago

Zhuyifei1999 (talkcontribs)

This is not Quarry's problem. The query runs on the wiki replicas and they may get slower due to excess use and more data processing, and for Wiki Replica issues it's best to consult the DBA. However, I may try to optimize the query a bit.

Reply 20:51, 8 September 2017 6 years ago

Zhuyifei1999 (talkcontribs)

Never mind. I'd suggest you to ask the DBAs regarding query optimization or wait till quarry switch to the new Wiki Replica servers some time soon (in next next few weeks/months).

Reply 21:11, 8 September 2017 6 years ago

Zhuyifei1999 (talkcontribs)

Quarry has switched to the new Replica servers yesterday, which is said to be 5x faster, so you may want to try the query again. Note that 'USE stopthisrun' won't work on the new servers (and idk what even are 'stopthisrun')

Reply 19:55, 26 September 2017 6 years ago

Achim55 (talkcontribs)

Thank you very much for notifying. I use non-existing db 'stopthisrun' for quickly killing my query if I see an error after I already had started it.

Reply 20:06, 26 September 2017 6 years ago

Reply to "Queries for orphaned pages don't work"