Topic on Talk:Quarry

RoySmith (talkcontribs)

I'm running select count(*) from comment limit 10; (https://quarry.wmflabs.org/query/39582), which is taking forever. I would think that would be instantaneous.

Quarry claims, "Query status: running", but when I click the Explain button, I get, "Hmm... Is the SQL actually running?!"


Zhuyifei1999 (talkcontribs)

"Hmm... Is the SQL actually running?!" Is a bug. phab:T205214

Why is it show? Because the explain is like:

{
  "rows": [
    [
      1,
      "PRIMARY",
      "comment",
      "index",
      null,
      "comment_hash",
      "4",
      null,
      266953357,
      "Using where; Using index"
    ],
    [
      11,
      "DEPENDENT SUBQUERY",
      "logging",
      "ref",
      "type_time,log_comment_type,log_type_action",
      "log_comment_type",
      "8",
      "enwiki.comment.comment_id",
      185,
      "Using where; Using index"
    ],
    [
      10,
      "DEPENDENT SUBQUERY",
      "revision_comment_temp",
      "ref",
      "PRIMARY,revcomment_rev,revcomment_comment_id",
      "revcomment_comment_id",
      "8",
      "enwiki.comment.comment_id",
      1,
      "Using index"
    ],
    [
      10,
      "DEPENDENT SUBQUERY",
      "revision",
      "eq_ref",
      "PRIMARY",
      "PRIMARY",
      "4",
      "enwiki.revision_comment_temp.revcomment_rev",
      1,
      "Using where"
    ],
    [
      9,
      "DEPENDENT SUBQUERY",
      "recentchanges",
      "ref",
      "rc_comment_deleted",
      "rc_comment_deleted",
      "8",
      "enwiki.comment.comment_id",
      1,
      "Using where; Using index"
    ],
    [
      8,
      "DEPENDENT SUBQUERY",
      "protected_titles",
      "ref",
      "pt_reason_id",
      "pt_reason_id",
      "8",
      "enwiki.comment.comment_id",
      4,
      "Using index"
    ],
    [
      7,
      "DEPENDENT SUBQUERY",
      "oldimage",
      "ref",
      "oi_description_deleted",
      "oi_description_deleted",
      "8",
      "enwiki.comment.comment_id",
      28,
      "Using where; Using index"
    ],
    [
      6,
      "DEPENDENT SUBQUERY",
      "ipblocks",
      "ref",
      "ipb_reason_deleted",
      "ipb_reason_deleted",
      "9",
      "enwiki.comment.comment_id,const",
      32,
      "Using index"
    ],
    [
      5,
      "DEPENDENT SUBQUERY",
      "filearchive",
      "ref",
      "fa_description_deleted",
      "fa_description_deleted",
      "8",
      "enwiki.comment.comment_id",
      124,
      "Using where; Using index"
    ],
    [
      4,
      "DEPENDENT SUBQUERY",
      "filearchive",
      "ref",
      "fa_reason",
      "fa_reason",
      "8",
      "enwiki.comment.comment_id",
      405,
      "Using index"
    ],
    [
      3,
      "DEPENDENT SUBQUERY",
      "image",
      "ref",
      "img_description_id",
      "img_description_id",
      "8",
      "enwiki.comment.comment_id",
      26,
      "Using index"
    ]
  ],
  "headers": [
    "id",
    "select_type",
    "table",
    "type",
    "possible_keys",
    "key",
    "key_len",
    "ref",
    "rows",
    "Extra"
  ]
}
RoySmith (talkcontribs)

How did you get the explain output?

Zhuyifei1999 (talkcontribs)

As for why it is joining other tables, it wants to prevent people from gaining any extra knowledge of comments that may not be publicly viewable from the web interface, such as revision-deleted comments.

The enwiki_p.comment is a view against the database enwiki, with the following definition (SHOW CREATE VIEW comment;):

CREATE ALGORITHM=UNDEFINED DEFINER=`viewmaster`@`%` SQL SECURITY DEFINER VIEW `comment` AS select `enwiki`.`comment`.`comment_id` AS `comment_id`,`enwiki`.`comment`.`comment_hash` AS `comment_hash`,`enwiki`.`comment`.`comment_text` AS `comment_text`,`enwiki`.`comment`.`comment_data` AS `comment_data` from `enwiki`.`comment` where (exists(select 1 from `enwiki`.`image` where (`enwiki`.`image`.`img_description_id` = `enwiki`.`comment`.`comment_id`) limit 1) or exists(select 1 from `enwiki`.`filearchive` where (`enwiki`.`filearchive`.`fa_deleted_reason_id` = `enwiki`.`comment`.`comment_id`) limit 1) or exists(select 1 from `enwiki`.`filearchive` where ((`enwiki`.`filearchive`.`fa_description_id` = `enwiki`.`comment`.`comment_id`) and ((`enwiki`.`filearchive`.`fa_deleted` & 2) = 0)) limit 1) or exists(select 1 from `enwiki`.`ipblocks` where ((`enwiki`.`ipblocks`.`ipb_reason_id` = `enwiki`.`comment`.`comment_id`) and (`enwiki`.`ipblocks`.`ipb_deleted` = 0)) limit 1) or exists(select 1 from `enwiki`.`oldimage` where ((`enwiki`.`oldimage`.`oi_description_id` = `enwiki`.`comment`.`comment_id`) and ((`enwiki`.`oldimage`.`oi_deleted` & 2) = 0)) limit 1) or exists(select 1 from `enwiki`.`protected_titles` where (`enwiki`.`protected_titles`.`pt_reason_id` = `enwiki`.`comment`.`comment_id`) limit 1) or exists(select 1 from `enwiki`.`recentchanges` where ((`enwiki`.`recentchanges`.`rc_comment_id` = `enwiki`.`comment`.`comment_id`) and ((`enwiki`.`recentchanges`.`rc_deleted` & 2) = 0)) limit 1) or exists(select 1 from (`enwiki`.`revision` join `enwiki`.`revision_comment_temp` on((`enwiki`.`revision_comment_temp`.`revcomment_rev` = `enwiki`.`revision`.`rev_id`))) where ((`enwiki`.`revision_comment_temp`.`revcomment_comment_id` = `enwiki`.`comment`.`comment_id`) and ((`enwiki`.`revision`.`rev_deleted` & 2) = 0)) limit 1) or exists(select 1 from `enwiki`.`logging` where ((`enwiki`.`logging`.`log_comment_id` = `enwiki`.`comment`.`comment_id`) and ((`enwiki`.`logging`.`log_deleted` & 2) = 0) and (`enwiki`.`logging`.`log_type` in ('abusefilter','articlefeedbackv5','block','campus','close','contentmodel','course','create','delete','eparticle','gather','gblblock','gblrename','gblrights','globalauth','gwtoolset','import','institution','instructor','interwiki','liquidthreads','lock','managetags','massmessage','merge','moodbar','move','mwoauthconsumer','newsletter','newusers','notifytranslators','online','pagelang','pagetranslation','pagetriage-curation','pagetriage-deletion','patrol','protect','renameuser','review','rights','spamblacklist','stable','student','tag','thanks','timedmediahandler','translationreview','upload','usermerge'))) limit 1))
Zhuyifei1999 (talkcontribs)

> How did you get the explain output?

You just have to try a few times, each time refreshing the page.. Hacky... I know.

BDavis (WMF) (talkcontribs)
RoySmith (talkcontribs)

Hmm. I actually tried that, but apparently with too small a value of "few" :-(

The view thing is ugly, but I get it. Is there any way to cancel a running query? This is still running, but I really don't want to trash the DB over this.

Maybe I should just back up and ask what I'm really trying to do. I want to find all the revisions which have a certain string in the edit comment. I'm assuming I want something long the lines of:


select rev_id from revision, comment

where rev_comment_id = comment_id

and comment_text like '%my certain string%';


That's obviously an expensive query, but I don't have any feel for just how long it would take? Are we talking minutes, hours, days? Will I do bad things to the DB if I try to run that?

BDavis (WMF) (talkcontribs)

Using the comment_revision view instead of the comment view will help a bit, but comment_text like '%my certain string%'; will be really slow and expensive no matter what. Any LIKE match with a leading wildcard (%) will require a full table scan. See 8.3.1 How MySQL Uses Indexes in the MySQL documentation for a bit more detail, but fundamentally MySQL/MariaDB do not support indexes for prefix wildcards.