Topic on User talk:Matěj Suchánek

ZI Jony (talkcontribs)

To find pages that don't have a Bengali label at Wikidata but have a bnwiki sitelink.

USE wikidatawiki_p; 
SELECT CONCAT("Q", ips_item_id), "Lbn", CONCAT('"', ips_site_page, '"')
FROM wb_items_per_site
WHERE ips_site_id='bnwiki' AND NOT EXISTS (
  SELECT 1 FROM wb_terms WHERE CONCAT("Q", ips_item_id) = term_full_entity_id AND term_type = "label" AND term_language='bn' LIMIT 1
)
LIMIT 500
;

Now I tried to do it with this, but unfortunately Quarry killed the query:

USE wikidatawiki_p;
SELECT CONCAT("Q", ips_item_id), "Lbn", CONCAT('"', ips_site_page, '"')
FROM wb_items_per_site
WHERE ips_site_id = 'bnwiki'
AND ips_item_id NOT IN (
  SELECT wbit_item_id
  FROM wbt_item_terms
  INNER JOIN wbt_term_in_lang tt
    ON wbit_term_in_lang_id = tt.wbtl_id
  INNER JOIN wbt_text_in_lang dd
    ON tt.wbtl_text_in_lang_id = dd.wbxl_id
  INNER JOIN wbt_text yy
    ON dd.wbxl_text_id = yy.wbx_id
  WHERE wbit_item_id = ips_item_id
  AND dd.wbxl_language = "bn"
  AND tt.wbtl_type_id = 1
)
LIMIT 500
;
Matěj Suchánek (talkcontribs)

a) Either do:

AND NOT EXISTS (
  SELECT wbit_item_id
  FROM wbt_item_terms
  ...
)

or remove wbit_item_id = ips_item_id

b) Remove

INNER JOIN wbt_text  yy
    ON dd.wbxl_text_id = yy.wbx_id

since you don't use the actual text.

ZI Jony (talkcontribs)

Thanks for the response, but it still gets killed the same way. Could you please rewrite the full code for me?

Matěj Suchánek (talkcontribs)

I suggested two different queries. This is the first one:

USE wikidatawiki_p;
SELECT CONCAT("Q",ips_item_id), "Lbn", CONCAT('"',ips_site_page,'"')
FROM wb_items_per_site 
WHERE ips_site_id='bnwiki' 
AND NOT EXISTS (
  SELECT wbit_item_id FROM wbt_item_terms
  INNER JOIN wbt_term_in_lang
    ON wbtl_id = wbit_term_in_lang_id
  INNER JOIN wbt_text_in_lang
    ON wbxl_id = wbtl_text_in_lang_id
  WHERE wbit_item_id = ips_item_id
  AND wbxl_language = "bn"
  AND wbtl_type_id = 1
)

(LIMIT is up to you.) For the second one, https://tools.wmflabs.org/sql-optimizer reports the same plan, so probably not worth trying.
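For reference, the second variant (keeping NOT IN, but removing the correlated `wbit_item_id = ips_item_id` condition per suggestion a, and the unused wbt_text join per suggestion b) would look roughly like this sketch:

```sql
USE wikidatawiki_p;
SELECT CONCAT("Q", ips_item_id), "Lbn", CONCAT('"', ips_site_page, '"')
FROM wb_items_per_site
WHERE ips_site_id = 'bnwiki'
-- uncorrelated subquery: build the set of items that DO have a Bengali label
AND ips_item_id NOT IN (
  SELECT wbit_item_id
  FROM wbt_item_terms
  INNER JOIN wbt_term_in_lang
    ON wbtl_id = wbit_term_in_lang_id
  INNER JOIN wbt_text_in_lang
    ON wbxl_id = wbtl_text_in_lang_id
  WHERE wbxl_language = "bn"
  AND wbtl_type_id = 1   -- 1 = label in the wbt_* term store
)
LIMIT 500;
```

As noted above, the optimizer reportedly produces the same plan for both variants, so this is shown only for completeness.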

My bot performs this task each week for every wiki. It scans the recentchanges table of Wikidata for newly added sitelinks and then checks only the relevant items. So this is an idea for narrowing the query if you don't manage to get any results.
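A narrowed query along those lines might look like the following sketch. It assumes the standard MediaWiki recentchanges schema on wikidatawiki, where rc_title holds the item ID (e.g. "Q42") in namespace 0 and rc_timestamp is a YYYYMMDDHHMMSS string; the seven-day window is an arbitrary example, and the sketch is untested:

```sql
USE wikidatawiki_p;
SELECT CONCAT("Q", ips_item_id), "Lbn", CONCAT('"', ips_site_page, '"')
FROM wb_items_per_site
-- only look at items edited on Wikidata within the last week
JOIN recentchanges
  ON rc_namespace = 0
  AND rc_title = CONCAT("Q", ips_item_id)
WHERE ips_site_id = 'bnwiki'
AND rc_timestamp > DATE_FORMAT(NOW() - INTERVAL 7 DAY, '%Y%m%d%H%i%s')
AND NOT EXISTS (
  SELECT wbit_item_id
  FROM wbt_item_terms
  INNER JOIN wbt_term_in_lang
    ON wbtl_id = wbit_term_in_lang_id
  INNER JOIN wbt_text_in_lang
    ON wbxl_id = wbtl_text_in_lang_id
  WHERE wbit_item_id = ips_item_id
  AND wbxl_language = "bn"
  AND wbtl_type_id = 1
);
```

This trades completeness for speed: it only catches items touched recently, which matches the bot's incremental approach rather than a full scan.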

ZI Jony (talkcontribs)

Thanks, with a maximum of 500 it works.