Wikidata - Wikisource Integration Modules

From mediawiki.org

Overview and Background

This documentation helps you deploy a set of MediaWiki modules and configure a bot so that your Wikisource can retrieve a book's metadata from existing data on Wikidata and display it on its index pages. For example, this index page on Punjabi Wikisource displays the title, author, translator, publisher, address, and year taken from its Wikidata item. The pages involved and their functions are as follows:

Modules
A significant part of the modules was written by Tpt from French Wikisource, with further improvements by Bodhisattwa from Bengali Wikisource and Tshrinivasan as part of the WikiCite Project Grant.
Bot
  • User:WD-WS Integration Bot: The modules above retrieve data from Wikidata and display it on index pages, but only once the corresponding Wikidata QID has been entered in the index page form, which otherwise has to be done manually. The bot partly automates this step: starting from an index page, it traces the book's main page and then the Wikidata item linked to that page.
The bot has been programmed by Tshrinivasan as part of the WikiCite Project Grant.

Implementation

Instructions video on using the documentation.

Note: Please post a message on the talk page if you need help with deploying the modules on your Wikisource.

Modules

You will need PLACEHOLDER rights on your wiki to deploy the modules.

Step 1: Proofreadpage index data config

TODO:

{
    "Type": {
        "type": "string",
        "size": 1,
        "default": "book",
        "label": "Type",
        "header": true,
        "values": {
            "book": "Book",
            "journal": "Journal",
            "collection": "Collection",
            "phdthesis": "Phdthesis",
            "dictionary": "Dictionary",
            "film": "Film",
            "audio": "Audio"
        },
        "help": "Select the type of the book",
        "data": "type"
    },
    "wikidata_item": {
        "type": "wikibase-itemid",
        "size": 1,
        "default": "",
        "label": "Wikidata Item",
        "header": true,
        "data": "wikibase-itemid"
    },

Step 2: Proofreadpage index template

TODO: Please copy only the highlighted line of code from the drop-down below and add it to your Wikisource's "MediaWiki:Proofreadpage index template" page; the URL is langcode.wikisource.org/wiki/MediaWiki:Proofreadpage_index_template.
For example, https://bn.wikisource.org/wiki/মিডিয়াউইকি:Proofreadpage_index_template.

{{#invoke:Index template|indexTemplate
|type={{{Type|}}}
|wikidata_item={{{wikidata_item|}}}
|title={{{Title|}}}
|subtitle={{{Subtitle|}}}
|volume={{{Volume|}}}
|edition={{{Edition|}}}
|author={{{Author|}}}
|translator={{{Translator|}}}
|editor={{{Editor|}}}
|illustrator={{{Illustrator|}}}
|publisher={{{Publisher|}}}
|address={{{Address|}}}
|printer={{{Printer|}}}
|year={{{Year|}}}
|source={{{Source|}}}
|image={{{Image|}}}
|progress={{{Progress|}}}
|pages={{{Pages|}}}
|volumes={{{Volumes|}}}
|remarks={{{Remarks|}}}
|notes={{{Notes|}}}
}}

Step 3: Index data

TODO:

  • Please copy the entire code from the drop down below.
  • Please create a new page "Module:Index data" on your Wikisource, the URL would be langcode.wikisource.org/wiki/Module:Index_data.
    For example, https://pa.wikisource.org/wiki/Module:Index_data.
  • Please pay attention to highlighted text, and the comment above the line for instructions regarding translation and customization.
local wikidataTypeToIndexType = {
	['Q3331189'] = 'book',
	['Q1238720'] = 'journal',
	['Q28869365'] = 'journal',
	['Q191067'] = 'journal',
	['Q23622'] = 'dictionary',
	['Q187685'] = 'phdthesis'
}

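-- Maps index form fields to the Wikidata properties used to fill them.
local indexToWikidata = {
    ['subtitle'] = 'P1680',     -- subtitle
    ['volume'] = 'P478',        -- volume
    ['edition'] = 'P393',       -- edition number
    ['author'] = 'P50',         -- author
    ['translator'] = 'P655',    -- translator
    ['editor'] = 'P98',         -- editor
    ['illustrator'] = 'P110',   -- illustrator
    ['publisher'] = 'P123',     -- publisher
    ['printer'] = 'P872',       -- printed by
    ['address'] = 'P291',       -- place of publication
    ['publishedin'] = 'P1433',  -- published in
    ['year'] = 'P577',          -- publication date
    ['parts'] = 'P527',         -- has part(s)
}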

local function indexDataWithWikidata(frame)
	local args = {}
	for k,v in pairs(frame.args) do
		if v ~= '' then
			args[k] = v
		end
	end
	
	local item = nil
	if args.wikidata_item then
		item = mw.wikibase.getEntity(args.wikidata_item)
		if item == nil then
			mw.addWarning('The Wikidata item ID [[d:' .. args.wikidata_item .. '|' .. args.wikidata_item .. ']] entered in the "Wikidata Item" field of the index page does not seem to be valid.')
		end
	end
	if not item then
		return {
			['args'] = args,
			['item'] = nil
		}
	end

	if not args.type then
		for _, statement in pairs(item:getBestStatements('P31')) do
			if statement.mainsnak.datavalue ~= nil then
				-- For item-valued statements the QID is in the "id" field of the value table
				local typeId = statement.mainsnak.datavalue.value.id
				if wikidataTypeToIndexType[typeId] then
					args.type = wikidataTypeToIndexType[typeId]
				end
			end
		end
	end
	
	if not args.image then
		for _, statement in pairs(item:getBestStatements('P18')) do
			-- datavalue is absent for "unknown value" and "no value" statements
			if statement.mainsnak.datavalue ~= nil then
				args.image = statement.mainsnak.datavalue.value
			end
		end
	end
	
	if not args.title then
		local value = item:formatStatements('P1476')['value'] or ''
		if value == '' then
			value = item:getLabel() or ''
		end
		if value ~= '' then
			local siteLink =  item:getSitelink()
			if siteLink then
				value = '[[' .. siteLink .. '|' .. value .. ']]'
			end
--Please translate the text "View and edit data on Wikidata" into your language.
			args.title = value .. ' [[File:OOjs UI icon edit-ltr.svg|View and edit data on Wikidata|10px|baseline|class=noviewer|link=d:' .. item.id .. '#P1476]]'
		end
	end

	if not args.year then
		for _, statement in pairs(item:getBestStatements('P577')) do
			if statement.mainsnak.datavalue ~= nil then
				-- The time value looks like "+1925-00-00T00:00:00Z"; characters 2 to 5 hold a four-digit year
				local current_year = statement.mainsnak.datavalue.value.time
				args['year'] = mw.ustring.sub(current_year, 2, 5)
			end
		end
	end

	for arg, propertyId in pairs(indexToWikidata) do
		if not args[arg] then
			local value = item:formatStatements(propertyId)["value"]
			if value ~= '' then
				args[arg] = value 
			end
		end
	end
	

	return {
		['args'] = args,
		['item'] = item
	}
end

local p = {}
 
function p.indexDataWithWikidata(frame)
    return indexDataWithWikidata(frame)
end
 
return p
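
The precedence rules in Module:Index data (a value typed into the index form wins, and Wikidata only fills fields left empty) can be tried outside MediaWiki with a small stand-alone sketch. It is only an illustration under stated assumptions: `dropEmpty`, `typeFromStatements`, `yearFromTime` and the mocked statement tables are hypothetical names that do not exist in the module, and the year is read with a pattern match rather than the fixed substring the module uses.

```lua
-- Stand-alone sketch for a plain Lua 5.1 interpreter; no MediaWiki libraries are used.
-- dropEmpty, typeFromStatements, yearFromTime and the sample data are hypothetical.

local wikidataTypeToIndexType = { ['Q3331189'] = 'book', ['Q23622'] = 'dictionary' }

-- Form fields left blank arrive as '', so they are dropped to let Wikidata fill them.
local function dropEmpty(rawArgs)
	local args = {}
	for key, value in pairs(rawArgs) do
		if value ~= '' then
			args[key] = value
		end
	end
	return args
end

-- Item-valued statements (such as P31) carry the QID in datavalue.value.id.
local function typeFromStatements(statements)
	local indexType = nil
	for _, statement in ipairs(statements) do
		local datavalue = statement.mainsnak.datavalue
		if datavalue ~= nil and wikidataTypeToIndexType[datavalue.value.id] then
			indexType = wikidataTypeToIndexType[datavalue.value.id]
		end
	end
	return indexType
end

-- Time values look like "+1925-00-00T00:00:00Z"; capture the digits before the first hyphen.
local function yearFromTime(time)
	return string.match(time, '^[+-](%d+)%-')
end

-- Mocked statements, shaped like the tables returned by item:getBestStatements().
local p31 = {
	{ mainsnak = { snaktype = 'value', datavalue = { type = 'wikibase-entityid', value = { id = 'Q3331189' } } } },
	{ mainsnak = { snaktype = 'somevalue' } },  -- no datavalue at all
}

local args = dropEmpty({ title = 'Title typed into the form', type = '', year = '' })
args.type = args.type or typeFromStatements(p31)
args.year = args.year or yearFromTime('+1925-00-00T00:00:00Z')
args.title = args.title or 'Title from Wikidata'

print(args.title, args.type, args.year)
```

In this run the title typed into the form is kept, while the type (`book`) and the year (`1925`) come from the mocked Wikidata values because the form left them empty.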

Step 4: Index template

TODO:

  • Please copy the entire code from the drop down below
  • Please create a new page "Module:Index template" on your Wikisource, the URL would be langcode.wikisource.org/wiki/Module:Index_template.
    For example, https://pa.wikisource.org/wiki/Module:Index_template.
  • Please pay attention to highlighted text and the comment above the line for instructions regarding translation and customization.
  • Please create required categories after the modules are deployed, if they do not already exist.
function withWikidataLink(wikitext, category)
	if wikitext == nil then
		return nil
	end
	local new_wikitext = mw.ustring.gsub(wikitext, '%[%[([^|%]]*)%]%]', function(page)
		return addWikidataToLink(page, mw.ustring.gsub(page, '%.*/', '') , category)
	end)
	if new_wikitext ~= wikitext then
		return new_wikitext
	end
	return mw.ustring.gsub(wikitext, '%[%[([^|]*)|([^|%]]*)%]%]', function(page, link)
		return addWikidataToLink(page, link, category)
	end)
end

function addWikidataToLink(page, label, category)
    local title = mw.title.new( page )
    if title == nil then
    	return '[[' .. page .. '|' .. label .. ']]'
    end
    if title.isRedirect then
        title = title.redirectTarget
    end

    local tag = mw.html.create('span')
    local itemId = mw.wikibase.getEntityIdForTitle(title.fullText)
	tag:wikitext('[[' .. page .. '|' .. label .. ']]')
    if itemId ~= nil then
    	--Translate "View information on Wikidata"
    	tag:wikitext(' [[Image:Wikidata.svg|10px|link=d:' .. itemId .. '|View information on Wikidata]]')
    	if category ~= nil then
    		tag:wikitext('[[Category:' .. category .. ']]')
    	end
    end
    return tostring(tag)
end

function addRow(metadataTable, key, value)
	if value then
		metadataTable:tag('tr')
			:tag('th')
				:attr('scope', 'row')
				:css('vertical-align', 'top')
				:wikitext(key)
				:done()
			:tag('td'):wikitext(value)
	end
end

function splitFileNameInFileAndPage(title)
    local slashPosition = string.find(title.text, "/")
    if slashPosition == nil then
    	return title.text,nil
    else
    	return string.sub(title.text, 1, slashPosition - 1), string.sub(title.text, slashPosition + 1)
    end
end

function indexTemplate(frame)
	local data = (require 'Module:Index_data').indexDataWithWikidata(frame)
	local args = data.args
	local item = data.item
	
	local page = mw.title.getCurrentTitle()
	local html = mw.html.create()
	
	--Translate "Books with a Wikidata ID" and "Books without a Wikidata ID"
	if item then
		html:wikitext('[[Category:Books with a Wikidata ID]]<indicator name="wikidata">[[File:Wikidata.svg|20px|Wikidata item|link=d:' .. item.id .. ']]</indicator>')
	else
		html:wikitext('[[Category:Books without a Wikidata ID]]')
	end

    local left = html:tag('div')
    if args.remarks or args.notes then
    	left:css('width', '53%')
    end
    left:css('float', 'left')
    if args.image then
        local imageContainer = left:tag('div')
            :css({
                float = 'left',
                overflow = 'hidden',
                border = 'thin grey solid'
            })
        local imageTitle = nil
        if tonumber(args.image) ~= nil then
            imageTitle = mw.title.getCurrentTitle():subPageTitle(args.image)
        else
            imageTitle = mw.title.new(args.image, "Media")
        end
        if imageTitle == nil then
            imageContainer:wikitext(args.image)
        else
            local imageName, imagePage = splitFileNameInFileAndPage(imageTitle)
            if imagePage ~= nil then
	            imageContainer:wikitext('[[File:' .. imageName .. '|page=' .. imagePage .. '|250px]]')
	        else
	            imageContainer:wikitext('[[File:' .. imageName .. '|250px]]')
	        end
        end
    end
    local metadataContainer = left:tag('div')
    if args.image then
    	metadataContainer:css('margin-left', '150px')
    end
    local metadataTable = metadataContainer:tag('table')

    if args.title then
       if item then
    		addRow(metadataTable, 'Title', withWikidataLink(args.title))
		else 
    		addRow(metadataTable, 'Title', '[[' .. args.title .. ']]')
    	end
    else
    	--Translate "You must enter the title field of the form"
    	mw.addWarning('You must enter the title field of the form.')
    end

	addRow(metadataTable, 'Subtitle', withWikidataLink(args.subtitle))

	--Translate "Books with volume" and "Books without volume"
    if args.volume then
        addRow(metadataTable, 'Volume', '[[' .. args.volume .. ']]' )
        html:wikitext('[[Category:Books with volume]]')
    else 
        html:wikitext('[[Category:Books without volume]]')
    end

	--Translate "Books with edition" and "Books without edition"
    if args.edition then
        addRow(metadataTable, 'Edition', '[[' .. args.edition .. ']]')
        html:wikitext('[[Category:Books with edition]]')
    else 
        html:wikitext('[[Category:Books without edition]]')
    end

	--Translate "Books with author", "Books without author", and "Books by"
	if args.author then
		if item then
			addRow(metadataTable, 'Author', withWikidataLink(args.author))
			html:wikitext('[[Category:Books with author]]')
			-- P50 is the Wikidata property "author"
			local authors = item:formatPropertyValues('P50', { mw.wikibase.entity.claimRanks.RANK_NORMAL })['value']
			for author in string.gmatch(authors, '([^,]+)') do
				-- Trim the space that follows each comma in the formatted list
				html:wikitext('[[Category:Books by ' .. mw.text.trim(author) .. ']]')
			end
		else
			addRow(metadataTable, 'Author', '{{Al|' .. args.author .. '}}')
		end
	else
		html:wikitext('[[Category:Books without author]]')
	end

	--Translate "Books with translator" and "Books without translator"
	if args.translator then
		if item then
			addRow(metadataTable, 'Translator', withWikidataLink(args.translator))
			html:wikitext('[[Category:Books with translator]]')
		else
			addRow(metadataTable, 'Translator', '{{Al|' .. args.translator .. '}}')
		end
	else
		html:wikitext('[[Category:Books without translator]]')
	end

	--Translate "Books with editor" and "Books without editor"
	if args.editor then
		if item then
			addRow(metadataTable, 'Editor', withWikidataLink(args.editor))
			html:wikitext('[[Category:Books with editor]]')
		else
			addRow(metadataTable, 'Editor', '{{Al|' .. args.editor .. '}}')
		end
	else
		html:wikitext('[[Category:Books without editor]]')
	end
    
    --Translate "Books with illustrator" and "Books without illustrator"
    if args.illustrator then
        addRow(metadataTable, 'Illustrator', withWikidataLink(args.illustrator))
        html:wikitext('[[Category:Books with illustrator]]')
    else 
        html:wikitext('[[Category:Books without illustrator]]')
    end

	--Translate "Books with publisher" and "Books without publisher"
	if args.publisher then
		addRow(metadataTable, 'Publisher', withWikidataLink(args.publisher))
		html:wikitext('[[Category:Books with publisher]]')
	else
		html:wikitext('[[Category:Books without publisher]]')
	end

	--Translate "Books with place of publication" and "Books without place of publication"
    if args.address then
        addRow(metadataTable, 'Address', withWikidataLink(args.address))
        html:wikitext('[[Category:Books with Place of Publication]]')
    else 
        if args.publishedin then
            addRow(metadataTable, 'Published In', withWikidataLink(args.publishedin))
            html:wikitext('[[Category:Books with Place of Publication]]')
          else 
            html:wikitext('[[Category:Books without Place of Publication]]')
        end
    end

	--Translate "Books with year", "Books without year", and "Books published in"
    if args.year then
        addRow(metadataTable, 'Year', withWikidataLink(args.year))
        html:wikitext('[[Category:Books with year]]')
        html:wikitext('[[Category:Books published in ' ..args.year..']]')

    else 
        html:wikitext('[[Category:Books without year]]')
    end

	--Translate "Books with printer" and "Books without printer"
    if args.printer then
        addRow(metadataTable, 'Printer', withWikidataLink(args.printer))
        html:wikitext('[[Category:Books with printer]]')
    else 
        html:wikitext('[[Category:Books without printer]]')
    end


    if args.source == 'djvu' or args.source == 'pdf' then
		addRow(metadataTable, 'Source', '[[:File:' .. mw.title.getCurrentTitle().text .. '|' .. args.source .. ']]')

        --Replace "bn.wikisource.org" and the language code "bn" in the query below with your own Wikisource domain and language code.
        local query = 'SELECT ?item ?itemLabel ?pages ?page WHERE {\n  ?item wdt:P996 <http://commons.wikimedia.org/wiki/Special:FilePath/' .. mw.uri.encode(mw.title.getCurrentTitle().text, 'PATH') .. '> .\n  OPTIONAL { ?page schema:about ?item ; schema:isPartOf <https://bn.wikisource.org/> . }\n  OPTIONAL { ?item wdt:P304 ?pages . }\n  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],bn".\n}}'
		--Translate "Wikidata items"
        html:wikitext('<indicator name="index-scan-wikidata">[[File:Wikidata Query Service Favicon.svg|20px|Wikidata items|link=https://query.wikidata.org/embed.html#' .. mw.uri.encode(query, 'PATH') .. ']]</indicator>')
	else
		addRow(metadataTable, 'Source', args.source)
    end
	--Replace the following proofread status categories with the ones currently used on your Wikisource.
	if args.progress == 'T' then
		addRow(metadataTable, 'Progress', '[[Category:Completed Books]] [[:Category:Completed Books | Completed]]')
	elseif args.progress == 'V' then
		addRow(metadataTable, 'Progress', '[[category:Books to validate]] [[:Category:Books to validate | To validate]]')
	elseif args.progress == 'C' then
		addRow(metadataTable, 'Progress', '[[category:Books to correct]] [[:category:Books to correct | To correct]]')
	elseif args.progress == 'OCR' then
		addRow(metadataTable, 'Progress', '[[category:Books without a text layer]] [[:category:Books without a text layer | Add an OCR text layer]]')
	elseif args.progress == 'L' then
		addRow(metadataTable, 'Progress', '[[category:Books to repair]] <span style="color: #FF0000;"> [[:Category:Books to repair|Defective source file]]</span>')
	elseif args.progress == 'X' then
		addRow(metadataTable, 'Progress', '[[category:Extracts and compilations]] [[:category:Extracts and compilations | Incomplete source:extract or compilation]]')
	else
		addRow(metadataTable, 'Progress', '[[Category:Unknown progress books]] [[:category:Unknown progress books | Unknown progress]]')
	end
	addRow(metadataTable, 'Series', args.volumes)
	
	if args.pages then
		left:tag('div'):css('clear', 'both')
		left:tag('h3'):wikitext('Pages')
		left:tag('div'):attr('id', 'pagelist'):css({
			background = '#F0F0F0',
			['padding-left'] = '0.5em',
			['text-align'] = 'justify'
		}):newline():wikitext(args.pages):newline()
	else
		mw.addWarning('You must enter the pagination of the facsimile (Pages field) ')
	end

	if args.remarks or args.notes then
		local right = html:tag('div'):css({
			width = '44%',
			['padding-left'] = '1em',
			float = 'right'
		})
		if args.remarks then
			right:tag('div'):attr('id', 'remarks'):wikitext(args.remarks)
		end
		if args.notes then
			right:tag('hr'):css({
				['margin-top'] = '1em',
				['margin-bottom'] = '1em'
			})
			right:tag('div'):attr('id', 'notes'):wikitext(args.notes)
		end
	end
	
	--Please translate or replace the following type categories.
	if args.type == 'book' then
		html:wikitext('[[Category:Index - Books]] ')
	elseif args.type == 'journal' then
		html:wikitext('[[Category:Index - Periodicals]] ')
	elseif args.type == 'collection' then
		html:wikitext('[[Category:Index - Collections]] ')
	elseif args.type == 'dictionary' then
		html:wikitext('[[Category:Index - Dictionaries]] ')
	elseif args.type == 'phdthesis' then
		html:wikitext('[[Category:Index - Theses]] ')
	end
	html:wikitext('[[Category:Index]] ')

	if args.source ~= 'djvu' then
		html:wikitext('[[Category:Non djvu book]] ')
		-- Format-specific categories for non-DjVu sources
		if args.source == 'pdf' then
			html:wikitext('[[Category:PDF book]]')
		elseif args.source == 'ogg' then
			html:wikitext('[[Category:OGG file]]')
		elseif args.source == 'webm' then
			html:wikitext('[[Category:webm file]]')
		end
	end
	if not args.remarks then
		html: wikitext ('[[Category:Indexed pages]]') 
	end

	return tostring(html)
end

local p = {}
 
function p.indexTemplate( frame )
    return indexTemplate( frame )
end
 
return p
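
The two-pass link rewriting done by `withWikidataLink` can be explored outside MediaWiki with the sketch below. It is a simplified illustration, not the module's code: `decorate` is a hypothetical stand-in for `addWikidataToLink` that only marks what would be rewritten, and plain `string.gsub` replaces `mw.ustring.gsub`, so it behaves the same only for the ASCII sample text used here.

```lua
-- Stand-alone sketch for a plain Lua 5.1 interpreter.
-- decorate() is a hypothetical stand-in for addWikidataToLink().
local function decorate(page, label)
	return '<' .. page .. '|' .. label .. '>'
end

local function rewriteLinks(wikitext)
	-- First pass: links without a label, such as [[Author:A]]
	local rewritten = string.gsub(wikitext, '%[%[([^|%]]*)%]%]', function(page)
		return decorate(page, page)
	end)
	if rewritten ~= wikitext then
		return rewritten
	end
	-- Second pass, reached only if the first changed nothing: piped links [[Page|label]]
	-- The outer parentheses discard the match count that gsub also returns.
	return (string.gsub(wikitext, '%[%[([^|]*)|([^|%]]*)%]%]', decorate))
end

print(rewriteLinks('[[Author:A]], [[Author:B]]'))
print(rewriteLinks('[[Author:A|A. Writer]]'))
print(rewriteLinks('[[A]] and [[B|b]]'))
```

The last call shows a consequence of the early return: when a value mixes both link styles, the unpiped link is rewritten and the piped link is left as it was.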

Bot

Please contact the bot operator to get the bot running on your Wikisource. Have the category "Category:Books without a Wikidata ID" ready before you get in touch; the modules place index pages that have no Wikidata item in it.