Extension:Infobox Data Capture

Infobox Data Capture

Release status: unknown

Description Tags to enable capture of typed data in infoboxes.
Author(s) sortaSean, eParka
Last version 0.2 (2007.April.10)
MediaWiki 1.9.3
License No license specified
Download Code on page.
Example Code on page.

What can this extension do

Enable typed data storage in a wiki.

Its primary use is capturing the typed data passed to infobox templates, but the parser functions can be called from any wikitext.

Usage

Creates a parser function called #dataentry with two arguments:

  • Title: The name of the block of data to be stored. (Usually the template name)
  • Key-value pairs: A '|'-delimited list of entries; within each entry the key, the value and the optional fields are separated by ';'.
    • validTag (Optional): Following each key-value pair, a number: 1=valid, 0=invalid. Non-integer and negative values are treated as a comment, and the entry is then stored as valid (1). All entries are stored, regardless of the value of the valid tag.
    • comment (Optional): Following the valid tag, a comment on the valid tag.
    • Example:
|Key1;Value1;1;Value is valid
|Key2;Value2;0;Value is invalid

Data Entry tag example:

{{#dataentry:Data block title
|Key1;Value1;1;Value is valid
|Key2;Value2;0;Value is invalid
}}
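
The valid-tag rules above can be illustrated with a small standalone sketch. It is not the extension's code: `parseEntry` is a hypothetical helper written for this page, and it assumes that a valid tag counts only when it consists of digits, and that a supplied comment is kept when the valid tag is rejected.

```php
<?php
// Hypothetical helper, not part of the extension: splits one #dataentry
// entry of the form "key;value;validTag;comment" into its four fields.
function parseEntry(string $entry): array
{
    $parts   = array_map('trim', explode(';', $entry, 4));
    $key     = $parts[0];
    $value   = $parts[1] ?? '';
    $valid   = $parts[2] ?? '1';   // assumption: a missing valid tag means valid
    $comment = $parts[3] ?? '';

    // A valid tag must be a non-negative integer; anything else is treated
    // as a comment and the entry falls back to valid (1).
    if (!preg_match('/^\d+$/', $valid)) {
        if ($comment === '') {
            $comment = $valid;
        }
        $valid = '1';
    }

    return [
        'key'     => $key,
        'value'   => $value,
        'valid'   => (int) $valid,
        'comment' => $comment,
    ];
}

print_r(parseEntry('Key1;Value1;1;Value is valid'));
print_r(parseEntry('Key3;Value3;not checked yet'));
```

The second call has no numeric valid tag, so its third field ends up in `comment` and the entry is marked valid.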


Handling Lists

Lists of values are handled with the #listsplit parser function. Unlike template parameters, the infobox data capture accepts multiple entries with the same key, so #listsplit turns a delimited list passed to a template into one key-value entry per list item.

The #listsplit function takes 5 arguments:

  • key - The attribute name for the value list
  • list - The list of values separated by the separator
  • separator - (Optional) The separator used in the list, defaults to ",".
  • validTag - (Optional) The valid tag to be used for the entire list, defaults to 1.
  • comment - (Optional) The comment to be used for the entire list, defaults to "".
{{#listsplit:key|list|separator|validTag|comment}}

NOTE: #listsplit is only for use within the #dataentry parser function. It generates its own opening '|' for each item, so place it directly inside #dataentry without a leading '|'; an empty list adds no records. See the example below.

Examples

The following:

{{#listsplit:synonyms|function;foo;foobar;method|;|1|All valid names for a function}}

Produces:

|synonyms;function;1;All valid names for a function
|synonyms;foo;1;All valid names for a function
|synonyms;foobar;1;All valid names for a function
|synonyms;method;1;All valid names for a function

The following example creates 7 records in the database for the "function" data block: one description record, four valid synonym records and two invalid synonym records. It displays nothing on the page:

{{#dataentry:function
|description;A block of code...;2;a simplistic view
{{#listsplit:synonym|function;foo;foobar;method|;|1|All valid names for a function}}
{{#listsplit:synonym|attribute;variable|;|0|Not functions}}
}}
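
The record count in this example can be checked with a standalone sketch. `splitList` is a hypothetical stand-in for #listsplit that only mirrors the output format shown above; the counting step assumes that every non-empty '|'-separated chunk becomes one record.

```php
<?php
// Hypothetical stand-in for #listsplit: emits one
// "|key;item;validTag;comment" line per list item.
function splitList(string $key, string $list, string $separator = ',', int $valid = 1, string $comment = ''): string
{
    $out = '';
    foreach (explode($separator, $list) as $item) {
        $out .= "|$key;$item;$valid;$comment\n";
    }
    return $out;
}

// Assemble the body of the #dataentry call from the example.
$body = "|description;A block of code...;2;a simplistic view\n"
      . splitList('synonym', 'function;foo;foobar;method', ';', 1, 'All valid names for a function')
      . splitList('synonym', 'attribute;variable', ';', 0, 'Not functions');

// Split on '|' and drop the empty chunk in front of the first pipe.
$records = array_values(array_filter(array_map('trim', explode('|', $body)), 'strlen'));

echo count($records), "\n"; // 7
```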

Installation

Create the database table `infoboxdata`, save the code as extensions/InfoboxData/InfoboxData.php and include that file from LocalSettings.php.

Database Table Addition

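Requires one table called {{{tag}}}_infoboxdata, where {{{tag}}}_ stands for your wiki's database table prefix and {{{Schema}}} for the database name.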

CREATE TABLE  {{{Schema}}}.`{{{tag}}}_infoboxdata` (
  `ib_from` int(8) unsigned NOT NULL default '0',
  `ib_datablock_order` int(11) NOT NULL default '7',
  `ib_datablock_name` varbinary(255) NOT NULL default '',
  `ib_attribute_order` int(11) NOT NULL default '7',
  `ib_attribute` varbinary(255) NOT NULL default '',
  `ib_value` blob,
  `ib_isvalid` int(1) unsigned NOT NULL default '1',
  `ib_comment` blob,
  KEY `ib_from` (`ib_from`,`ib_datablock_order`,`ib_datablock_name`,`ib_attribute`),
  KEY `ib_datablock_name` (`ib_datablock_name`,`ib_from`)
) ENGINE=MyISAM DEFAULT CHARSET=binary;
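
Once pages have been saved, the captured data can be read back from this table. The following is a minimal sketch rather than part of the extension: `infoboxDataQuery` is a hypothetical helper, and the prefix `wiki_` is only an example value for {{{tag}}}_. It builds a parameterised query for the valid attributes of one data block, which can be served by the `ib_datablock_name` index.

```php
<?php
// Hypothetical helper: builds a parameterised SELECT for the valid
// attributes of one data block. $tablePrefix stands in for {{{tag}}}_.
function infoboxDataQuery(string $tablePrefix, string $blockName): array
{
    // Quote the table name as a MySQL identifier.
    $table = '`' . str_replace('`', '``', $tablePrefix . 'infoboxdata') . '`';

    $sql = "SELECT ib_from, ib_attribute, ib_value, ib_comment"
         . " FROM $table"
         . " WHERE ib_datablock_name = ? AND ib_isvalid = 1"
         . " ORDER BY ib_from, ib_datablock_order, ib_attribute_order";

    // The block name is returned separately so it can be bound as a parameter.
    return [$sql, [$blockName]];
}

[$sql, $params] = infoboxDataQuery('wiki_', 'function');
echo $sql, "\n";
```

The returned SQL and parameter list can be passed to any prepared-statement API, for example PDO's `prepare()` and `execute()`.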

Changes to LocalSettings.php

Add the following line:

require_once("$IP/extensions/InfoboxData/InfoboxData.php");

Code

<?php
 
if ( !defined( 'MEDIAWIKI' ) ) {
        die( 'This file is a MediaWiki extension, it is not a valid entry point' );
}
$wgExtensionCredits['parserhook'][] = array(
        'name' => 'Infobox Data Capture',
        'version' => '0.2', // April 10, 2007.
        'url' => 'http://www.mediawiki.org/wiki/Extension:Infobox_Data_Capture',
        'author' => 'sortaSean, eParka',
        'description' => 'Enable database capture of typed infobox data',
);
 
$wgInfoboxDataCapture = new InfoboxDataCapture();
 
$wgExtensionFunctions[] = 'wfSetupInfoboxDataCapture';
$wgHooks['LanguageGetMagic'][] = 'wfInfoboxDataLanguageGetMagic';
 
//have to place these hooks outside the setup function, otherwise they don't get called
$wgHooks['ArticleSaveComplete'][] = array(&$wgInfoboxDataCapture, 'save');
$wgHooks['ArticleDeleteComplete'][] = array(&$wgInfoboxDataCapture, 'delete');
 
 
function wfSetupInfoboxDataCapture() {
        global $wgParser;
        global $wgInfoboxDataCapture;
 
        # Set function hooks associating the "dataentry" and "listsplit" magic words with our methods
        $wgParser->setFunctionHook( 'dataentry', array(&$wgInfoboxDataCapture, 'dataEntryParser' ));
        $wgParser->setFunctionHook( 'listsplit', array(&$wgInfoboxDataCapture, 'listSplit' ));
}
 
function wfInfoboxDataLanguageGetMagic( &$magicWords, $langCode ) {
        # The first array element is the case-sensitivity flag; 0 means the magic word is not case sensitive
        # All remaining elements are synonyms for our parser function
        switch ( $langCode ) {
                default:
                        $magicWords['dataentry'] = array( 0, 'dataentry' );
                        $magicWords['listsplit'] = array( 0, 'listsplit' );
        }
        # unless we return true, other parser function extensions won't get loaded.
        return true;
}
 
/**
 * InfoboxDataCapture
 * class for handling dataentry tags and uploading data to the database
 * @package MediaWiki
 */
class InfoboxDataCapture {
 
        /**#@+
         * @private
         */
        # Persistent:
        var $mInfoboxData;
 
 
        /**
         * Constructor, initializes the infobox data array
         *
         * @private
         */
        function InfoboxDataCapture() {
                $this->mInfoboxData = array();
        }
 
        /**
         * Receives the infobox data as input. First element is the title, then key-value pairs separated by ";".
         * Use the listsplit parserfunction to separate lists and store them as individual items.
         * WIKI TEXT EXAMPLE:
         * 
         * {{#dataentry:function
         * |description;A block of code...;2;a simplistic view
         * {{#listsplit:synonyms|function;foo;foobar;method|;|1|All valid names for a function}}
         * {{#listsplit:synonyms|attribute;variable|;|0|Not functions}}
         *      }}
         *
         * @param parser        The parser object
         * @param title         The name of the data block, to be stored in the database.
         * @private
         */
        function dataEntryParser ( &$parser, $title) {         
                // only the parser and the title were passed in (no key-value pairs): nothing to store
                if( func_num_args() < 3 )
                        return "";
 
                $argString = func_get_arg(2);
                // since there is a variable number of arguments, join them back together here;
                // some arguments may already be merged if they come from a nested parser function (i.e. listsplit)
                for($i = 3; $i < func_num_args() ;$i++) {
                        $argString .= "|".func_get_arg($i);
                }
                $argString = $parser->mStripState->unstripBoth($argString);
                $args = explode("|",$argString);
 
                $infoboxData = $this->parseInfoboxData( $args );
                $this->addKeyValuePairs($title, $infoboxData);
                return "";
        }
 
        /**
         * A parser function used to separate lists into multiple value insertions.  
         * Unlike templates, the infoboxdata capture works with multiple entries with the same key.  
         * For use within the dataentry parser function.
         * NOTE: Generates its own opening '|' (or not if no values), for use directly inside #dataentry
         * Called via:
         *              {{#listsplit:synonyms|function;foo;foobar;method|;|1|All valid names for a function}}
         *      
         * As is:
         * {{#dataentry:function
         * |description;A block of code...;2;a simplistic view
         * {{#listsplit:synonyms|function;foo;foobar;method|;|1|All valid names for a function}}
         * {{#listsplit:synonyms|attribute;variable|;|0|Not functions}}
         *      }}
         *
         * @param parser        parser object
         * @param key           The attribute name for the value list
         * @param list          A list of values separated by the separator
         * @param separator     The separator used in the list, default ","
         * @param isValid       A valid tag, used for anti-metacrap, default 1
         * @param comment       The comment to be used for the entire list, default ""
         * @private
         */
        function listSplit( &$parser, $key, $list, $separator = ",", $isValid = 1, $comment = "") {
 
                if(!is_numeric($isValid) ||(intval($isValid) != $isValid) || ($isValid < 0)) {
                        if(!$comment) {
                                $comment = $isValid;
                        }
                        $isValid = 1;
                } 
                $valueList = explode($separator, $list);
                $output = "";
                foreach( $valueList as $value ) {
                        $output .= "|$key;$value;$isValid;$comment\n";
                }
                return $output;
        }
 
        /**
         * Adds each data block to the mInfoboxData hash using the title as a key.  
         * Handles multiple blocks of the same title by pushing onto an array
         *
         * @param title         The name of the data block
         * @param args          The array of "InfoboxDatum"s
         * @private
         */
        function addKeyValuePairs( $title, $args) {
                if ( !isset( $this->mInfoboxData[$title] ) ) {
                        $this->mInfoboxData[$title] = array();
                }
 
                if($args) {
                        //FIXME: should create a datablock object that has a name and a data array
                        $datablock = array();
                        $datablock[$title] = $args;
 
                        array_push ($this->mInfoboxData, $datablock);
                }
 
        }
 
        /**
         * Creates the InfoboxDatum objects.  Handles lack of key by using $index.
         *
         * @param args          An array of key value pairs with valid tag and comment info - all ";" separated
         * @private
         */
        function parseInfoboxData( $args ) {
                $infoboxData = array();
                $index = 1;
                foreach( $args as $arg ) {
                        $values = explode(";", $arg); 
                        if(count($values) == 1) {
                                $values[1] = $values[0];
                                $values[0] = $index++;   
                        } elseif (!$values[0]) {
                                $values[0] = $index++;
                        }
                        // skip empty values, but keep a literal "0" (a plain truthiness test would drop it)
                        if(trim($values[1]) !== '') {//call different constructors, FIXME: should be able to call just case 4
                                switch (count($values)) {
                                case 2:
                                        array_push($infoboxData, new InfoboxDatum($values[0], $values[1]));
                                        break;
                                case 3:
                                        array_push($infoboxData, new InfoboxDatum($values[0], $values[1], $values[2]));
                                        break;
                                case 4:
                                        array_push($infoboxData, new InfoboxDatum($values[0], $values[1], $values[2], $values[3]));
                                        break;
                                }
                        }
                } 
                return $infoboxData;
        }
 
        /**
         * Saves the InfoboxData on record save.  Called using the ArticleSaveComplete hook.
         *
         * @param &$article The article object already saved.
         * @param                       Many others... all required by the hook. Not used.
         * @private
         */
        function save(&$article, &$user, &$text, &$summary, &$minoredit, &$watchthis, &$sectionanchor, &$flags) { 
                # Update the infobox data table
                $ibd = new InfoboxDataUpdate( $article->getTitle(), $this->mInfoboxData );
                $ibd->doUpdate();
                return true;
        }
 
        /**
         * Deletes the InfoboxData on record delete.  Called using the ArticleDeleteComplete hook.
         *
         * @param &$article The article object deleted.
         * @param                       Others... all required by the hook. Not used.   
         * @private
         */
        function delete(&$article, &$user, $reason) {
                $dbw =& wfGetDB( DB_MASTER );
                $dbw->delete( 'infoboxdata', array( 'ib_from' => $article->getID() ) );
                return true;
        }
}
 
/**
 * InfoboxDatum
 * 
 * @package MediaWiki
 */
class InfoboxDatum {
 
        var    $mName,
                        $mValue,
                        $mIsValid,
                        $mComment;
 
        /**
         * Constructor
         *
         * @param name          
         * @param value
         * @param isValid       Optional, default is 1
         * @param comment       Optional
         * @private
         */
        function InfoboxDatum( $name, $value, $isValid = 1, $comment = "") {
                $this->mName = trim($name);
                $this->mValue = trim($value);
                $this->mIsValid = intval($isValid);
                $this->mComment = trim($comment);
 
                if( !is_numeric($isValid) ||($this->mIsValid != $isValid) || ($this->mIsValid < 0)) {
                        $this->mComment = $isValid;
                        $this->mIsValid = 1;
                }
        }
 
        function getName()                   { return $this->mName; }
        function getValue()                  { return $this->mValue; }
        function getValidFlag()              { return $this->mIsValid; }
        function getComment()                { return $this->mComment; }
}
 
/**
 * Modeled after LinksUpdate from 1.9.3 
 * Can't use ParserOutput object because called from ArticleSaveComplete hook
 * should probably refactor to be generic or merge with InfoboxDataCapture class
 * @package MediaWiki
 */
class InfoboxDataUpdate {
 
        /**@{{
         * @private
         */
        var $mId,            //!< Page ID of the article linked from
                $mTitle,         //!< Title object of the article linked from
                $mDb,            //!< Database connection reference
                $mOptions,       //!< SELECT options to be used (array)
                $mInfoboxData;   //!< infobox data to be uploaded into database
        /**@}}*/
 
        /**
         * Constructor
         * Initialize private variables
         * @param title                 Title object
         * @param infoboxData   3-D array holding key value pairs for multiple data blocks
         */
        function InfoboxDataUpdate( $title, $infoboxData) {
                global $wgAntiLockFlags;
 
                if ( $wgAntiLockFlags & ALF_NO_LINK_LOCK ) {
                        $this->mOptions = array();
                } else {
                        $this->mOptions = array( 'FOR UPDATE' );
                }
                $this->mDb =& wfGetDB( DB_MASTER );
 
                $this->mTitle = $title;
                $this->mId = $title->getArticleID();
 
                $this->mInfoboxData = $infoboxData;            
        }
 
        /**
         * Update the infobox data table with the data captured from a saved article
         */
        function doUpdate() {
                $this->doDumbUpdate();
        }
 
        /**
         * Update which clears the page's previous entries and inserts the new ones
         * May be slower or faster depending on level of lock contention and write speed of DB
         */
        function doDumbUpdate() {
                $fname = 'InfoboxData::doDumbUpdate';
                wfProfileIn( $fname );
 
                $this->dumbTableUpdate( 'infoboxdata',  $this->getInfoboxDataInsertions(), 'ib_from' );
                wfProfileOut( $fname );
        }
 
        function dumbTableUpdate( $table, $insertions, $fromField ) {
                $fname = 'InfoboxData::dumbTableUpdate';
                $this->mDb->delete( $table, array( $fromField => $this->mId ), $fname );
                if ( count( $insertions ) ) {
                        $this->mDb->insert( $table, $insertions, $fname, array( 'IGNORE' ) );
                }
        }
 
        /**
         * Get an array of data from dataentry tags insertions. Like getLinkInsertions()
         * @private
         */
        function getInfoboxDataInsertions( $existing = array() ) {
                wfProfileIn( __METHOD__ );
                $arr = array();
                foreach( $this->mInfoboxData as $blockOrder => $datablock ) {
                        foreach ( $datablock as $name => $attributes ) {
                                foreach ( $attributes as $attributeOrder => $attribute ) {
                                        $arr[] = array(
                                                'ib_from'            => $this->mId,
                                                'ib_datablock_order' => $blockOrder,
                                                'ib_datablock_name'  => $name,
                                                'ib_attribute_order' => $attributeOrder,
                                                'ib_attribute'       => $attribute->getName(),
                                                'ib_isvalid'         => $attribute->getValidFlag(),
                                                'ib_value'           => $attribute->getValue(),
                                                'ib_comment'         => $attribute->getComment()
                                        );
                                }
                        }
                }
                wfProfileOut( __METHOD__ );
                return $arr;
        }
}
?>


Fix hook error

On PHP 5.3.0 and later, saving a page produces the following warning:

PHP Warning: Hook InfoboxDataCapture::save has invalid call signature; Parameter 3 to InfoboxDataCapture::save() expected to be a reference, value is given in ...

To fix this, change in InfoboxData.php:

 function save(&$article, &$user, &$text, &$summary, &$minoredit, &$watchthis, &$sectionanchor, &$flags) {

to

 function save(&$article, &$user, $text, $summary, $minoredit, $watchthis, $sectionanchor, &$flags) {
