Extension talk:TitleKey

Include in Release

Summary last edited by Kghbln 20:52, 11 November 2016

Filed as task 38203, "Suggestion to merge TitleKey to the MediaWiki core".

Bachsau (talkcontribs)

This extension should really be one of those included in the release tarballs, or even better, be merged with the core. --Bachsau (talk) 14:15, 14 April 2012 (UTC)

Mlpearc (talkcontribs)

Agreed, just installed on my project and is working great. Kinda seems like a "no-brainer" to include this in the core.

Felipe Schenone (talkcontribs)

Agree, it would make the default search engine a bit more decent.

Reply to "Include in Release"

Accent and special characters independent search

Rbirmann (talkcontribs)

Would it be possible to use this extension to make MW search independent of accents, umlauts and diacritical marks on page titles?

I have a Portuguese installation of MW, and users cannot find what they are looking for unless the search term includes all the accents.

What I NEED:

  • Accented page title results for unaccented queries: user should be able to search for unaccented terms and get accented results of page titles, so an article named "Estratégia de atuação" would show up on search results for search queries "estrategia" and "atuacao", for instance.

What I would LIKE (but this is not crucial):

  • Accent-independent search for article content (as well as title)

What I DO NOT NEED at this time:

  • Unaccented results for accented queries

Krinkle (talkcontribs)

  • The normalisation of accents in title search could indeed be handled by the TitleKey extension. I'd recommend creating an issue on bugzilla.wikimedia.org under "MediaWiki extensions > TitleKey".
  • For content search, you'll need to look into the capabilities of the "search backend". This is beyond the scope of the TitleKey extension. The default MySQL search backend will likely not be able to support this. Look into Extension:MWSearch, for example, and Lucene search.
  • TitleKey normalises both the query and the index, so it will naturally work in both "directions" from a user's point of view.
Rbirmann (talkcontribs)

Apparently bug 20097 is exactly this, but since it's been there for a while, I am not sure anyone is looking into it.

As a "quick and dirty" fix, I used iconv to work around this problem.

I have patched extensions/TitleKey/TitleKey_body.php by changing the 'normalize' function to:


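static function normalize( $text ) {
	global $wgContLang;
	// iconv's //TRANSLIT output depends on the active locale, so set one first.
	setlocale( LC_ALL, 'pt_BR' );
	// Transliterate accented characters to their closest ASCII equivalents.
	$newtext = iconv( 'UTF-8', 'ASCII//TRANSLIT', $text );
	// Case-fold the result, as the original function does.
	return $wgContLang->caseFold( $newtext );
}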

With the new file in place I ran TitleKey/rebuildTitleKeys.php and things seem to be working...

Will post updates here if I notice any undesirable side-effects...

Cheers,

Rbirmann (talk) 00:58, 10 October 2013 (UTC)

UPDATE:

This is not a complete fix. Following my previous example, if the article title is "Estratégia de atuação", after this fix searching for "estrategia de atuacao" finds the article, but searching for "estrategia" or "atuacao" does not. It is something, but still far from a fix.

The quest continues...

Wikinaut (talkcontribs)

You wrote:

This is not a complete fix. Following my previous example, if the article title is "Estratégia de atuação", after this fix searching for "estrategia de atuacao" finds the article, but searching for "estrategia" or "atuacao" does not. It is something, but still far from a fix.
The quest continues...

In my view you must apply the normalize function twice:

  1. when generating the table column tk_key in rebuildTitleKeys.php - i.e. when running php rebuildTitleKeys.php
  2. and when actually performing the search (what you do)

so that "translit" input values (substrings while typing) are searched against "translit" database column tk_key entries.
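As a rough illustration of the two steps above, here is a minimal, self-contained sketch. It does not use TitleKey's actual code: `foldKey()` and `suggest()` are hypothetical helpers, the character map is a hand-written stand-in for iconv's transliteration that only covers common Portuguese letters, and prefix matching against the stored key is an assumption about how the lookup behaves.

```php
<?php
// Hypothetical helper: fold a title or a query into an accent-free,
// upper-case key. The map is illustrative, not exhaustive.
function foldKey( $text ) {
	$map = array(
		'á' => 'a', 'à' => 'a', 'â' => 'a', 'ã' => 'a',
		'é' => 'e', 'ê' => 'e', 'í' => 'i',
		'ó' => 'o', 'ô' => 'o', 'õ' => 'o',
		'ú' => 'u', 'ü' => 'u', 'ç' => 'c',
		'Á' => 'A', 'À' => 'A', 'Â' => 'A', 'Ã' => 'A',
		'É' => 'E', 'Ê' => 'E', 'Í' => 'I',
		'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O',
		'Ú' => 'U', 'Ü' => 'U', 'Ç' => 'C',
	);
	return strtoupper( strtr( $text, $map ) );
}

// Step 1 (index side): store the folded key for every title,
// as a rebuild of the key table would.
$titles = array( 'Estratégia de atuação', 'Estrada', 'Atuação em campo' );
$index = array();
foreach ( $titles as $title ) {
	$index[ foldKey( $title ) ] = $title;
}

// Step 2 (query side): fold the typed text the same way, then
// return every title whose key starts with the folded query.
function suggest( array $index, $query ) {
	$prefix = foldKey( $query );
	$hits = array();
	foreach ( $index as $key => $title ) {
		if ( strncmp( $key, $prefix, strlen( $prefix ) ) === 0 ) {
			$hits[] = $title;
		}
	}
	return $hits;
}

print_r( suggest( $index, 'estrat' ) );   // matches "Estratégia de atuação"
print_r( suggest( $index, 'ATUAÇÃO' ) );  // matches "Atuação em campo"
```

Under the prefix-matching assumption, a query such as "atuacao" still does not find "Estratégia de atuação", because the folded key does not start with that word.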

Let me know if that works. If it does, perhaps you can send me your code; I am interested in it.

Felipe Schenone (talkcontribs)

I just submitted a working patch based on this talk https://gerrit.wikimedia.org/r/#/c/286580/

Reply to "Accent and special characters independent search"

Method to setup without shell access

Jamesmontalvo3 (talkcontribs)

On my production server I do not yet have shell access, so I had to work around this. I wrote the following script to be used over HTTP to add the required MySQL table and populate it with data. This script worked in my environment and is not guaranteed to work on yours. It is certainly not the most robust code possible. MediaWiki/Titlekey developer gurus: please feel free to comment on any erroneous content. Hope this helps!

<?php

/**
 *  
 *  Before you use this script please note that this was created by a person
 *      inexperienced with the inner workings of MediaWiki. It was created to
 *      get around the problem of not having command line access on a 
 *      particular server with a specific MediaWiki load with specific 
 *      extensions. THIS MAY NOT WORK ON YOUR COMPUTER. Anyone who knows more
 *      about MediaWiki please feel free to edit mercilessly.
 *  
 *  I've tried to write this script without any dependencies so it should run
 *      on just about any PHP load. However, this is abysmally slower than using
 *      the standard rebuildTitleKeys.php script from the command line. If your
 *      wiki is large this script may not work for you (without vastly 
 *      increasing max_execution_time in php.ini).
 *  
 *  Note on implementation: titlekey.sql says tk_key is "with namespace prefix",
 *      which suggests to me that titles should be recorded with the prefix,
 *      like "Template:Citation needed" instead of just "Citation needed".
 *      However, my inspection of the Titlekey extension indicates that the
 *      prefix is not included.
 *  
 *  Input your database host name, username, password and database name below.
 *  
 **/

$hostname = "YOUR HOST NAME";
$username = "YOUR USERNAME";
$password = "YOUR PASSWORD";
$db_name  = "YOUR DATABASE NAME";

/** 
 *  No changes should be required below this point...unless I made a mistake.
 *  
 *  USE AT YOUR OWN RISK! BACKUP YOUR WIKI FIRST!
 **/

echo "Connecting to database...";

// Connect to database 
$db = mysql_connect($hostname, $username, $password)
    or die ('Unable to connect. Check your connection parameters.');

// Select particular database
mysql_select_db($db_name, $db)
    or die(mysql_error($db));

echo "connection successful.<br />";
	
// Construct query to retrieve all relevant page information from standard
// MediaWiki table 'page'
$query = "SELECT page_id, page_namespace, page_title FROM page";

echo "Retrieving page information...";

// Perform query to retrieve page information
$result = mysql_query($query, $db) or die (mysql_error($db));

echo "page information retrieval successful.<br />";

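// Collect one SQL value tuple per page; start from an empty array so that
// $inserts is always defined.
$inserts = array();

// Loop through pages (for each result create associative array $page)
while( $page = mysql_fetch_assoc($result) ) {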

    /**
     *  FORMATTING PAGE NAMES SO THEY CAN BE SEARCHED WITH CASE-INSENSITIVITY
     *
     *  Page names from the standard MediaWiki table 'page' must be reformatted
     *      to be searched quickly and easily with case-insensitivity. Three
     *      actions must be performed on the page title:
     *
     *      1) Replace all underscores with spaces using PHP's str_replace
     *      2) Convert all characters to upper case using PHP's strtoupper
     *      3) Always escape characters before inserting into a database
     **/ 
    $titlekey_page_name = mysql_real_escape_string(
        strtoupper(
            str_replace("_", " ", $page['page_title'])
        )
    );

    // Create a string of SQL values to be inserted into the new titlekey
    //      table. String is pushed to $inserts array to be used later.
    $inserts[] = '(' 
        . $page['page_id'] . ', ' 
        . $page['page_namespace'] . ', ' 
        . '"' . $titlekey_page_name . '"'
        . ')';
        
}

// Take $inserts array (array of strings) and glue them together with ", "
//      in between each string.
$inserts = implode(', ', $inserts);

/**
 *  Build the full insert query. The new table 'titlekey' has columns:
 *      1) titlekey.tk_page corresponds to page.page_id
 *      2) titlekey.tk_namespace corresponds to page.page_namespace
 *      3) titlekey.tk_key corresponds to page.page_title
 *          (in all caps, no underscores)
 **/
$insert_query = 
    'INSERT INTO titlekey
        (tk_page, tk_namespace, tk_key)
    VALUES ' . $inserts . ';';
	
// if titlekey table does not yet exist, create it
$create_table_query = 
    'CREATE TABLE IF NOT EXISTS titlekey (
        -- Ref to page_id
        tk_page int unsigned NOT NULL,

        -- Keep a denormalized copy of the namespace for filtering
        tk_namespace int NOT NULL,

        -- Normalized title.
        -- With namespace prefix, case-folded, in space form.
        tk_key varchar(255) binary NOT NULL,
	  
        PRIMARY KEY tk_page (tk_page),
        INDEX name_key (tk_namespace, tk_key)

    );';

echo "Creating table 'titlekey' (if required)...";

// Execute query to add titlekey table
mysql_query($create_table_query, $db) or die (mysql_error($db));

// In case titlekey already existed and had data in it, remove all data prior
//     to repopulating.
mysql_query("TRUNCATE TABLE titlekey;", $db) or die (mysql_error($db));

echo "table created.<br />Inserting pages into 'titlekey' table.";

// Execute query to insert values into new titlekey table
mysql_query($insert_query, $db) or die (mysql_error($db));

// Close database connection
mysql_close();

echo "<br /><br />Update operations complete.";
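For larger wikis, one possible variation on the script above is to split the value tuples into several smaller INSERT statements instead of a single very large one. The following is only a sketch: `buildInsertBatches()` is a hypothetical helper, the batch size is an arbitrary choice, and the tuples are assumed to be already escaped, as they are in the `$inserts` array before it is imploded.

```php
<?php
// Hypothetical helper: turn already-escaped value tuples such as
// '(1, 0, "MAIN PAGE")' into several INSERT statements of limited size.
function buildInsertBatches( array $tuples, $batchSize = 500 ) {
	$queries = array();
	// array_chunk() returns an empty array for empty input, so no
	// malformed "VALUES ;" statement is ever produced.
	foreach ( array_chunk( $tuples, $batchSize ) as $chunk ) {
		$queries[] = 'INSERT INTO titlekey (tk_page, tk_namespace, tk_key) VALUES '
			. implode( ', ', $chunk ) . ';';
	}
	return $queries;
}

// Example with a batch size of 2: three tuples become two statements.
$tuples = array(
	'(1, 0, "MAIN PAGE")',
	'(2, 10, "CITATION NEEDED")',
	'(3, 0, "SANDBOX")',
);
foreach ( buildInsertBatches( $tuples, 2 ) as $sql ) {
	echo $sql, "\n";
}
```

Each returned statement would then be executed in turn, in place of the single insert query.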
Nakohdo (talkcontribs)

You might also have a look at http://www.mediawiki.org/wiki/Extension_talk:MaintenanceShell#Workaround_for_Extension_TitleKey for using TitleKey with the MaintenanceShell extension.

hth Frank

128.157.160.13 (talkcontribs)

Yeah I looked into MaintenanceShell, but in general I don't use extensions marked "experimental" on my production server.

This post was posted by 128.157.160.13, but signed as Jamesmontalvo3.

Nakohdo (talkcontribs)

But if the alternative is using a PHP script for direct database access with database credentials in plain text, this might be an option to consider ;-)

Jamesmontalvo3 (talkcontribs)

This is intended as a run-once-and-delete script, as opposed to having MaintenanceShell, an experimental extension which exposes shell-like privileges, running full time on your production server. It's a trade-off: either you trust yourself to scan through my intentionally simplified code, or you trust the creators of MaintenanceShell not to leave security holes and possible data-corrupting bugs.

Nakohdo (talkcontribs)

I tried another extension which works with MediaWiki 1.20.2, Extension:Maintenance (beta).

As the rebuildTitleKeys.php file doesn't reside in the default /maintenance folder you have to add the following to your LocalSettings.php:

$wgMaintenanceScripts = array(
  'rebuildTitleKeys' => "$IP/extensions/TitleKey/rebuildTitleKeys.php",
);

http://www.mediawiki.org/wiki/Manual:$wgMaintenanceScripts

Then you can add the following section to the Maintenance extension's metadata.ini file:

[rebuildTitleKeys]
enabled = 1

http://www.mediawiki.org/wiki/Extension:Maintenance#Extending_the_list_of_scripts

Then you should be able to run the Title Key update from the Maintenance extension page.

Reply to "Method to setup without shell access"

Adding data to the search

210.89.56.38 (talkcontribs)

How can I add data to the search text? The data is stored in the database. Please help me.

210.89.56.38 (talkcontribs)

I want to add data to the search text. The data is in the database. It is data like job position and location.

Reply to "Adding data to the search"

SQLite Configuration

173.227.11.163 (talkcontribs)

Edit the TitleKey.sql script to the following:

CREATE TABLE titlekey (
    tk_page int NOT NULL PRIMARY KEY,
    tk_namespace int NOT NULL,
    tk_key varchar(255) NOT NULL
);
CREATE INDEX name_key ON titlekey (tk_namespace, tk_key);


Otherwise the <wiki>/maintenance/update.php script will fail with a syntax error at tk_page (the tk_page right after PRIMARY KEY in the original script).

Reply to "SQLite Configuration"

Hack to make it work when you put it in a directory other than $IP/extensions

Leucosticte (talkcontribs)

If you put TitleKey in a directory other than $IP/extensions, it won't be able to create the tables. In fact, it'll crash. So, change the beginning lines of rebuildTitleKeys.php to:

<?php
if ( !defined( 'MEDIAWIKI' ) ) {
        die( 'This file is a MediaWiki extension, it is not a valid entry point' );
}

$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false )
        $IP = dirname( __FILE__ ) . '/../..';

if ( !isset ( $wgTitleKeyMaintenancePath ) ) {
        require_once( "$IP/maintenance/Maintenance.php" );
} else {
        require_once( "$wgTitleKeyMaintenancePath/maintenance/Maintenance.php" );
}

Then, in LocalSettings.php, set that new global variable to your wiki path; e.g. on mine, I use:

$wgTitleKeyMaintenancePath = "/home/stauffenbergssh/en.rationalwikiwikiwiki.org/w/";

I've been told that isset isn't all that safe to use, so if anyone has a better way, please let me know. Thanks.

Krinkle (talkcontribs)

Did you actually try this? Because if it isn't able to locate $IP, then where do LocalSettings.php (and $wgTitleKeyMaintenancePath) come from?

Leucosticte (talkcontribs)

Well, on a new MediaWiki installation, first I had to comment out the require_once( "$IP/extensions/TitleKey/TitleKey.php" ); and run update.php, and then I had to uncomment it and run update.php again. I guess you're saying, perhaps the hack did nothing, eh? Hard to say; I'd have to run some experiments to isolate what is making it work or not work. I keep my extensions in /home/stauffenbergssh/extensions/ , and it seemed like that was what was messing it up, because it was using the /../.. and giving me error messages with that path in it. We can userfy this thread if it's desirable to not put that suggestion out there. Leucosticte (talk) 06:43, 26 September 2012 (UTC)

Reply to "Hack to make it work when you put it in a directory other than $IP/extensions"

One step further: search suggestions for matching '''any word''' in the title

NocNokNeo (talkcontribs)

Would it be possible to take this extension one step further and match not just the beginning of the title but the beginning of any word in the title? Or at least make this optional via a configuration parameter. Thanks! NocNokNeo (talk)
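To make the request concrete, here is a small sketch of what "match the beginning of any word" could mean. It is not part of TitleKey: `matchesAnyWord()` is a hypothetical helper, and it only case-folds ASCII letters.

```php
<?php
// Hypothetical helper: true if $query matches the start of any word in
// $title. A leading space is added to both strings so that a hit can only
// begin at a word boundary.
function matchesAnyWord( $title, $query ) {
	$key = ' ' . strtoupper( str_replace( '_', ' ', $title ) );
	$needle = ' ' . strtoupper( trim( $query ) );
	// An empty query matches nothing.
	return $needle !== ' ' && strpos( $key, $needle ) !== false;
}

var_dump( matchesAnyWord( 'Search_suggestions_for_titles', 'sugg' ) );     // bool(true)
var_dump( matchesAnyWord( 'Search_suggestions_for_titles', 'gestions' ) ); // bool(false)
var_dump( matchesAnyWord( 'Search_suggestions_for_titles', 'for tit' ) );  // bool(true)
```

A configuration parameter, as suggested, could then choose between this behaviour and plain prefix matching.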

Reply to "One step further: search suggestions for matching '''any word''' in the title"