Release status: unmaintained
|Implementation||Parser extension, Link markup|
|Description||After an article is saved (onArticleSaveComplete), parses the page for the titles of other pages in the wiki and turns that raw text into internal links.|
|License||GNU General Public License 3.0|
When a major edit is saved, the PageCrossReference extension searches the article for words that match page_titles within the article's namespace. When a match is found, the words are turned into an internal link and the search moves on to the next page_title. Spaces (' ') and underscores ('_') between words are treated the same.
Text in the following conditions is ignored:
- The article's own page_title.
- page_titles in other page_namespaces.
- Link tags
- Http tags
- Text between "nowiki" tags
- Text between "pre" tags
- Text between angle brackets ("<" and ">")
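The ignore rules above can be pictured as a split of the wikitext into searchable and protected segments. The following is a minimal sketch under that assumption, not the extension's actual code: the function name pcrSplitProtected and the regular expressions are hypothetical, and the own-title and namespace rules are not covered here because they apply to the title query rather than to the text.

```php
<?php
// Hypothetical helper, not taken from the extension's source.
// Splits wikitext into segments and flags those that must not be searched.
function pcrSplitProtected( string $wikitext ): array {
    $pattern = '~('
        . '<nowiki>.*?</nowiki>'   // text between nowiki tags
        . '|<pre>.*?</pre>'        // text between pre tags
        . '|\[\[.*?\]\]'           // existing internal links
        . '|\[https?://[^\]]*\]'   // bracketed external links
        . '|https?://\S+'          // bare URLs
        . '|<[^>]*>'               // anything else between angle brackets
        . ')~is';

    // With one capture group and PREG_SPLIT_DELIM_CAPTURE the result
    // alternates: plain text, protected match, plain text, ...
    $parts = preg_split( $pattern, $wikitext, -1, PREG_SPLIT_DELIM_CAPTURE );

    $segments = [];
    foreach ( $parts as $index => $part ) {
        $segments[] = [
            'text' => $part,
            'protected' => ( $index % 2 === 1 ),
        ];
    }
    return $segments;
}
```

Only the segments whose 'protected' flag is false would then be searched for page_titles.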
- Download, extract and place the file(s) in a directory called
- Add the following code at the bottom of your LocalSettings.php:
require_once "$IP/extensions/PageCrossReference/PageCrossReference.php";
// This example adds the page "Black Page" to the BlackList.
$wgPageCrossReferenceBlackList = array( 'Black_Page' );
- Configure as required.
- Done – Navigate to Special:Version on your wiki to verify that the extension is successfully installed.
|Parameter||Default||Description|
|$wgPageCrossReferenceMinimumWordLength||2||Only choose titles with this many words or more|
|$wgPageCrossReferenceSkipHeaders||false||Headers: true = ignore, false = search|
|$wgPageCrossReferenceBlackList||null||Never choose these titles if found; write underscores in place of spaces|
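As an illustration, the three settings might be combined in LocalSettings.php as shown below. This is a sketch only: the values are examples, and 'Sand_Box' is a made-up title added to show a second blacklist entry.

```php
// Place after the require_once line for PageCrossReference in LocalSettings.php.

// Keep the documented default for the minimum word setting.
$wgPageCrossReferenceMinimumWordLength = 2;

// true = ignore text in headers, false (the default) = search it.
$wgPageCrossReferenceSkipHeaders = true;

// Titles that are never linked; underscores stand in for spaces.
// 'Sand_Box' is a hypothetical page used only for this example.
$wgPageCrossReferenceBlackList = array( 'Black_Page', 'Sand_Box' );
```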
PageCrossReference runs on onArticleSaveComplete. It proceeds to parseArticle only if $revision is true and $minoredit is false.
parseArticle first checks $wgPageCrossReferenceLoop. Because doEdit itself invokes onArticleSaveComplete, PageCrossReference would otherwise run twice; the check ensures it runs once per article. The article wiki-text is then parsed into an array. The page_title foreach loop runs once per page_title, invoking parseContent and passing the text_article_array by reference.
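The run-once check can be sketched as a simple re-entrancy guard. Everything in the block below is a stand-in written for illustration: the class PcrGuardSketch, its static properties, and the simulated doEdit call are assumptions, and only the role of $wgPageCrossReferenceLoop comes from the description above.

```php
<?php
// Illustrative stand-in, not the extension's code.
final class PcrGuardSketch {
    // Plays the role of $wgPageCrossReferenceLoop.
    public static bool $loop = false;
    // Counts how often the article was actually parsed.
    public static int $parseRuns = 0;

    // Stand-in for the onArticleSaveComplete handler.
    public static function onSaveComplete( bool $hasRevision, bool $minorEdit ): void {
        if ( !$hasRevision || $minorEdit ) {
            return; // no revision or a minor edit: nothing to do
        }
        self::parseArticle();
    }

    private static function parseArticle(): void {
        if ( self::$loop ) {
            return; // second invocation, triggered by our own save
        }
        self::$loop = true;
        self::$parseRuns++;
        // Saving the rewritten text fires the hook again;
        // this call stands in for doEdit.
        self::onSaveComplete( true, false );
    }
}
```

In this sketch the flag stays set for the rest of the request, so a save that re-enters the hook returns immediately instead of parsing the article a second time.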
parseContent checks each array element for an existing internal link to the page_title or for raw text that could become one. Once an internal link is found or made, the for loop ends.
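The per-title pass can be sketched as follows. This is a hedged illustration rather than the extension's parseContent: the function name pcrLinkFirstHit, the segment layout (plain strings, with existing links as separate elements starting with "[["), and the case-sensitive matching are all assumptions made for the example.

```php
<?php
// Hypothetical sketch of a single page_title pass.
// $segments is modified in place; $title uses underscores, as page_title does.
// Returns true as soon as a link to $title is found or made.
function pcrLinkFirstHit( array &$segments, string $title ): bool {
    // Let every separator in the title match a space or an underscore.
    $words = array_map(
        static function ( string $word ): string {
            return preg_quote( $word, '~' );
        },
        preg_split( '~[ _]+~', $title, -1, PREG_SPLIT_NO_EMPTY )
    );
    $name = implode( '[ _]', $words );

    $linked = '~^\[\[' . $name . '(?:\|[^\]]*)?\]\]$~';
    $raw = '~(?<![\w\[])' . $name . '(?![\w\]])~';

    for ( $i = 0, $n = count( $segments ); $i < $n; $i++ ) {
        $segment = $segments[$i];
        if ( strncmp( $segment, '[[', 2 ) === 0 ) {
            if ( preg_match( $linked, $segment ) === 1 ) {
                return true; // an internal link already exists
            }
            continue; // a link to some other page: leave it alone
        }
        $count = 0;
        $result = preg_replace( $raw, '[[$0]]', $segment, 1, $count );
        if ( $count > 0 ) {
            $segments[$i] = $result; // link only the first raw occurrence
            return true;
        }
    }
    return false;
}
```

Because the loop stops at the first hit, a raw title that appears before an existing link to the same page is still converted, which matches the behaviour described under bug 001.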
PageCrossReference cycles through EVERY page_title that the page table holds for the article's page_namespace! With ten thousand page_titles, parseContent and its for loop run 10,000 times. This makes PageCrossReference a memory and CPU hog, and the cost will vary per site. In one test, a 1.5 MB article checked against 32 page_titles took an extra second to save.
- 001 Raw page_title followed by internal link. For example, suppose "Main Page" is on line seven and "[[Main Page]]" is on line 23. PageCrossReference will convert "Main Page" on line seven to "[[Main Page]]" and end the search for "Main Page" while leaving two internal links. This violates the OneHit principle but a fix, searching through every article twice, would double the work for a rare circumstance. Bug untouched.
- 002 Subpage. The "/" in a subpage page_title fouls the parser and causes confusion with ratios in multi-worded page_titles. The solution is removing subpage searches from PageCrossReference and words around "/". Bug fixed.
- 003 Article page_title intersection across namespaces. The Select query filters out the article page_title, therefore articles with the same page_title in other namespaces are ignored. Left for the author to edit. Bug untouched.
- 004 File Upload. File Upload failed in version 0.6. Added Namespace Validator. Bug fixed.
- 005 page_title page_title. With page_titles in sequence the space between was lost. Bug fixed.