r37973 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r37973 (previous r37972, next r37974)
Date: 19:49, 23 July 2008
Author: simetrical
Status: old
Tags:
Comment:
(bug 8068) New __INDEX__ and __NOINDEX__ magic words allow control of search engine indexing on a per-article basis. Remarks:
* Currently __INDEX__ will override __NOINDEX__ regardless of their relative positions, due to the way things are written. Instead, the last one on the page should win. This should be pretty easy to fix.
* __INDEX__ and __NOINDEX__ override $wgArticleRobotPolicies. This is almost certainly incorrect, but it's not totally obvious how to fix it, because of the way the code is structured. Probably not a big deal, but should probably be fixed at some point.
* Anyone can add and remove the magic words, and there's no config option to disable them. It's not obvious whether this is okay or not. It would be a one-line change to OutputPage.php to have a config option to ignore the magic words, maybe per-namespace or who knows what.
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified)
  • /trunk/phase3/includes/MagicWord.php (modified)
  • /trunk/phase3/includes/OutputPage.php (modified)
  • /trunk/phase3/includes/parser/Parser.php (modified)
  • /trunk/phase3/includes/parser/ParserOutput.php (modified)
  • /trunk/phase3/languages/messages/MessagesEn.php (modified)

Diff

Index: trunk/phase3/languages/messages/MessagesEn.php
@@ -340,6 +340,8 @@
     'hiddencat' => array( 1, '__HIDDENCAT__' ),
     'pagesincategory' => array( 1, 'PAGESINCATEGORY', 'PAGESINCAT' ),
     'pagesize' => array( 1, 'PAGESIZE' ),
+    'index' => array( 1, '__INDEX__' ),
+    'noindex' => array( 1, '__NOINDEX__' ),
 );

 /**
Index: trunk/phase3/includes/parser/Parser.php
@@ -3380,6 +3380,15 @@
             wfDebug( __METHOD__.": [[MediaWiki:hidden-category-category]] is not a valid title!\n" );
         }
     }
+    # (bug 8068) Allow control over whether robots index a page. FIXME:
+    # __INDEX__ always overrides __NOINDEX__ here! This is not desirable,
+    # the last one on the page should win.
+    if( isset( $this->mDoubleUnderscores['noindex'] ) ) {
+        $this->mOutput->setIndexPolicy( 'noindex' );
+    } elseif( isset( $this->mDoubleUnderscores['index'] ) ) {
+        $this->mOutput->setIndexPolicy( 'index' );
+    }
+
     return $text;
 }

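The FIXME in this hunk asks for the last magic word on the page to win. As the hunk is written, the `noindex` branch is tested first, so the relative position of the two words is ignored either way. Below is a minimal sketch of a position-based rule. `lastIndexPolicy()` is a hypothetical standalone helper that works on raw wikitext; it is not a MediaWiki function, it does not use `mDoubleUnderscores`, and it ignores complications such as magic words inside `<nowiki>` or in transcluded templates.

```php
<?php
/**
 * Hypothetical helper (not part of MediaWiki): pick the index policy so
 * that the last magic word in the text wins.
 *
 * Returns 'index', 'noindex', or '' (meaning "no change"), mirroring the
 * values documented for ParserOutput::$mIndexPolicy in this revision.
 */
function lastIndexPolicy( $text ) {
	// strrpos() gives the offset of the LAST occurrence, or false if absent.
	// Both words are case-sensitive in MessagesEn.php, as is strrpos().
	$indexPos = strrpos( $text, '__INDEX__' );
	$noindexPos = strrpos( $text, '__NOINDEX__' );

	if ( $indexPos === false && $noindexPos === false ) {
		return ''; // neither word present: leave the policy alone
	}
	if ( $indexPos === false ) {
		return 'noindex';
	}
	if ( $noindexPos === false ) {
		return 'index';
	}
	// Both present: whichever starts later in the text wins.
	return $indexPos > $noindexPos ? 'index' : 'noindex';
}

var_dump( lastIndexPolicy( "__NOINDEX__ some text __INDEX__" ) ); // string(5) "index"
var_dump( lastIndexPolicy( "__INDEX__ some text __NOINDEX__" ) ); // string(7) "noindex"
var_dump( lastIndexPolicy( "no magic words here" ) );             // string(0) ""
```

In the parser itself the same idea would presumably need the match offsets recorded while the double-underscore words are being stripped, since `isset()` on `mDoubleUnderscores` only says whether a word occurred, not where.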
Index: trunk/phase3/includes/parser/ParserOutput.php
@@ -24,6 +24,7 @@
         $mWarnings, # Warning text to be returned to the user. Wikitext formatted, in the key only
         $mSections, # Table of contents
         $mProperties; # Name/value pairs to be cached in the DB
+    private $mIndexPolicy = ''; # 'index' or 'noindex'? Any other value will result in no change.

     /**
      * Overridden title for display
@@ -69,6 +70,7 @@
     function getSubtitle() { return $this->mSubtitle; }
     function getOutputHooks() { return (array)$this->mOutputHooks; }
     function getWarnings() { return array_keys( $this->mWarnings ); }
+    function getIndexPolicy() { return $this->mIndexPolicy; }

     function containsOldMagic() { return $this->mContainsOldMagic; }
     function setText( $text ) { return wfSetVar( $this->mText, $text ); }
@@ -78,6 +80,7 @@
     function setCacheTime( $t ) { return wfSetVar( $this->mCacheTime, $t ); }
     function setTitleText( $t ) { return wfSetVar( $this->mTitleText, $t ); }
     function setSections( $toc ) { return wfSetVar( $this->mSections, $toc ); }
+    function setIndexPolicy( $policy ) { return wfSetVar( $this->mIndexPolicy, $policy ); }

     function addCategory( $c, $sort ) { $this->mCategories[$c] = $sort; }
     function addLanguageLink( $t ) { $this->mLanguageLinks[] = $t; }
Index: trunk/phase3/includes/MagicWord.php
@@ -105,6 +105,8 @@
         'numberofadmins',
         'defaultsort',
         'pagesincategory',
+        'index',
+        'noindex',
     );

     /* Array of caching hints for ParserCache */
@@ -153,6 +155,8 @@
         'noeditsection',
         'newsectionlink',
         'hiddencat',
+        'index',
+        'noindex',
     );


Index: trunk/phase3/includes/OutputPage.php
@@ -475,6 +475,8 @@
         $this->mLanguageLinks += $parserOutput->getLanguageLinks();
         $this->addCategoryLinks( $parserOutput->getCategories() );
         $this->mNewSectionLink = $parserOutput->getNewSection();
+        # FIXME: This probably overrides $wgArticleRobotPolicies, is that wise?
+        $this->setIndexPolicy( $parserOutput->getIndexPolicy() );
         $this->addKeywords( $parserOutput );
         $this->mParseWarnings = $parserOutput->getWarnings();
         if ( $parserOutput->getCacheTime() == -1 ) {
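The FIXME in this hunk, together with the commit comment's remarks about `$wgArticleRobotPolicies` and the missing config option, comes down to precedence. The following is a minimal sketch of one possible ordering, not what this revision does. It makes several assumptions: `resolveIndexPolicy()` is a hypothetical function, `$allowIndexMagicWords` stands in for a config switch that does not exist in this revision, and the per-article configured value is assumed to be already reduced to `'index'`, `'noindex'`, or `null` when nothing is configured.

```php
<?php
/**
 * Hypothetical sketch (not MediaWiki code): combine a per-article
 * configured policy with the policy requested by magic words on the page.
 *
 * @param string      $pagePolicy           Value from the parser output:
 *                                          'index', 'noindex', or anything
 *                                          else for "no change".
 * @param string|null $configuredPolicy     Assumed per-article setting, or
 *                                          null if the article has none.
 * @param bool        $allowIndexMagicWords Imagined config switch.
 * @param string      $default              Fallback when nothing applies.
 */
function resolveIndexPolicy( $pagePolicy, $configuredPolicy, $allowIndexMagicWords, $default = 'index' ) {
	// An explicit per-article setting by the site admin wins outright.
	if ( $configuredPolicy !== null ) {
		return $configuredPolicy;
	}
	// Otherwise honour the magic word, but only if the wiki permits it
	// and the value is one of the two recognised policies.
	if ( $allowIndexMagicWords && ( $pagePolicy === 'index' || $pagePolicy === 'noindex' ) ) {
		return $pagePolicy;
	}
	return $default;
}

var_dump( resolveIndexPolicy( 'noindex', null, true ) );     // string(7) "noindex"
var_dump( resolveIndexPolicy( 'noindex', 'index', true ) );  // string(5) "index"
var_dump( resolveIndexPolicy( 'noindex', null, false ) );    // string(5) "index"
```

Letting the configured value win is only one reading of "almost certainly incorrect" in the commit comment; a per-namespace switch, which the comment also mentions, would fit in the same place as `$allowIndexMagicWords`.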
Index: trunk/phase3/RELEASE-NOTES
@@ -24,7 +24,8 @@

 === New features in 1.14 ===

-None yet
+* (bug 8068) New __INDEX__ and __NOINDEX__ magic words allow control of search
+engine indexing on a per-article basis.

 === Bug fixes in 1.14 ===


Past revisions this follows up on

Revision | Commit summary | Author | Date
r37968 | Refactor a bit preparatory to fixing bug 8068: rewrite the robot policy stuff... | simetrical | 19:05, 23 July 2008
