Extension:BrokenLinks

MediaWiki extensions manual
BrokenLinks

Release status: unmaintained
Implementation: Special page
Description: Special page which checks all links in table _externallinks and reports on those that return an HTTP response 4xx or 5xx
Author(s): Gary Thompson (sushiguru)
Latest version: 0.1.0 (2009-06-09)
MediaWiki: 1.14.0
License: GPL
Download: No link
Parameters: $wgUseAjax = true;
Added rights: restrict to sysops

  • PHP (Hypertext Preprocessor, originally Personal Home Page) is one of the most widely used scripting languages in the world. A scripting language is a high-level programming language for writing scripts. PHP was designed by Rasmus Lerdorf in 1995 for developing web applications.
  • Nowadays PHP focuses mainly on server-side scripting. It is used for creating dynamic websites and is supported by the majority of hosting providers. The PHP language and its interpreter (usually implemented as a native module of the web server) are developed by the PHP Group as an open-source project, so the source is available to study, change and distribute.
  • PHP is popular in web programming because of its simplicity, fast performance, wide functionality, cross-platform support (it works on multiple operating systems) and open-source code.
  • It is also widespread in website development because of its large number of built-in tools for working with web applications.
  • According to a 2013 ranking, PHP was in fifth place among programming languages. As of January 2013, PHP was installed on more than 240 million websites and 2.1 million web servers.
  • PHP can also be used for creating GUI (graphical user interface) applications.
  • Filename extensions of PHP: .php, .phtml, .php4, .php3, .php5, .phps.

News

  • 2009-06-09: Released version 0.1

Compatibility

Tested on our release of MediaWiki 1.14.0 only.

On MediaWiki 1.18.1 you need to make a small change in BrokenLinks.php (at about line 59).

Instead of

 $html = new OutputPage(); #this is going to be our object to add html to; allows us to use the nice OutputPage functionality

you have to write

 $context = RequestContext::getMain();
 $html = $context->getOutput(); #this is going to be our object to add html to; allows us to use the nice OutputPage functionality

Otherwise you will get this error:

Catchable fatal error: Argument 1 passed to ContextSource::setContext() must implement interface IContextSource, null given, called in /path/to/your/wiki/includes/OutputPage.php on line 228 and defined in /path/to/your/wiki/includes/RequestContext.php on line 348

Usage

Install the extension as described below, then go to Special:BrokenLinks on your wiki. Select the maximum number of errors you wish to report on and click the button.

Because the script steps through each URL in turn and has to wait out a connection timeout for every unreachable one, the response may take some time.
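As a rough sketch of the logic involved, the helpers below read the status code from the first line of an HTTP response and stop collecting errors once a limit is reached. A limit of 0 is treated as "no limit", matching the form's "Show All" option. The function names are hypothetical and not part of the extension, and treating a malformed status line as broken is an assumption of this sketch.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of the extension.

/**
 * Extract the three-digit status code from an HTTP status line,
 * e.g. "HTTP/1.1 404 Not Found". Returns null if the line is malformed.
 */
function sketchStatusCode( $statusLine ) {
	if ( preg_match( '#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m ) ) {
		return (int)$m[1];
	}
	return null;
}

/**
 * A link counts as broken when the status is 4xx or 5xx.
 * Assumption of this sketch: a malformed status line (null) also counts as broken.
 */
function sketchIsBroken( $statusCode ) {
	return $statusCode === null || ( $statusCode >= 400 && $statusCode <= 599 );
}

/**
 * Walk status lines keyed by URL and collect the broken ones,
 * stopping once $limit errors have been found. A limit of 0 means no limit.
 */
function sketchCollectBroken( array $statusLines, $limit ) {
	$broken = array();
	foreach ( $statusLines as $url => $line ) {
		if ( $limit > 0 && count( $broken ) >= $limit ) {
			break;
		}
		$code = sketchStatusCode( $line );
		if ( sketchIsBroken( $code ) ) {
			$broken[$url] = $code;
		}
	}
	return $broken;
}
```

Keeping the parsing separate from the network call makes it possible to check the classification logic without contacting any server.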

Download instructions

Please copy the code found below into files with the names shown and place them in $IP/extensions/BrokenLinks/.

  • Note #1: $IP stands for the root directory of your MediaWiki installation, the same directory that holds LocalSettings.php.
  • Note #2: You must provide a link to an Ajax loading image in the .js file that you create (BrokenLinks.js) in order for this extension to work. For example: var ajax_loader = '<img src="/images/ajax-loader.gif" alt="ajax loader image">';

Installation

To install this extension, add the following to LocalSettings.php:

require_once( "$IP/extensions/BrokenLinks/BrokenLinks.php" );
$wgUseAjax = true;

Code

extensions/BrokenLinks/BrokenLinks.php

<?php
/*
 * Main file for the BrokenLinks extension of MediaWiki.
 * This code is released under the GNU General Public License.
 *
 * Purpose:
 * Special Page which checks all links in table _externallinks and reports on those that return
 * an HTTP response 4xx or 5xx
 *
 * Usage:
 * require_once("extensions/BrokenLinks/BrokenLinks.php"); in LocalSettings.php
 * 
 * @package MediaWiki
 * @link http://www.mediawiki.org/wiki/Extension:BrokenLinks   Documentation
 * @license http://opensource.org/licenses/gpl-license.php GNU Public License
 * @version 0.1.0
 * Initial release
*/
 
/*
 * Register the extension with MediaWiki
*/
 
# Alert the user that this is not a valid entry point to MediaWiki if they try to access the special pages file directly.
if (!defined('MEDIAWIKI')) {
   print"To install my extension, put the following line in LocalSettings.php:<br />\n";
   print'require_once( "$IP/extensions/BrokenLinks/BrokenLinks.php" );<br />';
   print'Also ensure that $wgUseAjax=true; is added to LocalSettings.php to enable AJAX support.';
   exit( 1 );
}
 
$wgAjaxExportList[] = 'getBrokenLinks';
 
$wgExtensionCredits['specialpage'][] = array(
 'name' => 'BrokenLinks',
 'author' => 'Gary Thompson, University of St Andrews',
 'url' => 'http://www.mediawiki.org/wiki/Extension:BrokenLinks',
 # 'descriptionmsg' expects a message key, not text, so only the plain 'description' is given here
 'description' => 'Create a Special Page which checks all links in table _externallinks and reports on those that return an HTTP response 4xx or 5xx',
 'version' => '0.1.0'
);
 
$wgAutoloadClasses['BrokenLinks'] = dirname(__FILE__) . '/BrokenLinks.body.php';	# Tell MediaWiki to load the extension body.
$wgExtensionMessagesFiles['BrokenLinks'] = dirname(__FILE__) . '/BrokenLinks.i18n.php';	# (non) international settings
$wgSpecialPages['BrokenLinks'] = 'BrokenLinks'; 								# Let MediaWiki know we exist

function getBrokenLinks($error_lim){
 
	$fails = array(
          400=>'Bad Request',401=>'Unauthorized',402=>'Payment Required',404=>'Not Found',405=>'Method Not Allowed',406=>'Not Acceptable',
          407=>'Proxy Authentication Required',408=>'Request Timeout',409=>'Conflict',410=>'Gone',411=>'Length Required',412=>'Precondition Failed',
          413=>'Request Entity Too Large',414=>'Request-URI Too Long',415=>'Unsupported Media Type',416=>'Requested Range Not Satisfiable',
          417=>'Expectation Failed',500=>'Internal Server Error',501=>'Not Implemented',502=>'Bad Gateway',503=>'Service Unavailable',
          504=>'Gateway Timeout', 505=>'HTTP Version Not Supported'
          ); #list of server responses likely to mean we can't get through at all, leading to upset users.
	
	$allowable_protocols = array('http','https'); #all we really care about

	$html = new OutputPage(); #this is going to be our object to add html to; allows us to use the nice OutputPage functionality

	$dbr = wfGetDB( DB_SLAVE ); # create an instance to the database - read only
	$page = $dbr->tableName( 'page' );
	$externallinks = $dbr->tableName( 'externallinks' );
 
	$sql = "SELECT count(*) AS max_links FROM $externallinks";
	$res = $dbr->query( $sql );
	$row = $dbr->fetchRow( $res );
	$max_links = $row['max_links'];
 
	$error_limit = (!$error_lim) ? $max_links : $error_lim; #set the theoretical upper limit of links to check
	
	$html->addHTML("<h3>Error limit set at: $error_limit</h3>");
 
	$sql = 	"SELECT page_namespace AS namespace, page_title AS title, el_to AS url
		FROM $page,	$externallinks
		WHERE page_id=el_from
		GROUP BY el_to";
 
	$res = $dbr->query( $sql ); # run the SQL query

	$error_count = 0;
 
	$html->addHTML('<ol>');
 
	while ( $row = $dbr->fetchObject( $res ) ) {
		if($error_count >= $error_limit){
			break;	
		}
		$url = $row->url;
		$title = $row->title;
		$t = Title::makeTitle($row->namespace, $title); #get title object to play with; include the namespace so pages outside the main namespace link correctly

		# check to see if we can access the file at this URL
		$URLInfo = array();
		$url_parsed = true; #setup for a fail...
		$this_protocol = explode('://',$url);
		if(in_array(strtolower($this_protocol[0]),$allowable_protocols)){
			$URLInfo = @parse_url($url) or $url_parsed = false;
			if($url_parsed==false){ #FAIL!
				$html->addHTML("<li>Can't parse URL - this is a serious FAIL.  Probably a very badly formed URL with a typo."); #don't die - just raise the message
				$html->addHTML($html->addWikiText("[" . $t->getFullURL( 'action=edit' ) . " $title] has the url $url") . "</li>");
				$error_count++;
			}else{
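				$host = $URLInfo['host'];
				$is_https = (strtolower($URLInfo['scheme']) == 'https');
				$port = (isset($URLInfo['port'])) ? $URLInfo['port'] : ($is_https ? 443 : 80); #honour an explicit port, else use the scheme default
				$DocumentPath = (isset($URLInfo['path'])) ? $URLInfo['path'] : "/";
				if (isset($URLInfo['query'])){
					$DocumentPath = $DocumentPath."?".$URLInfo['query'];
				}
				#https needs the ssl:// transport (PHP must have OpenSSL support); 1 second connection timeout
				$conn = @fsockopen(($is_https ? 'ssl://' : '').$host, $port, $errno, $errstr, 1.0); 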
				if ($conn){ 
					fwrite ($conn, "HEAD ".$DocumentPath." HTTP/1.0\r\nHost: $host\r\n\r\n"); 
					$response= fgets($conn,13);
					$status = substr($response,-3);
					if (@array_key_exists($status,$fails)) { #FAIL!
						$html->addHTML("<li>ERROR::{$fails[$status]}");
						$html->addHTML($html->addWikiText("[" . $t->getFullURL( 'action=edit' ) . " $title] has the url $url") . "</li>");
						$error_count++;
					}
					fclose($conn); 
				}else{ #FAIL!
					$html->addHTML("<li>ERROR::Cannot Connect.  Not even a little bit.");
					$html->addHTML($html->addWikiText("[" . $t->getFullURL( 'action=edit' ) . " $title] has the url $url") . "</li>");
					$error_count++;
				}
			}
		}
	}
	$html->addHTML('</ol>');
	$dbr->freeResult( $res );
	# extract our lovely formatted HTML from the html object and send it back to the AJAX request.
	return $html->getHTML();
}
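
As a rough sketch, the request-building step of getBrokenLinks() can be isolated into a small helper that is easy to check on its own. The version below also accounts for an explicit port and for HTTPS. The name sketchHeadRequestParts and the shape of the returned array are hypothetical and not part of the extension.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the extension.

/**
 * Turn a URL into the pieces needed for a raw HEAD request over fsockopen().
 * Returns null for unparseable URLs and for schemes other than http/https.
 */
function sketchHeadRequestParts( $url ) {
	$parts = parse_url( $url );
	if ( $parts === false || !isset( $parts['scheme'], $parts['host'] ) ) {
		return null;
	}
	$scheme = strtolower( $parts['scheme'] );
	if ( $scheme !== 'http' && $scheme !== 'https' ) {
		return null;
	}

	// Default to the site root when the URL has no path, and keep any query string.
	$path = isset( $parts['path'] ) ? $parts['path'] : '/';
	if ( isset( $parts['query'] ) ) {
		$path .= '?' . $parts['query'];
	}

	$host = $parts['host'];
	$defaultPort = ( $scheme === 'https' ) ? 443 : 80;

	return array(
		// fsockopen() needs the ssl:// transport prefix for HTTPS (requires OpenSSL support in PHP).
		'target' => ( $scheme === 'https' ? 'ssl://' : '' ) . $host,
		'port' => isset( $parts['port'] ) ? $parts['port'] : $defaultPort,
		'request' => "HEAD $path HTTP/1.0\r\nHost: $host\r\n\r\n",
	);
}
```

The returned 'target' and 'port' values would be passed to fsockopen(), and 'request' written to the resulting connection with fwrite().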

extensions/BrokenLinks/BrokenLinks.body.php

<?php
class BrokenLinks extends SpecialPage {
 
        function __construct() {
 
                parent::__construct( 'BrokenLinks', 'editinterface' );
                wfLoadExtensionMessages('BrokenLinks');
 
        }
 
        function execute( $par ) {
 
                global $wgOut, $wgUseAjax, $wgJsMimeType, $wgScriptPath, $wgUser;

                # only users with the required right may run the check
                if ( !$this->userCanExecute($wgUser) ) {
                        $this->displayRestrictionError();
                        return;
                }

                # the results are fetched by an AJAX call, so AJAX must be enabled
                if (!$wgUseAjax) {
                        $wgOut->addWikiText('BrokenLinks: $wgUseAjax is not enabled, aborting extension setup.');
                        return;
                }
                $wgOut->addScript("<script type=\"{$wgJsMimeType}\" src=\"{$wgScriptPath}/extensions/BrokenLinks/BrokenLinks.js\"></script>\n" );
 
                $this->setHeaders();
 
                # set up some limits for users 
                $opts = '';
                $limits = array(1=>'1',5=>'5',10=>'10',20=>'20',50=>'50',100=>'100',0=>'Show All');
                foreach($limits as $key=>$val){
                        $opts .= "<option value=\"$key\">$val</option>\n";
                }
                $select = "<select name=\"limit\">$opts</select>";
 
                # set up the form for them to use
                $wgOut->addHTML("<form method=\"post\" action=\"\">");
                $wgOut->addHTML("<p>Limit number of errors: $select<input type=\"button\" value=\"Get Results\" onclick=\"getLinks(limit.value);\"></p>");
                $wgOut->addHTML("<p>Be aware that any more than 10 and you'll need to go pop the kettle on.</p>");
                $wgOut->addHTML("</form>");
                $wgOut->addHTML("<div id=\"divBrokenLinks\"></div>");
        }
}

extensions/BrokenLinks/BrokenLinks.i18n.php

<?php
$messages = array();
 
$messages['en'] = array( 
	'brokenlinks' => 'Broken Links',
);

extensions/BrokenLinks/BrokenLinks.js

function getLinks(lim){
 
	// replace the src below with the path to your own ajax progress bar image
	var ajax_loader = '<img src="/images/ajax-loader.gif" alt="ajax loader image">';
 
	// show a nice ajax loader to show *something* is happening (due to server responses, this page can take a while to load)
	document.getElementById('divBrokenLinks').innerHTML = ajax_loader;
 
	//now initialise the ajax call, sending response to divBrokenLinks
	sajax_do_call( "getBrokenLinks", [lim] , document.getElementById('divBrokenLinks'));
 
	return;
 
}

That's the lot.

See also