Extension talk:PureWikiDeletion

Have you considered empty rather than a special string like #DELETED? Ashley Y 08:45, 10 March 2010 (UTC)
 * It's a possibility. Some editors have pointed out that #DELETED makes it clear that the intent is to delete rather than vandalize, but a vandal could just as easily replace all content with #DELETED. It saves keystrokes to use empty too. On the other hand, if we use #DELETED, then blanking is unnecessary; we can just add a #DELETED tag to the beginning of a page and leave the rest of the content, so that people don't have to dig through the history to find what was there. Then again, that possibility also exists with #REDIRECT, but in practice almost everyone replaces all the existing page content with #REDIRECT. Tisane 23:19, 10 March 2010 (UTC)

Blanking should be logged?
The current spec says: The blanking of a page already appears on watchlists and recent changes, so I think this is redundant and unnecessary. However, it would probably be useful to be able to get a list of all blanked pages and filter and sort it by user and time of blanking, as we do with the deletion logs. So, I will create a special page with that functionality, and I figure out a way to filter the blanked pages out of AllPages. (Right now it shows up in AllPages as redlinks.) Tisane 09:24, 11 March 2010 (UTC)
 * The replacing of all content on a page with "#DELETED," or the removal of "#DELETED" from a page should be logged, and listed on watchlists and recent changes like the other logs (the move log, the protection log, etc.)

Are all these features necessary?

 * Blank pages will not show up in searches using Special:Search, random pages using Special:Randompage, or the list of all pages using Special:Allpages.
 * The URL prefix to a blank page should be listed in robots.txt, and the page will emit "noarchive" and "noindex" and "nofollow" tags, to prevent caching in search engines.

I can see why people wouldn't want it to show up in RandomPage or AllPages (although I haven't figured out yet how to implement that). But given that these pages are blanked, won't the external search engines stop directing people there anyway? Also, it still shows up in the internal search as "page title matches," but as a redlink. Maybe it's not so bad for it to continue showing up there. If it's a search term people are looking for a lot, maybe the article should remain there, or at least be a redirect? Maybe it should show up in search accompanied by the edit summary explaining why it was blanked. Tisane 12:42, 13 March 2010 (UTC)

htmlentities
OK, is that good enough to clear up the XSS issues? Tisane 17:56, 13 March 2010 (UTC)

Code review
Since there is some interest in getting this used on Wikipedia, I'll provide some initial code review, though it will still need review from a staff developer before it can be enabled.

setup code
'version' => '0.0.0', If you aren't going to give a useful number, you might as well just leave it blank.

$wgGroupPermissions['*']['PWD']   = false; It isn't necessary to explicitly set it to false, as that's the default. Also, this right doesn't seem to do anything.

function PWDSaveHook
global $wgOut; This isn't used anywhere

require( "extensions/PWD/PWDConfig.php" ); All configuration should be done in LocalSettings

$con = mysql_connect($yourHost,$yourUsername,$yourPassword); You should use the wiki database rather than creating your own, see Manual:Hooks/LoadExtensionSchemaUpdates for how to set up the database using update.php. MediaWiki also has its own classes for dealing with the database. See Manual:Database access

$sql=sprintf("INSERT INTO blanked_pages (blanked_page, blanking_user, blanking_timestamp, After you switch to using MediaWiki's database classes, you should use the wrapper functions for queries, as these automatically handle quoting and escaping

mysql_real_escape_string($article->getTitle)); Article::getTitle returns a Title object, not a string. As written, it should use $article->getTitle->getPrefixedDBkey, but it should really be storing the title and namespace index separately or just the pageid.

function PWDLink
global $wgRRAnonOnly, $wgUser,$wgOut,$wgDBserver,$wgDBuser,$wgDBpassword,$wgTitle, $wgServer, $wgScriptPath;

AFICT, $wgRRAnonOnly doesn't actually exist and none of these are used.

$PWDActivated=true; if ($PWDActivated==false){ $PWDActivated will never be false because you set it to true in the line directly above. Also, you should use either !$PWDActivated or $PWDActivated===false, depending on whether you need type checking as well

if( $target->isKnown ) { return true; } else { return true; } Just "return true" is shorter.

if ( in_array( 'known', $options ) ) { The existence check may not have been done at this point. You should probably use the LinkEnd hook.

function PWDEditHook
$date = date('M d, Y H:i:s', mktime($hour, $minute, $second, $month, $day, $year)); You should pass the timestamp directly to wfTimestamp

$user_url=$wgServer.$wgScriptPath.$URLStructForUserPage.$userPage; Use $wgUser->getSkin->link to make a link. Same for all the other links

$output=" A former version of this page was blanked by ".$user_HTML This should set a message in the i18n file and call it with something like wfMsg

parser function functions
These should call and check $parser->incrementExpensiveFunctionCount since they need to do a database query for each usage.

return htmlentities($param2); You shouldn't escape the output from the parser functions, as it still needs to go through the parser

PWD_body.php
p_ID int NOT NULL AUTO_INCREMENT, This should use the same prefix "blanked_" as the other fields

$sql="CREATE INDEX userind on blanked_pages (blanking_user)"; As far as I can see, "blanked_page" is the only thing used in a WHERE condition, so none of the other fields need an index.

PWDConfig.php
As noted, all configuration should be done in LocalSettings, and it shouldn't use its own database.

date_default_timezone_set ( 'America/Denver' ); This shouldn't be set in an extension unless there's a really good reason

Code review discussion
You may want to look over pages like Manual:Coding conventions. Mr.Z-man 05:02, 14 March 2010 (UTC)
 * Thanks for the code review; I will get to work on these items immediately. Tisane 07:19, 15 March 2010 (UTC)

Known bug
There is a known bug involving the improper use of $wgUser that causes the user and talk pages on the edit screen to be incorrectly linked, but it will be fixed in version 1.0.0. Tisane 02:53, 19 March 2010 (UTC)

Draft code posted
I posted some draft code. Have at it, Kim. If you start doing a major overhaul, you may wish to use the Template:Inuse. And feel free to add yourself to the introductory comments as a co-author. Tisane 03:50, 19 March 2010 (UTC)
 * Hmm, you don't have svn access? --Kim Bruning 21:47, 20 March 2010 (UTC)
 * I applied for it about a week into my programming endeavors, but didn't get a response so I figured I'd wait a little while before making further inquiries. Tisane 07:25, 21 March 2010 (UTC)

More Code Review
Just a point about your SQL schema. Fields within a table should use a consistent prefix; currently some of the fields have "blanked" and some use "blanking". The prefix should also be short; I'd suggest something like "blank". Also, timestamps should be varbinary(14), and ids should be UNSIGNED INT:

Happy ‑ melon 13:14, 19 March 2010 (UTC)

I had to moved the UNSIGNED around to get it to work with phpMyAdmin:

Tisane 12:56, 20 March 2010 (UTC)


 * Groovy. Also, I just noticed the CamelCase in parentID; we prefer either blank_parent_id or just blank_parent (the fact that it's an id is obvious from the type).  Happy ‑ melon 18:45, 20 March 2010 (UTC)
 * Do you notice any security issues, or can MaxSem's XSS notice be taken down? By the way, I'm having a bit of trouble with the wfTimestamp and $wguser->getSkin->link changes suggested above. Do you have some code or documentation you can point me in the direction of? Thanks, Tisane 19:25, 20 March 2010 (UTC)
 * Yes. A user Foo" href="http://evil.site blanking pages pirates the user page links AFAICT, as would blanking a page of the same name.  Far and away the easiest way to cure that is to solve the other major problem, which is hardcoded messages with no possibility for localisation.  As you can see, the parser has no problem safely displaying the hostile link, and will safely display any internal page link.  So if you put your notice text in a sytem message in /PWD.i18n.php as:

A former version of this page was blanked by User:$1 (talk) (contribs) on $2 The reason given for blanking was: $3. You may [ view the article's history], [ edit the last version], or type new page into the white space below.
 * And bung that through the parser, it'll be sure to come out safely (or if it doesn't, there are much bigger problems with MW). And the message can be translated into all three hundred languages MW supports.  And to get that HTML back again you need to call


 * Note that the reason will be escaped by the parser; no need for htmlentities. So you don't really need the link* functions.
 * Other notes:
 * You can parse MW timestamps to Unix with wfTimestamp( TS_UNIX, $timestamp ), no need to use string splitting
 * Use $wgTitle->exists instead of the db call in lines 119-122, 251, 275, and all the doesPageExist functions.
 * What is the function of $whatElse on line 96?
 * Implement styling with CSS, not hardcoded styles. So instead of adding $blankLinkStyle etc, add a class and style it in a /pwd.css (lowercase filenames for outward-facing files).  Note that 'blue' is not the default colour for wikilinks; but you shouldn't need to style ordinary known/broken links, only blank ones.
 * Keep up the good work! Happy ‑ melon 20:57, 20 March 2010 (UTC)

Next round
A few more comments: -- Mr.Z-man 22:11, 20 March 2010 (UTC)
 * In PWDLink, you can special-case (i.e. ignore) things that aren't in editable namespaces so that you don't spend time looking up special pages, Media: links and system messages (these are treated as bluelinks even if they don't exist locally)
 * In PWDEditHook, you should use the user's preferred date format. . This will also adjust it if the user has a different timezeone set.
 * In efPWDParserFunction* the correct way to increment the expensive function count is to call $parser->incrementExpensiveFunctionCount - this will return true or false to say whether to proceed. The parser functions should also probably return something, even if the expensive function count is exceeded. I think #ifexist returns false.
 * This can also special-case special pages and the like as noted above.
 * You should also check that $title isn't null, which would indicate an invalid title
 * You could also eliminate the (direct) query on the page table by getting the pageid with $title->getArticleID. It will return 0 if the article doesn't exist. And if the pageid has already been loaded for some reason before, it may not have to do a DB query.
 * In the DB schema, you don't need to add _ind after all the index names. Unless its an index on multiple columns, you can just use the column name.
 * It also isn't necessary to have an index on columns that will never be used in a WHERE or JOIN clause - these take up unnecessary space and can slow down writes.
 * You may want to split the hook functions out to a _body.php or .hooks.php file and make them static functions of a class. Then you can load the class with $wgAutoloadClasses and call the static functions in the hooks. This is sort of the de facto preferred style for extensions (at least for longer functions).
 * You should also hook on ArticleDeleteComplete to remove deleted pages from the blanked_pages table.