Extension:SpecialMultiUploadViaZip

Special:MultiUploadViaZip is a special page written by User:dme26 that closely resembles the Special:Upload page but accepts a ZIP file rather than target media directly. This ZIP file is decompressed on the server and passed file by file into the existing MediaWiki Special:Upload page. It effects multiple file uploads (aka mass upload, bulk upload, etc).

Because its implementation is so simple (i.e. it's a complete hack!), it does not provide useful feedback about which files in the ZIP file were not accepted. Indeed, when a MultiUploadViaZip invocation succeeds, the web page returned is that of the final file passed successfully through the MediaWiki Special:Upload page.

I don't generally need this feedback, and can't currently spend the time to learn the relevant parts of the MediaWiki internals to fix this, but it is definitely something that would need to be improved were this script to be used widely.

The Special:MultiUploadViaZip page includes a prefix field that prepends the provided string onto all filenames expanded from the ZIP file.
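A minimal sketch of the prefix behaviour follows. The helper name `buildDestinationNames` is hypothetical, and skipping directory entries and keeping only the final path component are assumptions of this sketch; the extension itself simply concatenates the prefix and the entry name.

```php
<?php
// Hypothetical helper, not part of the extension: derive upload names
// from ZIP entry names by prepending the user-supplied prefix.
function buildDestinationNames( array $entryNames, $prefix ) {
    $names = array();
    foreach ( $entryNames as $entry ) {
        // Assumption: directory entries (names ending in "/") are skipped.
        if ( substr( $entry, -1 ) === '/' ) {
            continue;
        }
        // Assumption: only the final path component is kept.
        $names[] = $prefix . basename( $entry );
    }
    return $names;
}

// Example entry names and prefix are made up for illustration.
print_r( buildDestinationNames(
    array( 'trip/', 'trip/beach.jpg', 'map.png' ),
    'Holiday2007-'
) );
```

Choosing a distinctive prefix makes it easier to find everything from one ZIP file later, in the same way as the category tag in the description field.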

Internally this special page extends the Special:Upload classes to facilitate its script-driven uploads. Some of the fields the Upload form accepts, notably the description field, are passed unchanged to the Upload form for each file decompressed from the ZIP file. I usually insert a category tag into this description field, so that the uploaded images are grouped under a MediaWiki category.

The usual PHP upload limits will be applied to uploaded ZIP files.
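The limits that normally apply to form uploads are the `upload_max_filesize` and `post_max_size` settings in php.ini. The sketch below prints both in bytes; the helper name `iniSizeToBytes` is made up for this example and only handles the K, M and G suffixes.

```php
<?php
// Hypothetical helper: convert a php.ini shorthand size ("2M", "512K", "1G")
// into a number of bytes. Values without a suffix are taken as bytes.
function iniSizeToBytes( $value ) {
    $value = trim( (string)$value );
    $number = (float)$value;                      // leading numeric part
    $unit = strtoupper( substr( $value, -1 ) );   // last character, if any
    switch ( $unit ) {
        case 'G': return (int)( $number * 1024 * 1024 * 1024 );
        case 'M': return (int)( $number * 1024 * 1024 );
        case 'K': return (int)( $number * 1024 );
        default:  return (int)$number;
    }
}

// Show the two settings that cap the size of the uploaded ZIP file.
foreach ( array( 'upload_max_filesize', 'post_max_size' ) as $setting ) {
    $raw = ini_get( $setting );
    echo $setting . ' = ' . $raw . ' (' . iniSizeToBytes( $raw ) . " bytes)\n";
}
```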

NOTE: There is currently no check against the size of files during the decompression process. This could fairly easily lead to DoS attacks. As Erik Moeller helpfully points out, this could be quite simply defeated by adding a check against the  variable in the code below. I haven't done this yet, but please feel free to do so, or I'll eventually do it - the file-size limit should presumably be the same as one of the existing PHP configuration variables.
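One possible shape for such a check is sketched below, under stated assumptions: it uses PHP's `ZipArchive` class rather than the `zip_*` functions the extension calls, the function names are hypothetical, and the limit is whatever byte count the caller chooses. The sizes come from the archive's own headers, so this is a first line of defence rather than a guarantee.

```php
<?php
// Hypothetical helper: given "entry name => uncompressed size" pairs,
// return the names whose size exceeds $maxBytes.
function findOversizedEntries( array $sizesByName, $maxBytes ) {
    $tooBig = array();
    foreach ( $sizesByName as $name => $size ) {
        if ( $size > $maxBytes ) {
            $tooBig[] = $name;
        }
    }
    return $tooBig;
}

// Hypothetical helper: read the declared uncompressed size of every entry
// without extracting anything. Requires the zip extension (ZipArchive).
// Returns false if the archive cannot be opened.
function readEntrySizes( $zipPath ) {
    $zip = new ZipArchive();
    if ( $zip->open( $zipPath ) !== true ) {
        return false;
    }
    $sizes = array();
    for ( $i = 0; $i < $zip->numFiles; $i++ ) {
        $stat = $zip->statIndex( $i );
        if ( $stat !== false ) {
            $sizes[$stat['name']] = $stat['size'];
        }
    }
    $zip->close();
    return $sizes;
}

// Illustration with made-up sizes and a 1 MiB limit.
print_r( findOversizedEntries(
    array( 'small.jpg' => 20000, 'huge.tif' => 900000000 ),
    1024 * 1024
) );
```

In the extension's own loop the equivalent value is the result of `zip_entry_filesize($zip_entry)`, so the same comparison could be made there before `zip_entry_read` is called.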

Motivation
This approach has many rough edges - my guess is that this function will quickly appear in the core MediaWiki release developed by those with a far better roadmap of MediaWiki internals than I have. I needed the function right away, though, so thought I should provide it anyway for whatever interest or convenience it might raise.

Note that I only run this extension on intranet wikis, and only then after authenticating users. Many aspects of its implementation could be easily and usefully improved.

In the code the marker "dme26" is used to identify code modifications I've made within long blocks of source copied from existing files - in these cases the original class hierarchy did not allow me to selectively override what I needed to.

I recently discovered SpecialUploadLocal (and used its description to structure this page). It strikes me these two approaches could be easily and usefully merged if they haven't been already...

Installation
Place SpecialMultiUploadViaZip.php into your extensions directory. As usual for extensions, place within your LocalSettings.php file the line:

require_once('extensions/SpecialMultiUploadViaZip.php');

You will need to ensure your PHP installation includes Zip file functions. On Windows machines one needs to ensure the Zip extension is enabled in your PHP configuration. The Unix managed hosts I've used seem generally to have this support out of the box.
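A quick way to check for Zip support is sketched below. The function name `zipSupportStatus` is hypothetical. The extension relies on the procedural functions (`zip_open`, `zip_read` and friends), which are deprecated as of PHP 8.0, so on newer installations the first and third checks may pass while the second does not.

```php
<?php
// Hypothetical helper: report which flavours of Zip support are present.
function zipSupportStatus() {
    return array(
        // Is the zip extension loaded at all?
        'extension_loaded' => extension_loaded( 'zip' ),
        // Are the procedural functions used by this extension available?
        'zip_open'         => function_exists( 'zip_open' ),
        // Is the object-oriented API available?
        'ZipArchive'       => class_exists( 'ZipArchive' ),
    );
}

var_export( zipSupportStatus() );
echo "\n";
```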

Known issues

 * Output from this special page is lost, and no feedback is provided regarding the success of intermediate file uploads. I use the description field on the Special:MultiUploadViaZip page, passed to each upload form, to add a category tag, and thus can browse the category to check all files were successfully uploaded.

License
License details are in the source files (nothing surprising - GPL of course).

Code
For now the code is only provided here. I'd eyeball this page's change log to ensure that you agree with subsequent edits (i.e. those that appear to add code to email your secret data to random-looking IP addresses, etc).

SpecialMultiUploadViaZip.php
<?php
/*
 Version: 0.2 (i.e. horrible hacking)

 This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2 of the License.

 This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

 You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/

/*

This extension provides a mechanism to upload multiple images into MediaWiki by uploading a ZIP file through an HTML form. This ZIP file is then split into separate files, each of which is fed through the normal MediaWiki upload script.

I wrote it originally with only my personal use in mind - i.e. I know it's a horrible hack! It might however still be useful to our community in some way...

I use the marker "dme26" to identify code modifications I've made within long blocks of source copied from existing files - in these cases the original class hierarchy did not allow me to selectively override what I needed to.

TODO: provide sensible feedback on the progress of uploads.

TODO: ensure that left-over temporary ZIP files are always removed, even when code bugs prevent the included clean-ups from running.

*/



$wgExtensionFunctions[] = "wfExtensionSpecialMultiUploadViaZip";

function wfExtensionSpecialMultiUploadViaZip() {
    global $wgMessageCache;
    $wgMessageCache->addMessages( array(
        'multiuploadviazip'  => 'Multiple file upload via ZIP',
        'fromzipfileprefix'  => 'Prefix for every uploaded file',
        'fromzipdescription' => 'Summary content for every uploaded file'
    ) );
    SpecialPage::addPage( new SpecialPage( 'MultiUploadViaZip' ) );

    /* NOTE: if PHP fails mid-script, temporary zip files may be left behind.
       There are many ways one might fix this. For my particularly simple
       needs, I sometimes uncomment the following system callout - necessary
       for me on some of my hosted webspace because the temporary files are
       not owned by my personal user. */

    // system('rm /tmp/mwMultiUploadZIP*');
}

require_once 'SpecialUpload.php';

class UploadZip extends UploadForm {

    // overload the constructor to allow us to set the filename, etc.
    function UploadZip( &$request, $targetFileName, $tmpFileName, $tmpFileSize ) {
        $this->mDestFile          = $targetFileName;

        $this->mIgnoreWarning     = true;
        $this->mReUpload          = false;
        $this->mUpload            = true;

        $this->mUploadDescription = $request->getText( 'wpUploadDescription' );
        $this->mLicense           = $request->getText( 'wpLicense' );
        $this->mUploadCopyStatus  = $request->getText( 'wpUploadCopyStatus' );
        $this->mUploadSource      = $request->getText( 'wpUploadSource' );
        $this->mWatchthis         = $request->getBool( 'wpWatchthis' );
        wfDebug( "UploadZip: watchthis is: '$this->mWatchthis'\n" );

        $this->mAction            = 'submit';
        $this->mSessionKey        = false;
        $this->mUploadTempName    = $tmpFileName;
        $this->mUploadSize        = $tmpFileSize;
        $this->mOname             = $targetFileName;
        $this->mUploadError       = false;
        $this->mSessionKey        = false;
        $this->mStashed           = false;
        $this->mRemoveTempFile    = false;
    }

    function saveUploadedFile( $saveName, $tempName ) {
        global $wgOut;

        $fname = "SpecialMultiUploadViaZip::saveUploadedFile";

        $dest = wfImageDir( $saveName );
        $archive = wfImageArchiveDir( $saveName );
        if ( !is_dir( $dest ) ) wfMkdirParents( $dest );
        if ( !is_dir( $archive ) ) wfMkdirParents( $archive );
        $this->mSavedFile = "{$dest}/{$saveName}";

        if ( is_file( $this->mSavedFile ) ) {
            // An older version exists: move it into the archive directory.
            $this->mUploadOldVersion = gmdate( 'YmdHis' ) . "!{$saveName}";
            wfSuppressWarnings();
            $success = rename( $this->mSavedFile, "{$archive}/{$this->mUploadOldVersion}" );
            wfRestoreWarnings();

            if ( !$success ) {
                $wgOut->fileRenameError( $this->mSavedFile,
                    "{$archive}/{$this->mUploadOldVersion}" );
                return false;
            } else {
                wfDebug( "$fname: moved file " . $this->mSavedFile . " to {$archive}/{$this->mUploadOldVersion}\n" );
            }
        } else {
            $this->mUploadOldVersion = '';
        }

        wfSuppressWarnings();
        // dme26 modified this to use move!
        $success = rename( $tempName, $this->mSavedFile );
        wfRestoreWarnings();

        if ( !$success ) {
            $wgOut->fileCopyError( $tempName, $this->mSavedFile );
            return false;
        } else {
            wfDebug( "$fname: wrote tempfile $tempName to " . $this->mSavedFile . "\n" );
        }

        chmod( $this->mSavedFile, 0644 );
        return true;
    }
}

class UploadZIPForm extends UploadForm {
    function processUpload() {

        // dme26: mostly directly copied from UploadForm, but changed
        // extension detection to only work with ZIP files.

        global $wgUser, $wgOut, $wgRequest;

        /* Check for PHP error if any, requires php 4.2 or newer */
        if ( $this->mUploadError == 1/*UPLOAD_ERR_INI_SIZE*/ ) {
            $this->mainUploadForm( wfMsgHtml( 'largefileserver' ) );
            return;
        }

        /**
         * If there was no filename or a zero size given, give up quick.
         */
        if ( trim( $this->mOname ) == '' || empty( $this->mUploadSize ) ) {
            $this->mainUploadForm( wfMsgHtml( 'emptyfile' ) );
            return;
        }

        # Chop off any directories in the given filename
        if ( $this->mDestFile ) {
            $basename = wfBaseName( $this->mDestFile );
        } else {
            $basename = wfBaseName( $this->mOname );
        }

        /**
         * We'll want to blacklist against *any* 'extension', and use
         * only the final one for the whitelist.
         */
        list( $partname, $ext ) = $this->splitExtensions( $basename );
        if ( count( $ext ) ) {
            $finalExt = $ext[count( $ext ) - 1];
        } else {
            $finalExt = '';
        }
        $fullExt = implode( '.', $ext );

        # If there was more than one "extension", reassemble the base
        # filename to prevent bogus complaints about length
        if ( count( $ext ) > 1 ) {
            for ( $i = 0; $i < count( $ext ) - 1; $i++ )
                $partname .= '.' . $ext[$i];
        }

        if ( strlen( $partname ) < 3 ) {
            $this->mainUploadForm( wfMsgHtml( 'minlength' ) );
            return;
        }


        # dme26: Skip creating an Image page in MediaWiki for the ZIP file.

        /* Ensure we have a .zip extension */
        if ( !$this->checkFileExtension( $finalExt, array( 'zip' ) ) ) {
            return $this->uploadError( wfMsgHtml( 'badfiletype', htmlspecialchars( $fullExt ) ) );
        }

        # dme26: skip file extension black-list checking (it has to be a ZIP file)

        /**
         * Look at the contents of the file; if we can recognize the
         * type but it's corrupt or data of the wrong type, we should
         * probably not accept it.
         */
        if ( !$this->mStashed ) {
            $this->checkMacBinary();
            $veri = $this->verify( $this->mUploadTempName, $finalExt );

            if ( $veri !== true ) { // it's a wiki error...
                return $this->uploadError( $veri->toString() );
            }
        }


        # dme26: I removed this out of caution - I don't use it, but I'm not
        # sure existing users of these hooks would be able to run them
        # unmodified?
        #     * Provide an opportunity for extensions to add further checks
        #    $error = '';
        #    if( !wfRunHooks( 'UploadVerification',
        #         array( $this->mUploadSaveName, $this->mUploadTempName, &$error ) ) ) {
        #      return $this->uploadError( $error );
        #    }

        # dme26: I don't provide interactivity on the individual Image
        # submissions from the ZIP file. Thus I remove the code to provide
        # intermediate warning pages - each of the ZIP file's uploads either
        # succeeds or fails silently.

        # testing:   $wgOut->addHTML(" PHP data: \$this->mUploadSaveName= $this->mUploadSaveName \$this->mUploadTempName= $this->mUploadTempName \$hasBeenMunged= $hasBeenMunged ");

        $wgOut->addHTML(" Attempting to read ZIP file contents: ");

        $zip = zip_open($this->mUploadTempName);
        $fileprefix = $wgRequest->getText('wpFilePref');

        // zip_open() returns an error number (not false) on some failures,
        // so test for a resource rather than for truthiness.
        if (is_resource($zip)) {

            while ($zip_entry = zip_read($zip)) {
                $targetfname = zip_entry_name($zip_entry);
                $wgOut->addHTML("Name:             " . $targetfname . "\n");
                $tmpfsize = zip_entry_filesize($zip_entry);
                $wgOut->addHTML("Actual Filesize:   " . $tmpfsize . "\n");
                $wgOut->addHTML("Compressed Size:   " . zip_entry_compressedsize($zip_entry) . "\n");
                $wgOut->addHTML("Compression Method: " . zip_entry_compressionmethod($zip_entry) . "\n");

                if (zip_entry_open($zip, $zip_entry, "r")) {
                    // Extract the entry into a temporary file.
                    $tmpfname = tempnam("/tmp", "mwMultiUploadZIP-");
                    $tmpfsize = zip_entry_filesize($zip_entry);
                    $buf = zip_entry_read($zip_entry, $tmpfsize);
                    zip_entry_close($zip_entry);

                    $handle = fopen($tmpfname, "w");
                    fwrite($handle, $buf);
                    fclose($handle);

                    $wgOut->addHTML("Wrote successfully to file: $tmpfname\n");

                    // do each upload
                    $uploadEach = new UploadZip($wgRequest, $fileprefix.$targetfname, $tmpfname, $tmpfsize );
                    $uploadEach->execute();
                    $wgOut->addHTML("Returned from upload form execute call.\n");
                    unlink($tmpfname);
                }
                $wgOut->addHTML("\n");
            }
            zip_close($zip);
        }
        $wgOut->addHTML("");
    }


    # The rest of this file involves material copied from UploadForm that
    # needed to be modified in various ways. Ideally this could be
    # refactored to make better use of class inheritance.

    /**
     * There's something wrong with this file, not enough to reject it
     * totally but we require manual intervention to save it for real.
     * Stash it away, then present a form asking to confirm or cancel.
     *
     * @param string $warning as HTML
     * @access private
     */
    function uploadWarning( $warning ) {
        global $wgOut;
        global $wgUseCopyrightUpload;

        $this->mSessionKey = $this->stashSession();
        if ( !$this->mSessionKey ) {
            # Couldn't save file; an error has been displayed so let's go.
            return;
        }

        $wgOut->addHTML( " " . wfMsgHtml( 'uploadwarning' ) . " \n" );
        $wgOut->addHTML( "{$warning} \n" );

        $save = wfMsgHtml( 'savefile' );
        $reupload = wfMsgHtml( 'reupload' );
        $iw = wfMsgWikiHtml( 'ignorewarning' );
        $reup = wfMsgWikiHtml( 'reuploaddesc' );
        $titleObj = Title::makeTitle( NS_SPECIAL, 'MultiUploadViaZip' );
        $action = $titleObj->escapeLocalURL( 'action=submit' );

if ( $wgUseCopyrightUpload ) {     $copyright =  "  mUploadCopyStatus ) . "\" />  mUploadSource ) . "\" />  "; } else { $copyright = ""; }

$wgOut->addHTML( "         mSessionKey ) . "\" />    mUploadDescription ) . "\" />    mLicense ) . "\" />    mDestFile ) . "\" />    mWatchthis ) ) . "\" />  {$copyright}   \n" ); }

    /**
     * Displays the main upload form, optionally with a highlighted
     * error message up at the top.
     *
     * @param string $msg as HTML
     * @access private
     */
    function mainUploadForm( $msg='' ) {
        global $wgOut, $wgUser;
        global $wgUseCopyrightUpload;

        $cols = intval( $wgUser->getOption( 'cols' ) );
        $ew = $wgUser->getOption( 'editwidth' );
        if ( $ew ) $ew = " style=\"width:100%\"";
        else $ew = '';

        if ( '' != $msg ) {
            $sub = wfMsgHtml( 'uploaderror' );
            $wgOut->addHTML( " {$sub} \n" .
                " {$msg} \n" );
        }
        $wgOut->addHTML( ' ' );
        $wgOut->addWikiText( wfMsg( 'uploadtext' ) );
        $wgOut->addHTML( ' ' );
        $sk = $wgUser->getSkin();

        $sourcefilename = wfMsgHtml( 'sourcefilename' );
        $destfilename = wfMsgHtml( 'destfilename' );
        $fileprefix = wfMsgWikiHtml( 'fromzipfileprefix' ); // dme26
        $summary = wfMsgWikiHtml( 'fromzipdescription' ); // dme26

        $licenses = new Licenses();
        $license = wfMsgHtml( 'license' );
        $nolicense = wfMsgHtml( 'nolicense' );
        $licenseshtml = $licenses->getHtml();

        $ulb = wfMsgHtml( 'uploadbtn' );

        $titleObj = Title::makeTitle( NS_SPECIAL, 'MultiUploadViaZip' );
        $action = $titleObj->escapeLocalURL();

        $encDestFile = htmlspecialchars( $this->mDestFile );

        $watchChecked = $wgUser->getOption( 'watchdefault' )
            ? 'checked="checked"'
            : '';

$wgOut->addHTML( "   " ); }

}

function wfSpecialMultiUploadViaZip() {
    // Use an UploadZIPForm to acquire a zip file. Then push each file
    // in the ZIP through an instance of a normal UploadForm.

    global $wgRequest;
    $form = new UploadZIPForm( $wgRequest );
    $form->execute();
}

?>