Manual talk:DeleteOldRevisions.php
This page used the Structured Discussions extension to give structured discussions. It has since been converted to wikitext, so the content and history here are only an approximation of what was actually displayed at the time these comments were made.
Option to keep certain number of revisions
Hi,
I've been digging a bit, and our wiki has pages with a huge number of revisions. I don't want to remove all old revisions, but I don't need to keep everything either. What I would like is an option to keep a certain number of revisions, given as a parameter, e.g. 5, so that the number of revisions of each page is taken into account when deleting from the revision table. If a page has 5 or fewer revisions, none are removed. If a page has more than 5 revisions, all older revisions are removed except the most recent 5. I've copied DeleteOldRevisions.php to DeleteOldRevisions_Keep.php and am working on modifying it, but it seems to be a tough job.
I'm making progress. The queries:
mysql> select rev_id from revision where rev_page=5591 order by rev_id desc limit 5;
+--------+
| rev_id |
+--------+
| 37402 |
| 37401 |
| 37400 |
| 37399 |
| 37398 |
+--------+
5 rows in set (0.00 sec)
mysql> select rev_id from revision where rev_page=5592 order by rev_id desc limit 5;
+--------+
| rev_id |
+--------+
| 37295 |
| 37294 |
| 37293 |
+--------+
3 rows in set (0.00 sec)
rev_page 5591 has 27 revisions and rev_page 5592 has 3 revisions.
Now I was wondering what will happen if I undo the latest revision for page 5591 to revert it back to 37401. Fortunately this gives me a new revision 37406, which gives me the clue that I can use above query to clean up everything except for the latest 5 revisions. DikkieDick (talk) 12:45, 21 March 2017 (UTC)
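The idea above, keeping the newest N revision IDs per page and treating the rest as deletable, can be sketched outside MediaWiki. The following is a minimal illustration against an in-memory SQLite table that only mimics the `rev_id` and `rev_page` columns; the function name `revisions_to_delete` and the sample data are made up for the example, and it is not the maintenance script itself.

```python
import sqlite3


def revisions_to_delete(conn, keep):
    """Return rev_ids that are NOT among the newest `keep` revisions of their page."""
    doomed = []
    pages = conn.execute(
        "SELECT DISTINCT rev_page FROM revision ORDER BY rev_page").fetchall()
    for (page_id,) in pages:
        # Same shape as the query in the post: newest first, limited to `keep`.
        kept = [r[0] for r in conn.execute(
            "SELECT rev_id FROM revision WHERE rev_page = ? "
            "ORDER BY rev_id DESC LIMIT ?", (page_id, keep))]
        all_ids = [r[0] for r in conn.execute(
            "SELECT rev_id FROM revision WHERE rev_page = ? ORDER BY rev_id",
            (page_id,))]
        doomed.extend(i for i in all_ids if i not in kept)
    return doomed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER)")
# Hypothetical data: page 1 has 8 revisions, page 2 has 3.
rows = [(i, 1) for i in range(100, 108)] + [(i, 2) for i in range(200, 203)]
conn.executemany("INSERT INTO revision VALUES (?, ?)", rows)

old = revisions_to_delete(conn, keep=5)
print(old)  # the three oldest revisions of page 1; page 2 is untouched
```

Because an undo creates a new revision with a higher `rev_id`, ordering by `rev_id` descending still selects the most recent revisions after a revert.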
- After some testing it's finished:
- Code:
- <?php
- /**
- * Delete old revisions from the database and keep the latest 'N' revisions (default 10)
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
- * http://www.gnu.org/copyleft/gpl.html
- *
- * @file
- * @ingroup Maintenance
- * @author Dick Pluim <dick.pluim@gmail.com>
- * (Based on deleteOldRevisions.php by Rob Church)
- */
- require_once __DIR__ . '/Maintenance.php';
- /**
- * Maintenance script that deletes old revisions from the database and keep the latest 'N' revisions (default 10).
- *
- * @ingroup Maintenance
- */
- class DeleteOldRevisions extends Maintenance {
-     public function __construct() {
-         parent::__construct();
-         $this->addDescription( 'Delete old revisions from the database and keep the latest N revisions (default 10)' );
-         $this->addOption( 'delete', 'Actually perform the deletion' );
-         $this->addOption( 'page_id', 'List of page ids to work on', false );
-     }
-     public function execute() {
-         $this->output( "Delete old revisions\n\n" );
-         $this->doDelete( $this->hasOption( 'delete' ), $this->mArgs );
-     }
-     function doDelete( $delete = false, $args = [] ) {
-         # Data should come off the master, wrapped in a transaction
-         $dbw = $this->getDB( DB_MASTER );
-         $this->beginTransaction( $dbw, __METHOD__ );
-         $revConds = "";
-         $keepRevs = [];
-         $keepLimit = 10; # default
-         # If a parameter is given, we assume that this is the number of revisions to keep.
-         # only first argument is being used.
-         if ( count( $args ) > 0 ) {
-             $keepLimit=$args[0];
-             $this->output( "Keeping " . $keepLimit . " revisions\n" );
-         }
-         # make the pagelist
-         $res = $dbw->select( 'page', 'page_id', 'page_id>0', array( 'ORDER BY' => 'page_id ASC' ));
-         foreach ( $res as $row ) {
-             $revConds = "rev_page = $row->page_id order by rev_id desc limit $keepLimit" ;
-             # make the list of revisions we want to keep for this page
-             $res2 = $dbw->select ( 'revision', 'rev_id' , $revConds, __METHOD__);
-             foreach ( $res2 as $row2 ) {
-                 $keepRevs[] = $row2->rev_id ;
-             }
-         }
-         # Make the list of revisions which will be deleted
-         $revConds = 'rev_id NOT IN (' . $dbw->makeList( $keepRevs ) . ')';
-         $res = $dbw->select( 'revision', 'rev_id', $revConds, __METHOD__ );
-         $oldRevs = [];
-         foreach ( $res as $row ) {
-             $oldRevs[] = $row->rev_id;
-         }
-         $this->output( "done.\n" );
-         # Inform the user of what we're going to do
-         $count = count( $oldRevs );
-         $this->output( "$count old revisions found.\n" );
-         # Delete as appropriate
-         if ( $delete && $count>0 ) {
-             $this->output( "Deleting..." );
-             $dbw->delete( 'revision', [ 'rev_id' => $oldRevs ], __METHOD__ );
-             $this->output( "done.\n" );
-         }
-         # This bit's done
-         # Purge redundant text records
-         $this->commitTransaction( $dbw, __METHOD__ );
-         if ( $delete ) {
-             $this->purgeRedundantText( true );
-         }
-     }
- }
- $maintClass = "DeleteOldRevisions";
- require_once RUN_MAINTENANCE_IF_MAIN;
- --------------
- Output:
- [root@server maintenance]# php deleteOldRevisions_keep.php
- Delete old revisions
- Keeping 10 revisions
- PHP Notice: Array to string conversion in /u01/mediawiki/tst/includes/db/Database.php on line 808
- done.
- 2534 old revisions found.
- [root@server maintenance]# php deleteOldRevisions_keep.php 5
- Delete old revisions
- Keeping 5 revisions
- PHP Notice: Array to string conversion in /u01/mediawiki/tst/includes/db/Database.php on line 808
- done.
- 6103 old revisions found.
- [root@server maintenance]# php deleteOldRevisions_keep.php --delete 15
- Delete old revisions
- Keeping 15 revisions
- PHP Notice: Array to string conversion in /u01/mediawiki/tst/includes/db/Database.php on line 808
- done.
- 2026 old revisions found.
- Deleting...done.
- Searching for active text records in revisions table...done.
- Searching for active text records in archive table...done.
- Searching for inactive text records...done.
- 2024 inactive items found.
- Deleting...done.
- Tested with first 50 and then going slightly further down... ;-)
- Can't figure out why I get the PHP Notice above. And there is sometimes a mismatch between old revisions found and inactive items found, but it's working in my test-environment.
- Running it a second time:
- [root@server maintenance]# php deleteOldRevisions_keep.php --delete 15
- Delete old revisions
- Keeping 15 revisions
- PHP Notice: Array to string conversion in /u01/mediawiki/tst/includes/db/Database.php on line 808
- done.
- 0 old revisions found.
- Searching for active text records in revisions table...done.
- Searching for active text records in archive table...done.
- Searching for inactive text records...done.
- 0 inactive items found. DikkieDick (talk) 07:18, 23 March 2017 (UTC)
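One plausible reason for the mismatch between "old revisions found" and "inactive items found" (2026 versus 2024 above) is that revisions and text rows are not strictly one-to-one: in the pre-MCR schema several revisions can point at the same `rev_text_id`, so a text row stays active while any surviving revision still uses it. A minimal sketch of that counting effect, on made-up data in plain Python rather than the real `purgeRedundantText()` code:

```python
def inactive_text_ids(all_text_ids, revisions, deleted_rev_ids):
    """Text rows no longer referenced by any surviving revision.

    `revisions` maps rev_id -> text row id (rev_text_id in the old schema).
    """
    still_used = {text_id for rev_id, text_id in revisions.items()
                  if rev_id not in deleted_rev_ids}
    return sorted(set(all_text_ids) - still_used)


# Hypothetical page history: revision 3 is a null edit (e.g. a page move)
# that reuses the text row of revision 2.
revisions = {1: 10, 2: 11, 3: 11, 4: 12}
deleted = {1, 2}  # the two "old" revisions being removed

orphans = inactive_text_ids([10, 11, 12], revisions, deleted)
print(len(deleted), "revisions deleted,", len(orphans), "text rows inactive")
```

Here two revisions are deleted but only one text row becomes inactive, because text row 11 is still referenced by the surviving revision 3.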
- Hi Dick,
- Your option is a great addition to the script! It would be great if you could create an issue in Phabricator and put the change up for review, so that it can be added to the MediaWiki tarball and everyone can benefit from it! 2001:16B8:10D2:A900:497:48D0:EC41:2B7B (talk) 20:14, 6 January 2018 (UTC)
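On the "Array to string conversion" notice reported earlier in this thread: a likely cause is the page-list query, where `array( 'ORDER BY' => 'page_id ASC' )` is passed as the fourth argument of `select()`. In MediaWiki's database API the fourth parameter is the caller name (`$fname`) and the query options come fifth, so the array ends up being used as a string. The toy Python function below is a hypothetical stand-in with the same parameter order, only meant to show how the argument lands in the wrong slot; it does not reproduce how MediaWiki builds its SQL.

```python
def select(table, fields, conds="", fname="unknown", options=None):
    """Hypothetical stand-in with the order: table, fields, conds, fname, options."""
    options = options or {}
    sql = f"SELECT {fields} FROM {table}"
    if conds:
        sql += f" WHERE {conds}"
    if "ORDER BY" in options:
        sql += f" ORDER BY {options['ORDER BY']}"
    # In this toy version the caller name is only appended as a comment.
    return f"{sql} /* {fname} */"


order = {"ORDER BY": "page_id ASC"}

# As in the posted script: the options land in the fname slot, so no ordering is applied.
wrong = select("page", "page_id", "page_id>0", order)

# With the caller name in fourth position the options are honoured.
right = select("page", "page_id", "page_id>0", "doDelete", order)

print(wrong)
print(right)
```

If that reading is right, passing `__METHOD__` as the fourth argument and the options array as the fifth in the PHP call should make the notice go away.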
Option to remove only "minor edits" ?
Hi,
I'm also trying to find a good compromise between radically removing the history and storing lots of useless information. What I think would be best is to be able to remove all the old "minor edits" from the history. Unfortunately, my coding skills are not sufficient for that... If someone has an idea...
Thanks Pseudomino (talk) 19:36, 18 November 2017 (UTC)
- An option to remove only the edits marked as "minor" does not currently exist, and adding one would cause problems.
- First of all, it would break things like the calculation of size differences between revisions if the referenced revision is suddenly no longer there. That is only a technical issue, which can perhaps be solved, but there is another, much bigger problem:
- An option to delete only minor edits would remove some edits from the history but not others. Features like MediaWiki's history function rely on all revisions staying in place: they compare revisions with each other and display the difference. If a revision in between has been removed, the difference also includes the changes made in the removed revision. Those changes are then attributed to a user although it is not clear whether that user really made them.
- This is a very bad situation, which might even cause legal trouble, e.g. if part of an edit contains insults and, with the corresponding revision removed, it looks as if the insults come from user A when they were in fact added by user B in a removed revision. 2001:16B8:10D2:A900:497:48D0:EC41:2B7B (talk) 20:05, 6 January 2018 (UTC)
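The attribution problem described above can be shown with a small, self-contained example. It uses Python's `difflib` on a made-up three-revision history (the user names, texts and the helper `added_lines` are illustrative only) and compares what a diff attributes to the last editor before and after the minor edit is dropped.

```python
import difflib


def added_lines(old, new):
    """Lines that a diff from `old` to `new` shows as added."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical history: (author, minor flag, page text)
history = [
    ("UserA", False, "Intro"),
    ("UserB", True, "Intro\nRude remark"),
    ("UserC", False, "Intro\nRude remark\nNew section"),
]

# Full history: the diff for UserC's edit shows only UserC's own line.
full = added_lines(history[1][2], history[2][2])
print(full)

# After dropping minor edits, UserC's edit is diffed against UserA's text,
# so UserB's line now appears as part of UserC's change.
pruned = [rev for rev in history if not rev[1]]
after_pruning = added_lines(pruned[0][2], pruned[1][2])
print(after_pruning)
```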
What does "old" mean?
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
I don't understand when a revision is defined as old.
Which revisions get deleted exactly? Maybe all but the current one? Or can "old" be defined somewhere as for example "30 days"? Berot3 (talk) 13:05, 18 January 2021 (UTC)
- The text states "to delete all old (non-current) revisions" so I'd say all but the latest revision of a page no matter at what time it was done. [[kgh]] (talk) 15:20, 18 January 2021 (UTC)
- Thanks, I will simply test it then on a throw-away page :D Berot3 (talk) 16:22, 18 January 2021 (UTC)
- Good. However, since you are a new MediaWiki user, I am not sure why you need to reduce the size of the database. Personally, I would only do it if I really had an issue. [[kgh]] (talk) 16:48, 18 January 2021 (UTC)
- yeah thanks, I think I simply confused "shrinking db" with "getting rid of the history of a page that is displayed for a page".
- From what I saw, it is only possible to delete history entries of a page, but then they still appear greyed out and crossed out. I thought that it might be possible to simply remove them entirely. Berot3 (talk) 14:17, 21 January 2021 (UTC)
- Not sure if a wiki is the best thing for you to choose. Having a version history is one of the core features of a wiki. Not having it is like cutting off the arms and legs of the software, I believe. However, things could get philosophical discussing this further. [[kgh]] (talk) 16:46, 21 January 2021 (UTC)
- No, you are absolutely right. I'm used to having god-like rights as an admin, but having a wiki with such a strong commitment to history makes MediaWiki even more powerful and beautiful as a wiki!
- I understand now, thank you. Berot3 (talk) 20:58, 21 January 2021 (UTC)
how to get page ID (add info/link)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
It might be good for beginners like me to add a link to Page information to inform readers where to find the page ID Berot3 (talk) 16:30, 18 January 2021 (UTC)
- I would do it myself, but I'm not sure where and how to put it with translation and stuff... Berot3 (talk) 16:33, 18 January 2021 (UTC)
- Done. [[kgh]] (talk) 16:47, 18 January 2021 (UTC)
Parse error when running deleteOldRevisions.php
MediaWiki 1.43.1
Parse error: syntax error, unexpected '=' in /home/clients/xxxxxxxxxxxxxxxxx/web/includes/BootstrapHelperFunctions.php on line 35
Line 35 uses the ??= operator.
Any clue?
Thx Alex1859 (talk) 12:20, 27 August 2025 (UTC)
- Solved.
- The new way is:
- php maintenance/run.php deleteOldRevisions --delete ~2025-46767-2 (talk) 17:43, 27 August 2025 (UTC)
Doesn't work
This script doesn't work as it should. It only removes rows from the revision table but leaves all the old text in place. Reading its source code, it looks like it has not been updated to work with Multi-Content Revisions. ~2025-29300-34 (talk) 06:40, 19 October 2025 (UTC)
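Under Multi-Content Revisions a revision no longer points at its text row directly; in the core schema the chain runs from `revision` through `slots` and `content` to `text`, with `content_address` values such as `tt:123` naming the text row. If only `revision` rows are deleted, the `slots` and `content` rows that still reference the text may be left behind. The sketch below runs against a simplified in-memory SQLite copy of those tables (only the columns needed here, with made-up data) and lists text rows that are no longer reachable from any existing revision. It is a diagnostic illustration under those assumptions, not a fix for the script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY);
CREATE TABLE slots (slot_revision_id INTEGER, slot_content_id INTEGER);
CREATE TABLE content (content_id INTEGER PRIMARY KEY, content_address TEXT);
CREATE TABLE "text" (old_id INTEGER PRIMARY KEY, old_text TEXT);

-- Revisions 1 and 2 were deleted from revision; only revision 3 survives.
INSERT INTO revision VALUES (3);
INSERT INTO slots VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO content VALUES (1, 'tt:10'), (2, 'tt:11'), (3, 'tt:12');
INSERT INTO "text" VALUES (10, 'v1'), (11, 'v2'), (12, 'v3');
""")


def leftover_text_ids(conn):
    """Text rows not reachable from any revision row that still exists."""
    live = {
        int(addr[3:])
        for (addr,) in conn.execute(
            "SELECT content_address FROM slots "
            "JOIN content ON content_id = slot_content_id "
            "JOIN revision ON rev_id = slot_revision_id")
        if addr.startswith("tt:")
    }
    all_ids = {row[0] for row in conn.execute('SELECT old_id FROM "text"')}
    return sorted(all_ids - live)


leftover = leftover_text_ids(conn)
print(leftover)
```

A real check would also have to treat revisions held in the `archive` table as live, and would need to handle content addresses that do not use the `tt:` scheme.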
Any warning when used on a wiki whose database was previously compressed with compressOld concat?
The maintenance script compressOld.php, run with "-t concat", saves all gzipped texts in the text record of the oldest revision.
I'm curious whether this script ever issues a warning when it is executed on a database, or on records, that compressOld.php has previously been run on. I'm currently working with a wiki that is plagued with RevisionAccessExceptions: "Failed to load data blob from Bad data in text row xxx. Use findBadBlobs.php to remedy."
Looking at the raw data, the cause is evidently that compressOld.php was run with -t concat (and maybe $wgCompressRevisions was also activated for some time, I couldn't tell), and DeleteOldRevisions.php was then executed on these revisions. As a result, the remaining most recent revision points at the oldest revision, which contained the concatenated texts, except that the oldest revision is gone.--WhichBrain (talk) 04:39, 1 April 2026 (UTC)
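The failure mode described above can be modelled in a few lines. The sketch below is a deliberately simplified model, not MediaWiki's actual serialized storage format: text rows are represented as tuples tagged "blob" (the row holding the concatenated texts) or "stub" (a row that only points at a blob row), and the hypothetical helper `dangling_stubs` reports stubs whose blob row is missing.

```python
def dangling_stubs(text_rows):
    """Return ids of stub rows whose referenced blob row is missing.

    `text_rows` maps old_id -> ("plain", text) | ("blob", texts) | ("stub", blob_id).
    """
    return sorted(
        old_id
        for old_id, (kind, payload) in text_rows.items()
        if kind == "stub" and text_rows.get(payload, ("missing",))[0] != "blob"
    )


# Hypothetical state after a concat run: row 10 (oldest revision) holds the
# concatenated texts, rows 11 and 12 are stubs pointing at it.
rows = {
    10: ("blob", ["v1", "v2", "v3"]),
    11: ("stub", 10),
    12: ("stub", 10),
}
before = dangling_stubs(rows)
print(before)  # nothing dangling yet

# Deleting the old revisions and purging their text rows removes 10 and 11;
# the surviving row 12 now points at a blob that is gone.
for old_id in (10, 11):
    del rows[old_id]
after = dangling_stubs(rows)
print(after)
```

A check along these lines, run before deletion, is the kind of warning the question asks about; whether the script performs one is not established in this thread.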