VERP/GSOC Progress Report

GSoC Project Wrap Up report
We ( Jeff Green, legoktm and myself ) deployed the BounceHandler extension on the beta labs instance ( deployment.wikimedia.beta.wmflabs.org ) before the pencils-down date, and it now generates VERPed emails ( Demo - http://deployment.wikimedia.beta.wmflabs.org/wiki/Special:EmailUser ). Thanks to consistent help from Wikimedia staff and volunteers beyond my mentors, especially Mark, Hoo, Bryan and Nemo_bis, we got the extension deployed safely in beta. The deployment process was tedious, as we ran into new scenarios there, but we coped with them easily. Our implementation strategy was:
 * Deploy bouncehandler to beta, and enable only VERP generation.
 * Route VERP-bounces to bouncehandler API in beta by exim change ( https://gerrit.wikimedia.org/r/#/c/155753/ ).
 * Enable bounce handling and let the extension run in logging only mode for a month, so that no fast un-subscription happens.
 * Enable the user-email-invalidate function after observing a threshold of bounce frequency.
 * Implement the same in production.
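
The logging-only step and the later threshold check in the strategy above can be illustrated with a small Python sketch. It is not the extension's code (BounceHandler is written in PHP); the class name `BounceTracker`, the default threshold of 3 and the simple per-address counter are all assumptions made for illustration, and a real deployment would probably also weigh how recent the bounces are.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BounceTracker:
    """Counts bounces per address and decides when to invalidate it.

    Hypothetical sketch: threshold and logging_only are illustrative
    settings, not the extension's actual configuration.
    """
    threshold: int = 3         # assumed number of bounces before acting
    logging_only: bool = True  # mirrors the "logging only" rollout step
    counts: dict = field(default_factory=lambda: defaultdict(int))
    log: list = field(default_factory=list)

    def record_bounce(self, email: str) -> bool:
        """Record one bounce; return True if the address should be invalidated."""
        self.counts[email] += 1
        self.log.append((email, self.counts[email]))
        if self.logging_only:
            # Observe only: never unsubscribe while still gathering data.
            return False
        return self.counts[email] >= self.threshold
```

Running in logging-only mode first lets the counts be inspected before `logging_only` is switched off and the threshold starts to take effect.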

My mentors foresaw that the project would take more time, so we started almost a month before the project was selected, which helped us get the extension deployed in time. I also thank Nemo_bis for pointing me to the VERP project a few months earlier, and my mentors, who volunteered to help. We wrote two separate extensions for MediaWiki:
 * BounceHandler: The MediaWiki extension that generates a VERPed Return-Path for every email sent and provides the 'bouncehandler' API, which handles incoming email bounces. The extension is currently in beta.
 * SwiftMailer: An alternate mailer for MediaWiki.
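
To make the idea of a VERPed Return-Path concrete, here is a minimal Python sketch of encoding and decoding such an address. It is an illustration only, not the BounceHandler implementation (which is PHP): the address layout `wiki-user-timestamp-signature`, the names `SECRET` and `BOUNCE_DOMAIN`, and the choice of a truncated hex HMAC-SHA256 are assumptions.

```python
import hashlib
import hmac
from typing import Optional, Tuple

SECRET = b"example-secret"            # hypothetical signing key
BOUNCE_DOMAIN = "bounce.example.org"  # hypothetical bounce domain


def _signature(payload: str) -> str:
    """Return a short hex HMAC so the local part stays compact."""
    digest = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generate_verp(wiki_id: str, user_id: int, timestamp: int) -> str:
    """Build a Return-Path that encodes who the mail was sent to."""
    payload = f"{wiki_id}-{user_id}-{timestamp:x}"
    return f"{payload}-{_signature(payload)}@{BOUNCE_DOMAIN}"


def decode_verp(address: str) -> Optional[Tuple[str, int, int]]:
    """Return (wiki_id, user_id, timestamp), or None if the address is not ours."""
    local, _, domain = address.partition("@")
    if domain != BOUNCE_DOMAIN:
        return None
    # Split from the right so a wiki id containing "-" still parses.
    parts = local.rsplit("-", 3)
    if len(parts) != 4:
        return None
    wiki_id, user_id, timestamp, signature = parts
    payload = f"{wiki_id}-{user_id}-{timestamp}"
    if not hmac.compare_digest(signature, _signature(payload)):
        return None  # forged or corrupted address
    return wiki_id, int(user_id), int(timestamp, 16)
```

Because a bounce is delivered to the Return-Path, decoding that address is enough to identify the affected user without parsing the bounce body, and the signature keeps third parties from forging bounces for arbitrary users.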

GSoC Project Progress Report
Community Bonding Period

I have been active in the Wikimedia community since November 2013, so I concentrated on getting more feedback on the project, especially on the shift from the PHP/PEAR mailer to third-party software, SwiftMailer. The shift requires a lot of consensus, and I kept gathering feedback through wikitech-l and the Bugzilla report. Details of the mails are attached below. I had already started building the project environment by March 2014 with the help of Jeff and legoktm, so I was polishing it and porting it to Wikitech labs ( under project name: mediawiki-verp ).

Please follow my blog at http://tttwrites.wordpress.com/ for regular updates on my GSoC project.

March 2014

 * 1) Built a local instance of the Wikimedia email/web server model ( https://www.mediawiki.org/wiki/VERP/MicroTasks ).
 * 2) The model consisted of two VirtualBox machines: Box1 (running exim4 and MW core) sending emails via Box2 (running postfix).

April 2014

 * 1) Modified the test environment: Box1 has MW running -> sends the email -> intercepted by Box2 -> routed to /var/mail/root on Box2. Box2 has an external connection via NAT.
 * 2) Box2 rejects the mail, and exim on Box1 produces the bounce to wiki@wikimedia.org in /var/mail/wiki.

Week 1 and 2: May 1 to May 19

 * 1) Shifted the above local instance to Wikitech labs ( under project mediawiki-verp ). box1verpnop sends the mail with box2verpnop as the smarthost, which rejects all mails to wiki@wikimedia.org. The bounce is created by box1 in the /var/mail/root of box1verpnop.
 * 2) Discussion on shifting from the PHP mailer to SwiftMailer.
 * 3) Overcame the slow SSH response time by mosh-ing into the Wikitech server.

Week 3 : May 19 - May 26

 * 1) Working on shifting the default PEAR mailer in MediaWiki to the third-party SwiftMailer.
 * 2) Submitted patch in gerrit (https://gerrit.wikimedia.org/r/#/c/135290/)
 * 3) Asked for review of the shift at ([])

June 2014
Week 4 and 5 : May 27 - June 7

 * 1) Working on improving https://gerrit.wikimedia.org/r/#/c/135290/, which now needs to pass the Jenkins tests
 * 2) Got a private repo for SwiftMailer created at https://github.com/wikimedia/mediawiki-core-vendor-swiftmailer
 * 3) Got the SwiftMailer code hosted at https://github.com/wikimedia/mediawiki-core-vendor-swiftmailer/tree/5.2.0-patch by reporting a bug, https://bugzilla.wikimedia.org/show_bug.cgi?id=66110
 * 4) Added Composer installation for development clusters from vendor/swiftmailer, by patch https://gerrit.wikimedia.org/r/#/c/137538/
 * 5) On the networking side, made return mail to a specific address be handled by a pipe transport, hosted in Wikitech labs, project mediawiki-verp, machine box1verpnop
 * 6) Blog post on forwarding mails to a PHP script - http://tttwrites.wordpress.com/2014/06/07/forwarding-mails-to-a-php-script-with-exim4/
 * 7) Blog posts on Exim configs ( http://tttwrites.wordpress.com/category/technical/exim/ )
 * 8) Blog posts on Composer ( http://tttwrites.wordpress.com/category/technical/composer/ )
 * 9) Blog post on creating a release tag from an existing tag, used to prepare the SwiftMailer repo ( http://tttwrites.wordpress.com/2014/06/06/composer-loading-specific-tags-instead-of-branches/ )

Week 6 : June 9 - June 15

 * 1) Started implementing VERP ( https://gerrit.wikimedia.org/r/#/c/138655/ )
 * 2) Creating a new extension, BounceHandler, which will be uploaded to gerrit soon. Now available at ( https://github.com/tonythomas01/BounceHandler )
 * 3) Installed the IMAP server dovecot and got a script to fetch bounces and classify them as hard or soft ( https://github.com/tonythomas01/exim4box1verp/blob/handlingBHM/bhm/script.php )
 * 4) Requested a new repo at: mediawiki/extensions/BounceHandler
 * 5) Blog post on IMAP installation and configuration ( http://tttwrites.wordpress.com/2014/06/13/using-imap-with-dovecot-in-wikitech-labs-instance/ )

Week 7 : June 16 - June 22

 * 1) Submitted various patch sets to build up the BounceHandler extension:
   * https://gerrit.wikimedia.org/r/#/c/139767/, https://gerrit.wikimedia.org/r/#/c/140077/, https://gerrit.wikimedia.org/r/#/c/140082/, https://gerrit.wikimedia.org/r/#/c/140085/
   * Still to be merged: https://gerrit.wikimedia.org/r/#/c/140330/ ( Added VERP message decoding ), https://gerrit.wikimedia.org/r/#/c/138655/ ( Adding VERP hook to core ).
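
The hard/soft bounce evaluation mentioned in the June entries can be illustrated with a short Python sketch. It is not the script linked above; it assumes the bounce carries a delivery status notification with a `Status:` field and uses the standard convention that enhanced status codes starting with 5 are permanent failures and those starting with 4 are transient. The function name `classify_bounce` is hypothetical.

```python
import re
from typing import Optional

# Matches an enhanced status code such as "Status: 5.1.1" at the start of a line.
STATUS_RE = re.compile(
    r"^Status:\s*([245])\.\d{1,3}\.\d{1,3}",
    re.IGNORECASE | re.MULTILINE,
)


def classify_bounce(dsn_text: str) -> Optional[str]:
    """Return "hard", "soft", or None when no failure status is found."""
    match = STATUS_RE.search(dsn_text)
    if match is None:
        return None
    status_class = match.group(1)
    if status_class == "5":
        return "hard"  # permanent failure, e.g. mailbox does not exist
    if status_class == "4":
        return "soft"  # transient failure, e.g. mailbox full
    return None        # 2.x.x reports success, not a bounce
```

Only hard bounces are a reasonable trigger for un-subscribing a user; soft bounces usually resolve on retry.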

Week 8,9 : June 23 - July 5

 * 1) Got merged:
   * The unsubscribing part - https://gerrit.wikimedia.org/r/#/c/142786/
   * Handle bounces part - https://gerrit.wikimedia.org/r/#/c/140330/
   * Finished the coding part of the BounceHandler extension; the deployment part comes next
 * 2) Started writing a new extension, SwiftMailer, to add an alternative to UserMailer
   * https://gerrit.wikimedia.org/r/#/c/143004/ ( Still to be merged )
 * 3) Still to be merged:
   * https://gerrit.wikimedia.org/r/#/c/138655/
   * https://gerrit.wikimedia.org/r/#/c/141287/
   * https://gerrit.wikimedia.org/r/#/c/143004/

Week 10 : July 6 - 13 2014

 * 1) Got merged:
   * https://gerrit.wikimedia.org/r/#/c/138655/ - Hook change to core; changes to core are finished.
   * https://gerrit.wikimedia.org/r/#/c/143004/ - SwiftMailer extension ( Completed )
 * 2) Working on:
   * https://gerrit.wikimedia.org/r/#/c/144656/ - Added API to handle incoming hard email bounces automatically.
   * https://gerrit.wikimedia.org/r/#/c/145881/ - Added BounceHandlerSubmit Job to queue bounce emails for processing.
 * 3) Articles I wrote on the progress:
   * Writing a job queue to deal with load when POST-ing from exim to the MediaWiki API
   * Parsing email for relevant headers
   * Restricting POST requests to the API using IP whitelisting
   * POST-ing bounce email to a MediaWiki API directly from exim
   * and more at http://tttwrites.wordpress.com/category/wikimedia/
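
As an illustration of the "parsing email for relevant headers" step listed above, the following Python sketch pulls a few headers out of a raw bounce message with the standard library. It is not the extension's parser; which headers BounceHandler actually reads is an assumption here, as is the function name `extract_bounce_headers`.

```python
from email import message_from_string
from email.utils import parseaddr


def extract_bounce_headers(raw_email: str) -> dict:
    """Return the headers a bounce processor is likely to need.

    The selection of headers is an assumption made for this sketch.
    """
    msg = message_from_string(raw_email)
    # The bounce is addressed to the VERP Return-Path of the original mail.
    _, to_addr = parseaddr(msg.get("To", ""))
    return {
        "to": to_addr,
        "subject": msg.get("Subject", ""),
        "date": msg.get("Date", ""),
        "x_failed_recipients": msg.get("X-Failed-Recipients", ""),
    }
```

The `to` value is the address that would then be decoded to find the affected user; the remaining fields are mainly useful for logging.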

Week 11-12 : July 14 - 27 2014

 * 1) Got merged:
   * https://gerrit.wikimedia.org/r/#/c/144656/ - Added API to handle incoming hard email bounces automatically.
   * https://gerrit.wikimedia.org/r/#/c/145881/ - Added BounceHandlerSubmit Job to queue bounce emails for processing.
   * https://gerrit.wikimedia.org/r/#/c/146757/ - Added composer-installed PEAR mimeDecode to decode email headers
   * https://gerrit.wikimedia.org/r/#/c/149324/ - Moved static functions from the API class to another class
   * https://gerrit.wikimedia.org/r/#/c/149516/ - Changed the SwiftMailer repo location from Wiki-repo to default
   * https://gerrit.wikimedia.org/r/#/c/149648/ - BounceActions was made a separate class, since it is used in two classes
   * https://gerrit.wikimedia.org/r/#/c/148456/ - Notify administrators when the API fails to parse bounce emails
   * https://gerrit.wikimedia.org/r/#/c/149912/ - Add documentation to class constructors

Week 13-14 : July 28 - Aug 10 2014

 * 1) Got merged:
   * https://gerrit.wikimedia.org/r/#/c/141287/ - Removed exim errors_to to support custom Return-Path
   * https://gerrit.wikimedia.org/r/#/c/151428/ - Shifted from PEAR mimeParse to Plancake mailparse
   * https://gerrit.wikimedia.org/r/#/c/151442/ - Moved extension classes to includes/
   * https://gerrit.wikimedia.org/r/#/c/151635/ - Moved the header extraction functions to a separate class
   * https://gerrit.wikimedia.org/r/#/c/151867/ - Added PHPUnit tests for the BounceHandler extension
   * https://gerrit.wikimedia.org/r/#/c/152129/ - Split the bounce header extraction function to ease testing
 * 2) Still to get merged:
   * https://gerrit.wikimedia.org/r/#/c/152164/ - Added test to the BounceHandler extension to check regex functions

Week 15 : Aug 11 - Aug 17 2014

 * 1) Got merged:
   * https://gerrit.wikimedia.org/r/#/c/152164/ - Added test to the BounceHandler extension to check regex functions
   * https://gerrit.wikimedia.org/r/#/c/152813/ - Shifted the Generate VERP address function to its own new class
   * https://gerrit.wikimedia.org/r/152793 - Test to verify the encoding and decoding of a VERP address.
   * https://gerrit.wikimedia.org/r/152906 - Removed the IMAP support from the BounceHandler extension
   * https://gerrit.wikimedia.org/r/152934 - An additional base64_encode( hash_hmac( prefix, true ) ) was added
   * https://gerrit.wikimedia.org/r/153041 - Fix regex functions after change I0f9b098982adf023347e3f1343f1e16ccec91df1
   * https://gerrit.wikimedia.org/r/153016 - Fixed various issues with the BounceHandler extension
   * https://gerrit.wikimedia.org/r/153403 - Generate VERP address for a single recipient address or an array of them
   * https://gerrit.wikimedia.org/r/153450 - Improved the VERP generation by cutting down the hmac hash
   * https://gerrit.wikimedia.org/r/153820 - Remove redundant spam check bypass acl_m2
   * https://gerrit.wikimedia.org/r/153871 - Improved the unsubscribe user method used by the BounceHandler extension
   * https://gerrit.wikimedia.org/r/153795 - Added test to check that the unsubscribe function is working correctly
   * https://gerrit.wikimedia.org/r/154068 - Added support for CentralAuth
 * 2) Still to get merged:
   * https://gerrit.wikimedia.org/r/#/c/155753/ - Added the bouncehandler router to catch all bounce emails