Chemical Markup support for Wikimedia Commons/Internship Report

Eclipse Zend version 3.2.0: Breakpoint debugging MediaWiki MimeMagic using xdebug

minimum viable product and goals

communication plan

  • daily IRC check-in and weekly hangouts with mentors
  • plan to do a lot of community involvement by pinging interested parties as soon as a minimum viable product is available in a test environment
    • this is to find issues but also to make a better product by gathering feature requests

lessons learned since 21 April

  • issues are often more complex than they look
  • while someone is doing code review, it's often good to have a means of real-time communication
  • evaluating different tools took a considerable amount of time
  • not so surprising as I am a long-term member:
    • there's a lot of setup work, and there are helpful build services as well as deployment options like vagrant and wmf-labs
    • the above doesn't always work

setting up a working environment

| Operating system | Server | IDE | Performance | Notes |
| --- | --- | --- | --- | --- |
| Vagrant/Ubuntu 12.x as a VBox guest | LAMP | - (you can presumably use vim) or just your preferred editor on the host system | OK, patience required | easy to set up; however the roles-thing didn't work as expected; no clue about whether remote-debugging works |
| openSUSE 12.x as a VBox guest | LAMP | - (using kate) | OK | easy to set up if not even chosen LAMP during installation |
| Ubuntu 14.04 as a VBox guest | LAMP | - (using gedit) | OK | easy to set up (apt-get, ubuntu software center etc.) |
| Windows 8.1 as a VBox guest | (W)AMPPS | PHPStorm | Too slow. | easy to set up; PHPStorm has an automated spell checker, spots errors etc. and offers break-point debugging; however you have to limit HDD bandwidth if you are not on SSD and want to be able to use your host system (I guess it's related to windows installer post installation optimization services) |
| Windows 7 | EasyPHP (WAMP) | Eclipse (pdt) | OK | easy to set up; break-point debugging works fine; eclipse has several helpful plug-ins [not all fully tested]; however image converters/scalers are not on-board on Windows; code style templates: php, js; Running PHP from CLI; CLI debugging; Important: Adjust the default char encoding. It should be UTF-8. |

week 1

  • 00:48, 20 May 2014 (UTC) cloned paged tiff handler extension to hack it; gerrit:133069
  • 22:45, 21 May 2014 (UTC) MIME-type detection does not detect ....
  • 23:21, 23 May 2014 (UTC)
    • Eclipse + XDebug allowed me to conveniently debug MimeMagic.php
    • finfo_file is returning text/plain
      • There is no extension-hook for overwriting the MIME detected by the fileinfo module
      • Consequently it requires changes to the core
      • doGuessMimeType can be augmented (the proper way; specification doc available, however not required for a minimum viable product)
      • improveTypeFromExtension currently used
    • New challenge: Table image, field img_major_mime is of type enum, not including the non-standard chemical/* as suggested by the American Chemical Society (ACS)
      • Bawolff suggested in IRC augmenting the enum by the non-standard chemical. Let's see what "core-DB-people" tell me when I submit a patch to do so.
      • SQLite doesn't have something fancy like enum; running the local wiki sets correct values in the DB
      • But MSSQL support had some bugs (bugzilla:65757) that I need to work around or fix first
  • 10:35, 27 May 2014 (UTC)
    • Installed MSSQL Server 2014 (important: the full-text index feature must be installed; otherwise installation of MediaWiki fails)
    • MSSQL: Learned about T-SQL and that stored procedures are not a good means for updating, because creating and running them may require privileges that the updating user does not have
    • Result: gerrit:135714
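
The detection flow described above (content sniffing reports text/plain, then the type is refined from the file extension) can be sketched as follows. This is a minimal Python illustration, not MediaWiki's PHP code: the function name only echoes improveTypeFromExtension, and the extension table, the set of "generic" types, and the chemical/x-mdl-* names are assumptions chosen for the example.

```python
import os

# Assumed mapping from file extension to a non-standard chemical/* MIME type.
CHEMICAL_TYPES_BY_EXTENSION = {
    "mol": "chemical/x-mdl-molfile",
    "sdf": "chemical/x-mdl-sdfile",
    "rxn": "chemical/x-mdl-rxnfile",
}

# Assumed set of generic results that a content sniffer reports for plain-text files.
GENERIC_TYPES = {"text/plain", "application/octet-stream"}


def improve_type_from_extension(sniffed_mime, filename):
    """Refine a generic sniffed MIME type using the file extension.

    A specific sniffed type is trusted and returned unchanged; only generic
    results are replaced, and only when the extension is known.
    """
    if sniffed_mime not in GENERIC_TYPES:
        return sniffed_mime
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return CHEMICAL_TYPES_BY_EXTENSION.get(extension, sniffed_mime)


def major_type(mime):
    """Return the part before the slash, i.e. what a major-MIME column would store."""
    return mime.split("/", 1)[0]
```

With these assumptions, a file named Aspirin.mol that was sniffed as text/plain ends up with the major type "chemical", which is exactly the value the enum column discussed above has to accept.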

week 2

starting 2 June 2014

  • gerrit:135756 augments major MIME types by "chemical"
  • $wgMediaHandlers can be used to hook up the extension. This means I have to create a class that inherits from the abstract MediaHandler class, so some functions must be re-implemented.
    • What I want to find out is when, and under which conditions, metadata extraction happens: it *should* happen *after* the intermediate SVG is created from the Chemical Table Files, so that I can properly build on the SVG scaling logic.
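
The hook-up idea (a table that maps MIME types to handler classes, each derived from an abstract base) can be illustrated with a small Python analogy. This is not the MediaWiki API: the method names, the registry contents, and the MIME string are hypothetical and only show why a subclass has to re-implement the abstract functions before it can be registered.

```python
from abc import ABC, abstractmethod


class MediaHandler(ABC):
    """Stand-in for an abstract handler base class (method names are hypothetical)."""

    @abstractmethod
    def can_render(self):
        """Tell whether thumbnails can be produced for this media type."""

    @abstractmethod
    def thumbnail_format(self):
        """Name the intermediate format that scaling builds on."""


class MolHandler(MediaHandler):
    """Hypothetical handler for chemical table files rendered via SVG."""

    def can_render(self):
        return True

    def thumbnail_format(self):
        return "svg"


# Analogue of a MIME-type-to-handler table; the key is an assumed MIME type.
MEDIA_HANDLERS = {"chemical/x-mdl-molfile": MolHandler}


def handler_for(mime):
    """Instantiate the handler registered for a MIME type, or return None."""
    handler_class = MEDIA_HANDLERS.get(mime)
    return handler_class() if handler_class else None
```

Instantiating the base class directly raises a TypeError, which mirrors the point that the abstract functions must be re-implemented by the extension's handler.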

week 3

starting 9 June 2014

  • There are a couple of challenges:
    • Temporary files as stored by PHP after uploading do not have an extension. Because I didn't do MIME type detection by content, rendering could fail in certain scenarios.
    • MIME type detection by content failed for small files: gerrit:138737
    • Bugs in the updater: it applies patches even though tables.sql has been updated in the meantime.
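
Detection by content, which is what makes extension-less temporary upload files recognizable, could look roughly like the Python sketch below. It is a heuristic under stated assumptions and not the check used in the extension: it relies on the MDL molfile layout (three header lines, a counts line ending in a version tag, and a terminating "M  END" line), and it reads the whole input, which assumes chemical table files are small.

```python
def looks_like_molfile(data):
    """Heuristically decide whether raw bytes look like an MDL molfile.

    Assumed layout: lines 1-3 are the header block, line 4 is the counts
    line ending in "V2000" or "V3000", and a later line reads "M  END".
    """
    # latin-1 maps every byte to a character, so decoding never fails;
    # binary data is rejected by the structural checks below instead.
    lines = data.decode("latin-1").splitlines()
    if len(lines) < 5:
        return False
    counts_line = lines[3].rstrip()
    has_version = counts_line.endswith(("V2000", "V3000"))
    has_end = any(line.rstrip() == "M  END" for line in lines[4:])
    return has_version and has_end


def detect_mime_by_content(path):
    """Return an assumed chemical MIME type for molfiles, else None.

    The file name is never inspected, so a temporary upload file
    without an extension is handled the same way as a named file.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    return "chemical/x-mdl-molfile" if looks_like_molfile(data) else None
```

Because short inputs are rejected by an explicit length check rather than by a failing read, very small files simply yield None instead of an error.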

week 4

starting 16 June 2014

  • Building a living example on just to have a proof-of-concept
  • Requesting new project on wikitech:New Project Request/MediaHandler tests for testing in larger-scale environment for the following goals:
    • Giving contributors the option to test MolHandler with reasonable response times
    • Measuring the server load that indigo-depict causes
    • Having detailed logs to spot errors
    • Having an environment that is similar to the WMF cluster (i.e. with simulation of dedicated "image scalers")

week 5

starting 23 June 2014

  • Submitting extension code produced so far to gerrit
    • gerrit:140732 — Foundation for the extension
      • Did simple performance measurements with Linux tools that I am not used to, like /usr/bin/time -v, and learned what ulimit does and how.
      • Profiling with valgrind might also be interesting.
    • gerrit:141241 — Hooks intercepting with MimeHandler.php
  • Created instance in new project at labs (puppet fails on trusty so no auto-config, yet)
    • Security group for web access - note that after instance creation, security groups can't be changed for an instance
    • Proxy creation for web access - <-> instance:80
    • Installed extension ConfirmEdit and created some geeky questions to prevent spam bots and stupid spam users creating accounts
    • Several other extensions to support content, layout and markup
    • Imported and translated JavaScript for taking screenshots and uploading them
    • GuidedTour extension for getting started
    • Zillions of micro-commits to the labs-deploy repo
    • LocalSettings config: Allow everyone to create accounts.
    • Imported some content to play with.
    • Working on a trouble reporter: a gadget that allows picking elements, drawing them on a canvas, getting a PNG from the canvas, dumping the HTML of the element, and including both in an error report. Expected to be completed on Friday. Possibly a new extension could be derived from this work. It's pretty handy to have something that allows users to quickly report issues without having to jump through hoops and re-type everything.
      • Not yet completed.
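
The performance measurement mentioned above (peak memory as reported by /usr/bin/time -v, and limits as set by ulimit) can also be approximated from a script. The following Python sketch assumes a Linux host; the function name and the default limits are made up for illustration, and the command to measure is left to the caller.

```python
import resource
import subprocess
import sys


def _lower_soft_limit(kind, wanted):
    """Set the soft limit for one resource without touching the hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    resource.setrlimit(kind, (wanted, hard))


def run_with_limits(argv, cpu_seconds=10, address_space_bytes=2 * 1024**3):
    """Run argv under CPU and memory limits.

    Returns (exit code, peak resident set size). On Linux the second value
    is in KiB and is the maximum over all child processes waited for so far,
    not only the one started by this call.
    """

    def apply_limits():
        # Executed in the child before exec, like running ulimit in a shell first.
        _lower_soft_limit(resource.RLIMIT_CPU, cpu_seconds)
        _lower_soft_limit(resource.RLIMIT_AS, address_space_bytes)

    completed = subprocess.run(argv, preexec_fn=apply_limits, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Measures a trivial child process; replace the argv with the renderer call.
    print(run_with_limits([sys.executable, "-c", "pass"]))
```

Only the soft limits are lowered, so the sketch does not need elevated privileges and a later call can still choose a different limit.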

week 6

starting 30 June 2014

  • still working on trouble reporter: UI and information I wanted to gather implemented; it just needs to upload the file and append a pretty-formatted report to a page
  • Found something very neat: Port forwarding through SSH. This allows me to access a webserver's administration interface running on a different port without having to expose it to the public behind one of the WMF-Proxies.
  • SVGEdit's JavaScript is not state-of-the-art (leaving aside the TODOs and shortcomings like if(,4)==='SVGe'){ //because svg-edit is too longish). If it should be deployable to WMF, it would need a lot of effort to prettify it.
  • Created a WMF-Labs instance for static files (svg-edit as of now): for security when it comes to editing SVG files, as cross-domain scripting through iframes won't work, and for performance (almost cookie-free domain). It runs a Cherokee server, which I found easy to configure.
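
The SSH port forwarding mentioned above boils down to one ssh invocation with the -L option. The Python sketch below only builds that command line; the login name, host, and port numbers are placeholders, and the helper name is made up for this example.

```python
import shlex


def ssh_forward_command(login, local_port, remote_port, remote_host="localhost"):
    """Build an ssh command that forwards local_port to remote_host:remote_port.

    -N tells ssh not to run a remote command, and -L sets up the local
    forward. remote_host is resolved on the server side, so "localhost"
    means the machine that ssh logs in to.
    """
    for port in (local_port, remote_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    forward = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-N", "-L", forward, login]


if __name__ == "__main__":
    # Placeholder login and ports; prints the command instead of running it.
    print(shlex.join(ssh_forward_command("user@instance.example", 8080, 9090)))
```

While such a command is running, the administration interface is reachable at localhost on the chosen local port, and nothing has to be opened in the instance's security group or proxy configuration.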

week 7

starting 7 July 2014

week 8

starting 14 July 2014

week 9

starting 21 July 2014

week 10

starting 28 July 2014

week 11

starting 04 August 2014

week 12

starting 11 August 2014

week 13

starting 18 August 2014

week 14

starting 25 August 2014