User:Sohom Datta/The Final Report

Background
The pagelist syntax is a custom syntax used by the ProofreadPage extension (deployed on Wikisource) to record page number metadata for individual books. It combines ranges, labels and individual page number assignments into a compact representation of the page number of every page of a book (which may be a few hundred pages long). For example, a 200-page book with 1 cover page and 10 blank or unnumbered pages (the title, pictures and publisher pages) at the start can be described in a single line. Pretty neat!
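To make the idea concrete, here is a minimal Python sketch of how such a one-liner expands into one label per scan. It assumes a simplified model of the syntax in which a key is either a single scan (`N`) or a range (`NtoM`), a numeric value on a single scan restarts the numbering there, and any other value is a literal label. Under that assumption the book above could be written as something like `<pagelist 1="Cover" 2to11="-" 12=1 />`. The function name `expand_pagelist` is hypothetical, and this is not the extension's actual parser, which supports more than is modelled here.

```python
import re


def expand_pagelist(params, total_pages):
    """Expand simplified pagelist parameters into one label per scan.

    Assumed model (not the real ProofreadPage parser):
      "N": "<digits>"  -> numbering restarts at that value on scan N
      "N": "<text>"    -> scan N shows the literal text
      "NtoM": "<text>" -> scans N..M show the literal text
    """
    starts = {}  # scan index -> number assigned at that scan
    texts = {}   # scan index -> literal label
    for key, value in params.items():
        match = re.fullmatch(r"(\d+)(?:to(\d+))?", key)
        if match is None:
            raise ValueError(f"unsupported key: {key!r}")
        first = int(match.group(1))
        last = int(match.group(2) or first)
        if value.isdigit() and match.group(2) is None:
            starts[first] = int(value)
        else:
            for scan in range(first, last + 1):
                texts[scan] = value

    labels = []
    anchor_scan, anchor_number = 1, 1  # by default, scan 1 is page 1
    for scan in range(1, total_pages + 1):
        if scan in starts:
            anchor_scan, anchor_number = scan, starts[scan]
        if scan in texts:
            labels.append(texts[scan])
        else:
            labels.append(str(anchor_number + scan - anchor_scan))
    return labels


# The 200-page book from the text: 1 cover, 10 unnumbered pages, then page 1.
book = {"1": "Cover", "2to11": "-", "12": "1"}
labels = expand_pagelist(book, 200)
print(labels[:13])
```

Three short parameters are enough to label all 200 scans, which is where the compactness comes from.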

Around the end of 2019, the Wikisource community participated in the Community Wishlist Survey 2020 and voted to replace the old interface with a new editing widget that wouldn't require the user to rely on external software. The task received 51 votes supporting its implementation and was ranked 6th among all wishes, which is how it became a project at Google Summer of Code 2020 :)

Identifying problems
The pagelist syntax, however, posed a problem. While it was efficient, it required a significant amount of mental calculation to get the pagelist to display exactly what the editor wanted. This was especially pronounced when a book had a complex page numbering system (for example, a book where the numbering restarts after every chapter, or one with interleaved images).

Additionally, the old interface gave an editor no way to look at the individual pages of the book (something that is essential for looking up the page numbering). This meant the user had to download the book separately and go back and forth between the browser and the downloaded book quite a few times to verify that the numbering was as required.

Coming up with solutions - Image display
From the start, it was pretty clear how the second problem would be solved: we needed some way to display the image of any page alongside the pagelist itself. However, displaying 300 or so images on one screen would defeat the purpose of showing them in the first place (the images would be 30x30px, which would make reading the page number incredibly difficult), so we went for displaying one image at a time in a separate pane beside the pagelist input field. This left us with one decision to make: how would we let the user choose the page to be displayed? Commons lets the user go to a different page through a dropdown menu listing all the page numbers. Internet Archive, on the other hand, has a slider that determines which page the user goes to. However, neither of those two really fit the brief. We needed a way to tell the user explicitly which scan number they were on and which page number was currently set for that page. We thus settled on parsing the HTML output of the pagelist syntax and building a representation of that output out of buttons.
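The parsing step can be sketched roughly as follows. This is only an illustration in Python using the standard library's `html.parser`. It assumes that the rendered pagelist is a run of links whose `href` ends in the scan number and whose text is the displayed page label. That markup, the sample URLs and the class name `PagelistLinkParser` are all assumptions made for the example, not the widget's actual code.

```python
from html.parser import HTMLParser


class PagelistLinkParser(HTMLParser):
    """Collect (scan number, page label) pairs from rendered pagelist links.

    Assumes each link looks like <a href=".../<scan number>">label</a>.
    """

    def __init__(self):
        super().__init__()
        self.pages = []    # (scan, label) pairs, in document order
        self._href = None  # href of the link currently being read
        self._text = []    # text fragments inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # The scan number is taken from the last path segment.
            scan = int(self._href.rsplit("/", 1)[-1])
            self.pages.append((scan, "".join(self._text).strip()))
            self._href = None


# Hypothetical rendered output for three scans of a book.
sample_html = (
    '<a href="/wiki/Page:Book.djvu/1">Cover</a> '
    '<a href="/wiki/Page:Book.djvu/2">-</a> '
    '<a href="/wiki/Page:Book.djvu/12">1</a>'
)
parser = PagelistLinkParser()
parser.feed(sample_html)
print(parser.pages)
```

Each pair carries exactly the two facts the editor needs to see: the scan number and the page number currently set for it. A button per pair can then show the label and load the matching scan when clicked.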

Coming up with solutions - Visual mode
Coming up with a proper visual mode turned out to be tougher. For one, we only had space to display one image at a time. This meant that we had to somehow let the user enter a range from the current page to some other page without knowing anything about where that other, ending page was. Looking at the pagelist syntax, we realised that the same pagelist could be written in slightly different ways. For example, a book with 10 unnumbered pages at the start and three images at pages 3, 4 and 5 could be written in more than one equivalent form. This essentially meant that we could ask the user to mark only the places where the numbering changed and what the new numbering was. Once we had those inputs, we would magically be able to combine everything together to create a pagelist without having to ask the user about ranges that the user wouldn't know about. Hooray!
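A rough sketch of that combining step, in Python, might look like the following. It assumes the user's input is a set of marks, each saying "from this scan onward the label is X" (a text label) or "from this scan onward the numbering starts at N" (an integer). Each mark is taken to hold until the next one, so the end of every range falls out automatically. The function name `build_pagelist`, the mark format and the `Img` label are hypothetical, and this is not the widget's actual implementation.

```python
def build_pagelist(marks, total_pages):
    """Combine user-entered change points into pagelist wikitext.

    marks maps a scan number to either a text label (str) or a page
    number (int) that takes effect at that scan. Each mark is assumed
    to hold until the next mark, so the user never enters a range end.
    """
    parts = []
    ordered = sorted(marks.items())
    for index, (scan, value) in enumerate(ordered):
        # A mark extends to just before the next mark, or to the last scan.
        if index + 1 < len(ordered):
            end = ordered[index + 1][0] - 1
        else:
            end = total_pages
        if isinstance(value, int):
            # Numbering continues by itself; only the start is needed.
            parts.append(f"{scan}={value}")
        elif end == scan:
            parts.append(f'{scan}="{value}"')
        else:
            parts.append(f'{scan}to{end}="{value}"')
    return "<pagelist " + " ".join(parts) + " />"


# Ten unnumbered pages with images on scans 3 to 5, then numbering from 1.
marks = {1: "-", 3: "Img", 6: "-", 11: 1}
wikitext = build_pagelist(marks, 200)
print(wikitext)
# <pagelist 1to2="-" 3to5="Img" 6to10="-" 11=1 />
```

The user only ever states what changes on the page in front of them. The range boundaries are derived from where the next change happens.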

Code Commits

 * Added visual mode to Wikisource Pagelist Widget
 * Fixed unsynced top-panel in PagelistWidget
 * Build the wikitext mode for the PagelistInputWidget
 * Fixed alignment of buttons in pagelist preview
 * Add a dialog to PagelistInputWidget
 * Creating a pagelist input widget
 * Selenium: Added preliminary tests to check if Index page loads properly
 * Wrap pagelists in a tag to aid identification

Further information

 * Original task detailing the basic requirements
 * Google Summer of Code proposal for the project
 * Tracking task relating to the code-review and building the widget