Wikimedia Developer Summit/2017/Annotations

=SESSION OVERVIEW=


 * Title: Annotations
 * Day & Time: Tuesday 2:30 - 3:40
 * Room: Chapel Hill
 * Phabricator Task Link: https://phabricator.wikimedia.org/T151958
 * Facilitator(s): C. Scott
 * Note-Taker(s): Matt, Nick
 * Remote Moderator: <no remote feed>
 * Advocate: Anders

SESSION SUMMARY

 * Purpose: Discuss problems and solutions for annotations of wiki content
 * Summary: There are many use cases for annotations. We should use Open Annotations as a standard, and start with multimedia as a use case.
 * Agenda:
 * 1) Discuss use cases
 * 2) Overview of hypothes.is and MCR contributions
 * 3) Discuss open questions (API, granularity)
 * 4) Priorities & next steps


 * Style: Education/Consensus

=Discussion=


 * There are a number of cases where we have additional information to associate with the main document.
 * In some cases we already have that information and want to pull it out. In others we want to add new information without "polluting" the document with material not relevant to all users/editors.

Distinguishing between "markup" and "annotations" in our projects -- standard wikitext markup is intended to be authored by the same person editing the main text. Annotations typically involve a workflow distinct from document editing. In the "translation" case, for example, the document authors/editors may not even be fluent in the translated languages. In other cases, the markup is maintained by other specialized teams. There is some grey area, of course; e.g. are regions annotated in an image part of the main document, or a separate workflow?

Use cases
We brainstormed a number of possible use cases for an annotation facility. Each is tagged with a member of the community active in that project who might serve as a contact for future work.


 * 1) Translate extension (translation regions) - @Lokal_Profil, @cscott
 ** Explicit markup currently in wikitext
 * 2) ContentTranslation - @cscott et al.
 ** Currently used for creating a new article in language X when one exists in language Y, but the correspondence is lost after article creation.
 ** Annotations would be used to maintain the correspondence between each paragraph in the old and new articles.
 ** The long-term plan for CX is to allow you to keep translations in sync; that needs an annotation to say what's associated with what.
 * 3) WikiSpeech - annotate specific parts of an article with pronunciations (e.g. word pronunciations). - @Lokal_Profil, @Sebastian_Berlin_(WMSE)
 * 4) VisualEditor/Flow - inline discussion; the region should persist. @ESanders @Mattflaschen (not on the Collaboration team roadmap)
 * 5) References - region -> citation data in Wikidata/Wikibase; also for "citation needed". @Dario? (WikiCite) @Tarrow
 * 6) Non-Wikimedia - useful for text analysis; think Wikisource, with a document and then commentary on it.
 ** Similar to hypothes.is
 * 7) Data curators - e.g. wikipathways.org has diagrams of scientific discoveries. Technical experts come and approve a certain version, and some people only want to see content that was in an approved version. @Anders
 ** Something like fine-grained "flagged revisions".
 ** Could be done as an annotation containing the "approved" content. A fuzzy match on that annotation later could trigger UI letting you know the content hasn't (yet) been approved, along with UX letting you switch between the current and approved versions (pulled from the annotation data).
 * 8) Wikisource
 ** For example, you can still add annotations as a first pass [?]
 * 9) Figure references - @Marktraceur
 ** Associate an article figure with a *range* of text which refers to it, not just a point
 ** Gives the renderer more freedom to decide where best to place the image
 * 10) Edit conflicts / proposed edits - certainly store the entire change I want to make. @James_F? @DChan
 * 11) Language converter - @DChan
 ** Store individual exceptions to the language converter glossary in use (similar to the WikiSpeech use for pronunciation exceptions)
 * 12) FileAnnotations - images, audio, video. @Marktraceur, @Prtksxna, @Brion
 ** The hypothes.is framework already considers annotation of multiple content types, including media types
 ** You can already box parts of an image with annotations on WMF projects, done through a gadget. It would be nice to do this canonically.
 ** Same for video (e.g. a period of time)
 ** Render in a cool way
 * 13) Maintenance tags/templates - issue [page/section/inline] tags. @????
 ** Allow tagging regions of a page (ie, ) not just a point.
 * 14) Wikisource - proofreading, and marking non-linear transcribed regions in the presence of inline advertisements, column jumps, etc.
 * 15) Presentational annotations, for instance pull quotes. (Figure references as well.)
 * 16) Fine-grained flagged revisions

Implementation
* Which backend should we use? There is a hypothes.is backend, but it is too complex to maintain.

** Daniel said it could be done with multi-content revisions.

** Backend part of this can be solved. Maintainers needed.

* Can you summarize the concerns with the Hypothes.is backend?

** Not production-grade yet. Not as nicely factored as I'd like.

** Talked to Sj about the idea of annotations. There are three parts to https://hypothes.is:

*** 1. Frontend - go to any arbitrary website and write annotations. It fetches annotations from the backend and does the actual matching. The DOM tree and UX are muddled together in the code base.

*** 2. OpenAnnotation spec - can do strict and fuzzy matches. Really liked this part of the code base. http://www.openannotation.org/

*** 3. Backend - Part which looked competent, but which I didn't want to maintain

** Dario: The main reason to ask is that I'm friends with the Hypothes.is founder. Their approach is agnostic to which part of the stack you want to cherry-pick. They're building a coalition based mostly on the second layer, with a commitment to do something for the other layers. They are very interested in seeing whether Wikimedia can become a partner in adopting some version of the bottom tiers. See https://hypothes.is/annotating-all-knowledge/

** C. Scott - Would be good to be interoperable with standard.

** Trevor - Arbitrary DOM range does seem good for some of these lists, but not all.

** C. Scott - Main one currently used is XPath plus text to be highlighted and context.
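The "XPath plus highlighted text and context" anchor described here corresponds to the selector types in the W3C Web Annotation Data Model, the standardized successor to Open Annotation. A minimal sketch as a plain Python dict follows; the page URL, XPath, and quoted strings are made-up placeholders:

```python
import json

# Sketch of a W3C Web Annotation (the standardized Open Annotation model).
# All concrete values below (URL, XPath, quoted text) are placeholders.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {"type": "TextualBody", "value": "This claim needs a citation."},
    "target": {
        "source": "https://en.wikipedia.org/wiki/Example",  # placeholder page
        "selector": [
            # Structural anchor: where the region sits in the DOM.
            {"type": "XPathSelector", "value": "/html/body/p[2]"},
            # Content anchor: the highlighted text plus surrounding context;
            # this is what makes fuzzy re-anchoring possible after edits.
            {
                "type": "TextQuoteSelector",
                "exact": "the disputed sentence",
                "prefix": "As noted above, ",
                "suffix": " remains unsourced.",
            },
        ],
    },
}

print(json.dumps(annotation, indent=2))
```

Storing both selectors gives a consumer a strict match to try first and enough context to fall back to fuzzy matching when the page has changed.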

** C. Scott - I think the fuzzy matching is the part that's not in annotator.js

** C. Scott - Translate has been doing this for a while (surfacing that it's a fuzzy match and you need to look at it). What sort of common features might be pulled out to handle these? A fuzzy match is one piece of commonality. How many different sorts of annotation regions are there?

* C. Scott - This is where my certainty ends. Questions:

* API:

** Prompt - The fuzziness is resolved immediately when the page is edited, so the client gets the pre-resolved answer. You find out immediately when there's fuzziness. It exposes to core the idea of updating everything.

** Delayed - We give fuzz in the API read response

[Not mentioned in discussion: https://hypothes.is/blog/fuzzy-anchoring/ - strategy for maintaining anchors in the face of unpredictable edits. This tries to be robust in the face of more adversarial conditions than we would experience, e.g. the old revision may no longer be accessible at all.]
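To make the fuzzy-anchoring idea concrete, here is a naive sketch (not the hypothes.is algorithm): slide a window the length of the quoted text across the possibly edited document, score each candidate with a similarity ratio, and give up below a threshold. A production version would also score the surrounding context and use a much faster search than this O(n*m) loop.

```python
import difflib

def anchor_quote(doc, exact, threshold=0.75):
    """Re-locate the quoted string `exact` in a possibly edited `doc`.

    Returns (start, end, score) for the best fuzzy match, or None when
    even the best candidate falls below `threshold`.
    """
    n = len(exact)
    best = None
    for start in range(max(1, len(doc) - n + 1)):
        window = doc[start:start + n]
        score = difflib.SequenceMatcher(None, exact, window).ratio()
        if best is None or score > best[2]:
            best = (start, start + n, score)
        if score == 1.0:  # exact match; stop early
            break
    return best if best and best[2] >= threshold else None
```

An unedited document yields a perfect match; after a small edit the quote still anchors with a lower score; and a quote that no longer exists returns None, so the UI can flag the orphaned annotation instead of attaching it to the wrong place.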

* Trevor - In VE there is a concept of translating a range. There is a range before and after a transaction. You could store enough meta information with a revision to answer that question. Given an arbitrary range, after this, what is the range now? In VE, you don't have to worry about fuzz, since you know exactly what happened.
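Trevor's "translate a range through a transaction" idea can be sketched as a pure function over character offsets. This is an illustrative sketch under simplified assumptions (a single replacement per transaction, plain offsets), not VE's actual transaction API:

```python
def translate_range(start, end, edit_at, removed, inserted):
    """Map an annotation range (start, end) through an edit that replaced
    `removed` characters at offset `edit_at` with `inserted` characters.

    Positions before the edit are unchanged; positions after it shift by
    the size delta; positions inside the removed span collapse to the
    edit position.
    """
    delta = inserted - removed

    def shift(pos):
        if pos <= edit_at:
            return pos
        if pos >= edit_at + removed:
            return pos + delta
        return edit_at

    return shift(start), shift(end)
```

Because the edit is known exactly, no fuzziness is involved. A range swallowed whole by a deletion comes back zero-length, which a UI would need to surface.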

** C. Scott - There are always edge cases, e.g. deleting an entire paragraph.

* C. Scott - Hypothes.is maps URLs to annotations. If I wanted to interoperate with hypothes.is, I'd want to show all annotations properly associated with their revision.

* Trevor - If it's currently implemented in the client, it would fail if you have radically different renderings. It imposes a certain rigidity in the rendering of the content. If there's a way to have it annotate on Parsoid, that would solve it.

* Matt - Yes, there are a lot of transformations before using Parsoid for view. Flow has the same thing.

* Trevor - The view must be stable for this to work.

* Matt - Preferably, you can see annotations in both VE and view mode.

* C. Scott - Translate and ContentTranslation work at the stored level. For VE, if you want to render in preview without VE, it's a problem.

* Tom - You could use fuzzy text matching to map between different views (e.g. Parsoid, PDF) with context text, ignoring XPath.

* C. Scott - References are probably hard. If the source rendering is torn to shreds, references might be moved. Proposed edits apply to the platonic version. Proposal from LanguageConverter team that you should be able to do switches on the fly.

* C. Scott - Annotations on files should be fine, right? Maybe not, MobileFrontend changes images too.

* Trevor - Indexing every character with an ID is one extreme. Fuzzy matching is the other extreme. What comes to mind is real-time editing. After interviewing a lot of people who worked on RTE and evaluating their performance: everyone obsesses about conflict resolution, but it comes up in about 0.1% of cases.

* C. Scott - For online things like Google Docs when you have low latency, conflict resolution comes up more.

* Trevor - Even then, it happens less than you think. The clever thing is that fuzzy solves most cases pretty easily.

* C. Scott - The interesting thing is related to how often you sync up. These might not be re-translated for 100 edits, so it might be problematic for fuzzy.

* C. Scott - WikiSpeech should be for uncommon words.

* André - Tried to figure out the minimum. We needed both the range and its context.

* C. Scott - Prompt or Delayed? Which is better?

** Straw poll: Prompt: 5, Delayed: 3

* Trevor - May have to do both

* David Chan - Wouldn't assume this has to be slow.

* Trevor - I was drawing a connection between this and real-time. One of the reasons you can be a little lax is that your cursor will show the wrong thing. Visibility here might be less.

* C. Scott - My PhD advisor drew a lot of flak by claiming most failures don't matter, e.g. air traffic control.

* Matt - If you capture a lot of context, you will still have failures, but you will realize it's a failure.

* C. Scott - We try to make our failures big and obvious so they get fixed.

* Trevor - This is assuming you can detect when it's failed.

* C. Scott - Translate I'm not as worried about.

* David Chan - Not necessarily so obvious, if you just pull a random sentence out of an article, it's not that obvious. Worst translation errors are where you lose the meaning

* Matt - Important thing is that you don't associate it with the wrong place. It's okay to fail or associate it with the overall section.

* Trevor - It still has its own level of fuzziness that can be fooled by moving things around.

* Mark - I think we're right that we need both. Look at Translate, you're not going to update the other language.

* Matt - This is mainly a DB storage issue, since we're doing the same transformations either way.

* C. Scott - It's also a question of what we expose to core (which implementation details). Database storage is delayed, but we have some sort of job queue.

* Trevor - One of the great advantages of having file annotations is that then when you're doing media search, you can match the search term against the paragraph the image is associated with. You'll need this data anyway.

* C. Scott - Imagine that you search on Abraham Lincoln. You find a match, but have to port the annotation forward before showing it to user.

* Matt - It's pretty big when you consider how many anchors there are (citations, citation needed, discussion anchors), plus context, even if you exclude the actual content.

* Trevor - Concerns about pulling from multiple locations.

* C. Scott - That's not quite how MCR works. [?]

* Anders - Bottleneck is usually not that too much data needs to be stored. It's more commonly an issue of performance/scaling for requesting and displaying the data

* C. Scott - In my experience, core features never get adopted, we implement them for other things.

* Trevor - We've already seen that when we wanted image annotations, we stored a blob in the page. I agree that you relate core to a product-related initiative, but sometimes we take shortcuts instead. I think we need to be committed to an architectural move, then we can justify prioritization, but insist on doing it properly.

* C. Scott - Straw poll on excitement level: who would use this tomorrow, and who in 10 years?

** Tomorrow - Mark (Month and a half)

** 1 year-ish - Daniel Kinzler (but he doesn't know it yet)

** 10 years later - Trevor, Matt

TheDJ - We kept these doors open to make sure we can do what we need to do. MCR is not a goal of the grant, but if it enables us to achieve the goal then it can be funded under that umbrella.

C. Scott - The grant was just for Commons, so escape hatch is to do it only for Commons.

Matt - Different schema on Commons is scary.


Action items:

* Multimedia annotations as first use case

* Anyone starting annotations should at least use the Open Annotations JSON format; then we can port to a different backend later

* Next year we'll come back and port it to the framework for Multimedia.
