Talk:Architecture Summit 2014/Storage services

Session Storage Services -- https://www.mediawiki.org/wiki/Architecture_Summit_2014/Storage_services

Proposed Agenda:

3 minute lightning talk API Versioning

20 minute discussion

3 minute lightning talk Storage Services

20 minute discussion

3 minute lightning talk DataStore

20 minute discussion

API Versioning
Slides: https://docs.google.com/presentation/d/1H6JYzTR2V7RibJzuEc60EItqylGMijnh1YNMa4kokIE/edit?usp=sharing
API is broken, but partly because it is hard to break backwards compatibility. API versioning can help.
What are the criteria for deprecation of APIs?
What are the strategies for migrating to new APIs?
Sumana: How do similar applications deal with this?
Daniel: It's case-by-case.
TOR: Waiting is not a good strategy. Identify the key users of an API. The long tail will always lag until something is turned off anyway.
Bryan: Netflix up-transforms their API calls to newer versions.
Gabriel: It can work in some cases, but not infinitely.
Yuri: Having an extensible API (hooks, extensions etc) also makes that difficult. [I assume the internal API for controlling that would also be super complex]
TheDJ: Scripts and gadgets on the wikis are not being maintained. Haven't we had this problem/discussion before?
Tim: Feature request for mandatory API keys.
Yuri: Passing an API key in a GET request kills caching.
Sumana: Gadgets were easier because there's a central repo; what about a similar solution?
Antoine: Thinks the current API is a bit messy and would support a new version 2 with a better architecture. Would deprecate the existing API and not focus on backwards compatibility.
We are starting to hit the time limit.
 * There is no one best practice
 * Often dictated for deprecation
 * Community "vibe" is conservative in this area
 * can use a combination of mechanisms (versioning, feature flag), depending on situation
 * content api may be a rewrite but we want to reuse
 * Siebrand comments that this is a communication problem too (hey, we changed something)
 * How do we get GOOG to switch from XML to JSON?
 * Easier with large users, harder with scripters/botters
 * Just because you get a deprecated warning doesn't mean it will be respected
 * Mark made a joke about inserting sleep into deprecated API calls. :) (was he joking?) :-) Mark was only half joking :)
 * You can also rewrite api calls internally, change params, redirect etc
 * Use the api mailing list. Pick a time frame (1 year, etc)
 * The Google approach is to kill services (but it's not very nice)
 * There is more logging being added in this area
 * Yuri says there is some effort to make user-agent fields with contact info (required?)
 * Require email registration
 * Fixes the technical challenges of user-agent modification
 * You can write Varnish rules to fix that (or use HTTP headers)
 * How would you prevent someone else from using the key
 * Tim: API key identifies developer of gadget
 * Brad: What keeps a script kiddie from using the key of a popular gadget?
 * Yuri: Do we force people to register? Isn't that against some policies and practices?
 * ALL: we should continue on mailing list
 * Brad: Similar to OAuth keys
 * Sumana: If you are going to make a policy change then you need to ensure trust and prevent misuse/abuse
 * Issues with copying gadgets from one wiki to another with API keys?
 * TheDJ: create less friction by combining change
 * Yuri: If extensions are in our git repos we can search for problem use cases and fix them / notify developers
 * When there are security fixes related to gadgets, who is doing the maintenance
 * Yuri: You can invest in support or solve it by moving forward on the server - it's a trade off
 * Yuri: Current api is very big. It would be a lot of code to get rid of.
 * Antoine: Freeze old api and only add new features to version 2
 * Brion: How does the "API" rely on the internal implementation? Isn't it the implementation of the "API" that has this dependency?
 * Antoine: Would like to see official client libraries in multiple languages that encapsulate these details
 * Tyler: +1 for client library. build an SDK.  There are also tools which generate API clients in multiple languages based on an API spec
 * There are also API calls which mirror / duplicate functionality of the web interface, there is duplicated logic
 * Yuri: orthogonal issue, implementation detail
 * Brion: Why aren't internal and external API the same?
 * RobLa: Sometimes there are very impactful breakages with slow moving clients (eg Apple desktop dictionary). Are there lots of slow moving clients like this?
 * Brion: There are OAI feeds, some other custom software users
 * Gabriel: comment about duplication of logic between Web UI/Special Page/API code. Refactoring core code into service layer addresses this issue.
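
The "up-transform" idea Bryan mentioned, together with the notes on rewriting API calls internally and on deprecation warnings, can be sketched roughly as below. This is only an illustration under stated assumptions: the parameter names, the rename table, the v1 default and the `up_transform` function are all hypothetical and are not taken from the MediaWiki API.

```python
# Hypothetical rename table between an imagined v1 and v2 of an API.
# None of these parameter names come from the real MediaWiki API.
V1_TO_V2_RENAMES = {"titles": "page_titles", "format": "output"}

# Assumption for the sketch: v1 implied XML output when none was given.
V1_DEFAULTS = {"output": "xml"}


def up_transform(params, version):
    """Rewrite a request made against `version` into v2 form.

    Returns (v2_params, warnings) so the caller can answer the request
    with the new code path and still tell the client what to change.
    """
    if version == 2:
        return dict(params), []
    if version != 1:
        raise ValueError(f"unsupported API version: {version}")

    upgraded = {}
    warnings = []
    for key, value in params.items():
        new_key = V1_TO_V2_RENAMES.get(key, key)
        if new_key != key:
            warnings.append(f"parameter '{key}' is deprecated; use '{new_key}'")
        upgraded[new_key] = value

    # Make the old implicit defaults explicit so v2 behaves the same.
    for key, value in V1_DEFAULTS.items():
        upgraded.setdefault(key, value)
    return upgraded, warnings
```

As Gabriel noted, this only works while every old call can still be expressed in the new version, so a table like this would have to be retired at some point rather than grow forever.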

Data Store
http://mediawiki.org/wiki/DataStore
slides: https://docs.google.com/presentation/d/176xp-1ccpikLy043ESp5jeFL2X0sChS43T6TP4XqU48/edit?usp=sharing
There are many use cases for storing simple blobs of data.
Very simple get/put API, schemaless, backend agnostic with support for migration by lowest common denominator restrictions.
Sample code link: https://gerrit.wikimedia.org/r/79029
Brion: What about key namespacing (per user), is there anything enforced by this proposal?
Max: No, that's up to the developer, use common sense. There is also one default store.
Question about BagOfStuff persistence with some discussion of how that works.
Doctrine also has a key/value store.
Max: That is a big external dependency, also NIH. (Core does not have a clear external dependency policy yet)
Brion: Seems like a good idea to have a simple interface like this in core.
Tim: Prepared to accept this RFC and discuss implementation details on mailing list.
Max: Wanted to make it an RFC to get more feedback from developers.
 * BagOfStuff could be cleared accidentally by a cache clear
 * BagOfStuff doesn't have a "clear" method
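
A minimal sketch of what a "very simple get/put API" with developer-side key namespacing could look like. It is not the interface from the DataStore RFC or the Gerrit change: `BlobStore`, `MemoryBlobStore`, `user_key` and `migrate` are hypothetical names made up for this illustration.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Hypothetical lowest-common-denominator store: only get and put."""

    @abstractmethod
    def get(self, key):
        """Return the stored bytes for `key`, or None if absent."""

    @abstractmethod
    def put(self, key, value):
        """Store `value` (bytes) under `key`, replacing any old value."""


class MemoryBlobStore(BlobStore):
    """In-memory backend, standing in for a real database or service."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("values are opaque blobs and must be bytes")
        self._data[key] = value


def user_key(user_id, name):
    # Namespacing is a convention of the caller, not enforced by the store.
    return f"user:{user_id}:{name}"


def migrate(keys, source, target):
    """Copy the given keys between backends using only get/put."""
    copied = 0
    for key in keys:
        value = source.get(key)
        if value is not None:
            target.put(key, value)
            copied += 1
    return copied
```

Because every backend is limited to the same two operations, moving data from one backend to another needs nothing backend-specific, which is roughly the point of the lowest common denominator restriction. The caller has to supply the key list here, since listing keys is not part of this sketch's interface.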

Storage service
slides: https://docs.google.com/presentation/d/1H6JYzTR2V7RibJzuEc60EItqylGMijnh1YNMa4kokIE/edit?usp=sharing
Gabriel:
Brion: Is this an internal service for MediaWiki, with things like auth enforced at the application layer?
Gabriel: Authorization should be as late as possible.
Tim: Surprised to see title strings in the keys instead of page ids.
Gabriel: Just an example for the slides.
Tim: Points out that things like page renames/moves were very expensive in early versions of MediaWiki.
Gabriel: Actions like renames should be non-destructive, graph operations to allow for finding old versions. Just create 1 new node for an action.
Yuri: Is the MediaWiki engine then the rendering engine for content that comes from this data store?
Gabriel: That's the approach which will be used for Parsoid, there may be other use cases.
Brad: If you can see versions from the past, what deals with vandal actions?
Gabriel: Needs to be implemented but has ideas of how to add support.
Mark: There is some overlap with Max's proposal.
Gabriel: The interface here is more complex to support parallel operations, and there is the external API use case.
TheDJ: Are you proposing a new core storage mechanism or an additional service?
Gabriel: For use in Parsoid, may be interesting for others.
Tim: What's the status - is this integrated into Parsoid, what is the general application use case?
Gabriel: Not merged yet but the work is being done. There are other general use cases, yes.
Brion: Explaining some additional use cases, doesn't seem to overlap with Max's use case.
Tim: Springle making noises about the size of the revision table (sharding etc) and links tables. Would this be good for that?
Faidon: Springle seems open to Cassandra for these sorts of use cases.
Gabriel: Wants to focus on the PHP interface first. [ I think there was some side chatter about the REST interface itself, vs the storage system ]
Antoine: Can't we save to Swift? We are getting to have a lot of service dependencies (Swift, Elasticsearch, ...)
Gabriel: We need an interface abstraction and use the best technology that exists behind it.
Ops can figure it out. :)
Gabriel: We are picking things based on the features that they have (Cassandra is not good as a cache).
Bryan: The PHP service interface can be a more generic entry point to many backends and this wasn't as clear in the presentation.
Gabriel: Yes, it's generic, you can map a namespace/prefix to a backend for instance.
Owen (that's me!): You don't necessarily have to choose between these two interfaces.
Gabriel: Sure, and one can be an implementation for the other.
TOR: So when can we have this?
Gabriel: Rashomon will be tested in production soon. There is still some implementation work to be done (buckets, auth, revision store). 2 months?
TOR: What about the testing with Parsoid?
Gabriel: Should be testing next week.
Brion: There's a lot of things to like here, potentially for storage of all kinds of things.
Gabriel: There are 3 RFCs here. Proposes to leave out Rashomon for now and focus on the PHP/REST interface first.
Brion: The parallel request feature has been sorely missing, likes the interface for parallel REST.
Gabriel: Max's data store is definitely a possible "local" implementation.
Tim: Accept Max's RFC. Consider Rashomon when it comes up later. Accept PHP client interface.
Brad: API RFC got good feedback, will continue to refine on mailing list and wiki.
 * Need revision store for HTML, JSON and wikitext for Parsoid
 * Wants a storage backend abstraction, and to be able to share storage implementations
 * Scalability, Reliability of course
 * Wants to have a public content API (exposing this directly also affects some implementation details, like links, domain names etc)
 * Sample implementation in JavaScript https://github.com/gwicke/rashomon (Cassandra)
 * Performance details in slides
 * Example of simple versioned API endpoints
 * Implementation supports compression (down to 16%-18% of original size for wikitext) and immutable writes for revisions
 * Can work as a more generic storage bucket and also support more specialised use cases (counters)
 * Has a PHP service which talks to it, and supports parallel/batching of curl requests
 * Rashomon is missing auth, and bucket creation, PHP interface is not yet implemented (?)
 * What are the next steps?
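
Gabriel's remark that "you can map a namespace/prefix to a backend" could look something like the sketch below. It is an illustration only and is not taken from Rashomon or the proposed PHP interface: `DictBackend`, `PrefixRouter`, `mount` and the example key prefixes are hypothetical, and longest-prefix matching is an assumption about how overlapping prefixes would be resolved.

```python
class DictBackend:
    """Stand-in for a real backend (Cassandra, Swift, a local store...)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class PrefixRouter:
    """One generic entry point that forwards each key to a backend."""

    def __init__(self, default):
        self._default = default
        self._routes = {}

    def mount(self, prefix, backend):
        # Register `backend` for every key starting with `prefix`.
        self._routes[prefix] = backend

    def backend_for(self, key):
        matches = [p for p in self._routes if key.startswith(p)]
        if not matches:
            return self._default
        # Assumption: the most specific (longest) prefix wins.
        return self._routes[max(matches, key=len)]

    def get(self, key):
        return self.backend_for(key).get(key)

    def put(self, key, value):
        self.backend_for(key).put(key, value)
```

With something like this, callers only ever talk to the router, and a simple local store such as Max's could be mounted under one prefix while a revision store sits under another.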