Wikimedia Developer Summit/2018/Advancing the Contributor Experience


Advancing the Contributor Experience

DevSummit event

  • Day & Time: Tuesday, 9:40 am – 12:50 pm
  • Room: Tamalpais
  • Facilitator: Matanya
  • Notetaker(s): Roan, JanD

9:40 - 11:10 Part 1

[15min] Intro

  • This is not a complete list and we will ALL be disappointed :)
  • Let's not get stuck on what specifically we need to build, but on what we should be considering building
  • Contributor experience: Contributor: Editor, Curator, Moderator
  • Not talking about: editor safety, MCR (Multi Content Revisions)

Strategy: "By 2030, Wikimedia will become the essential infrastructure of the ecosystem of free knowledge, and anyone who shares our vision will be able to join us."

Questions to answer:

  • List things you think that are part of this topic in the etherpad
  • Is it a next 1 year, a 3 year or a 15 year issue?
  • What is the smallest step with the biggest return that we could take?
  • Is there an overlap with anything else we are doing that we can use to reduce the cost?

In every block we start with 5 minutes of context, then answer the questions for 5-15 minutes, then discuss (YES, I have no idea if that is going to work).

Derk-Jan: Strategy is a bold statement. Things that came up when discussing this in Berlin were on-boarding people and making sure they have a path to becoming a successful contributor. Editing beyond the desktop, not just mobile but also futuristic things like speech assistance. How can we distribute the effort, get others involved in the system. Collaboration came up a lot. New types of content/projects. Last 15 mins, decide what the 5 most important things are to you that we need to get to in the next 1-3 years.

10:00 [15min] Designing for contributions / Before we build discussion

Pre-input: JanD: New Editor Experiences ( ), enabling better research opportunities (e.g. ). Let's have empirical, shared things to discuss! (I think we have too few shared points of reference (… like that for the academically inclined))

TheDJ: I want to get your input on what we're missing when we build technology. What do we as technology need to know before we can build something and where are we severely lacking. A lot of things I heard are "we want to build something, technology needs to deliver it, and tech feels they don't have the right background information to be successful reaching that goal".

CScott: There seems to be a gap between what we're calling "product" and the developer side. Sometimes these discussions come from a dev-oriented focus like "we need better support for video". These need to get resourced, which involve business cases etc which devs aren't that good at. One of the problems is bridging the gap between a good technology and a product with metrics that can be managed successfully.

Volker: From my experience, there has to be a get-together of PMs, devs and designers / design researchers in the very early stages, to fully understand what user needs are, then try to come up with solutions that aren't just tailored to loud voices. Important to broaden input early on. QA at the end to make things a smooth experience.

Research for changes


  • Shared sense missing! People wish for button X, but don't say what's behind it. One way to get there is shared ways of communicating (just saying you want X will not have an effect). The other is using research that already exists. Many people seem to talk about "new editors" but few seem to refer to the study.
  • Research methods. Micro-surveys, qualitative not just quantitative. Jonathan Morgan, Abbey do work here, but we can do more to gather qualitative data
  • Methods of working with the community missing (one way: Participatory Design. Have shared obligations between staff and community to learn from each other)
  • Let's be more proactive! Design is not a linear process. Sometimes a quick, preliminary solution can work well. It might be great to just put out a prototype first. It is hard to know the truth™ beforehand.

TheDJ: Corresponds with something someone said yesterday. Pau made a design for searching Commons metadata. Liberating for people to let go of what we have, see what we could do. Argue about a design, why it could work. Don't do enough exploratory design, figure out why people respond to certain things.

Mingli Yuan: Paper from Bret Victor from the MIT Media Lab, Media for thinking the unthinkable: You can interact with the graph to gather the meaning of the equation quickly. We have, a popular machine learning documentation website, they implement a similar interactive UI. This kind of thing is very interesting, adding an interactive feature into the editor. And also we have Vega, which is a visualization grammar.

Leila: What I see is that the path from quantitative or highly qualitative research to product development is not really there. In some cases transformative research [="quantitative research result drives product design"] can be turned into a product, but in many cases it doesn't work. For many of these problems we need transformative research. [JanD [unspoken]: Interesting! Where does it get stuck?!]

TheDJ: I see that often. Problems with contributing and the social problems around it are so complex we can't go forward without that.

Aaron: There's a lot of evaluative research. Once we already know what we want to do, we can tell if it's working. But transformative research [="quantitative research result drives product design"] that Leila is talking about is needed. [JanD, unspoken: yes, research for evaluation also has a far longer HCI tradition, or at least has been more visible for a long time]

Adam B.: +1 to what Aaron said. Part of that is these people talking to each other, more frequent communication. Nirzar's design team is really interested in working on [???] problem, trying to understand transformative research [="quantitative research result drives product design"]. Toby mentioned yesterday that mobile editing is going to drive a lot of Audiences' work. API-orientation is something we need to be thinking about. Highly likely that mobile editing is about interactive apps (native or web). On the quantitative side of things, we haven't quite figured out how we want to measure the end-to-end user journey. So when looking at progressive on-boarding, what's a privacy-sensitive way that we can look at the whole lifecycle? Third area is [???] and messaging. OS-level notifications will be an important part of people's workflow, when they do things like discussions. Ultimately PMs will set the roadmap.

Tpt: When doing studies, don't only study large communities, but also smaller ones and sister projects. That will make products better and more future-proof. Different communities need different things. [JanD, unspoken: Good point. It can be very difficult, needing translators etc. BUT a) that does not mean we should stop trying and b) we might be able to involve our communities better in supporting us. Particularly small communities seem to be super willing to help and have a chat with a user researcher!]

DanA: Apologize for perceived lack of capacity from Analytics to answer certain questions. We've said "no you can't do that" because of privacy or we don't have the data, that's built the expectation that it's never going to be possible. Now more things are possible, but people have stopped asking questions. We have to work on getting more creative about what we ask, getting deeper into cross-project experience and end-to-end user experience.

TheDJ: Why don't we have a researcher / design researcher that analyzes what a talk page and discussion page is currently, before we even start (re)designing a discussion system. Why isn't there someone whose sole purpose is to research and document the current system, why it works. That can feed into improvement work. If we had that information, we'd be able to create much better tech.

Volker: We have moved in that direction at WMF. Also the duty of the devs to say "we are lacking this research, we need it to build a good solution". We shouldn't fall into the idea of we're going to build and not rely on research. Call it out. [Q: JanD, unspoken: Interesting! So you think that PMs don't ask/care (?) for research, but devs do?]

Leila: When you talk about design research for the type of problem you're looking at, you need a few researchers on top of that to look at the problem. Social systems need to be modeled in terms of how people arrive at consensus. [JanD, unspoken: I would love this research, on the other hand I think already some smaller attempts would help]

TheDJ: Can also outsource that. [JanD, unspoken: Indeed, on the other hand, we should also learn as an organization, so it is a trade-off]

JanD: Could the term "transformative research" be changed/clarified/defined? The use in the room does not seem to match how that term is defined on Wikipedia. [JanD unspoken: A previous version interpreted that I wanted to change the article according to our idea of it. That was not what I wanted to communicate] As for data, people often don't lack data, they lack sense. Doesn't have to be Truth(TM), but should be based on data, plausible, traceable and sharable: We should get better at creating sense.

Victor Zainan from Google: Wanna echo a lot of ideas including API-oriented design, allowing users to donate their private data for a research purpose. New ideas: we now have a lot of analytics data, let's make those more visible to end users. [janD: I think libreoffice, mozilla are allowing this]

TheDJ: Does that mean showing "did you know that 30% of users close the editor after --"

Victor Z:

  • +1 to transformative research needed, and make analytics more visible to allow for feedback to contributor and contributing community. Gut feeling and sense are built through seeing a lot of data. For example, what percentage of users scroll through the entire page. It's a pipeline, some people become developers of the sense. To make that work, we need to throw that data in at the very beginning.
  • We need to recognize the contributors are in different stages, and they have different use cases. We need to design product or product features accordingly. Photoshop is not the same product for people that need MS Paint.
  • We need to have roles of Product Manager, User Experience, Researcher in the development community of Wikimedia Movement, they have different problem sets, and need different skills.
  • +1 to API orientation design.
    • +1 to allowing users to explicitly donate private data for research purposes. +1 (and to anonymizing suitable aggregate data)

Design hooks to enable smaller groups & subcommunities? and A/B testing of many ?

10:15 [30min] Onboarding, getting experience, learning the wiki way

  • Dealing with complexity
  • Learning
  • What can be done in technology, what is a social issue? [janD]

TheDJ: This came up quite a bit during strategy sessions. Right now the levels are you're a reader, an editor or an admin. Everything in between is very muddy, and even these three levels are not very distinguishable. We've been adding a lot more complexity, makes it difficult for people to learn things. Would like to discuss what we can do with tech to assist in that. What meaningful changes can we make here to help people not be as confused from being a beginner to growing into another role. [JanD.: there is probably quite some research on community roles, dunno where it is but it is there :-) ]

DanA: I found in growing up with the web that as soon as sites added a home page for me, a user page, where I see notifications and everything that's going on in the world, then I can venture out and do stuff. Helped me on StackOverflow, Facebook, everywhere that was a concept. We don't have a good version of that. When I came at that in a newcomer [...], you were talking about roles, a complex ontology of responsibility, newcomers are right to be scared. 100k people come in, most of them leave scared. A place to call our own, to interact, structured conversations with good abuse prevention mechanisms.

TheDJ: Is this a 1, 3 or 15 year problem?

DanA: If by 2030 we have proper user home pages I think that'd be a great success. So 15.

Aaron: Research [yet to be published :(] coming out of citizen science communities, paper about people who have weather measurement devices at their homes. One of the key results was people who stuck around in these communities were already doing measurements and talking to their friends. So they were essentially already a community member before they joined. A lot of the people who show up on WP and are successful are used to doing what that requires, they were sort of already Wikipedians. When we consider making WP look like the rest of the internet, we should consider that we want people to show up and feel like it's familiar, a place they're already part of. Because that's what works, not because we should look like anybody else for any other reason.

JanD: That goes very well together with existing research. – Framework: (people get into community culture by making small risk-free contributions that expose them to community practices) [JanD] Some of our tasks are automated away by bots, while they would be great for newcomers to get their feet wet with.

TheDJ: Low-risk small contribs are a good point. That's something we say on the help desk. Don't start with writing a new article, start with fixing one sentence in an article. We say that all the time, but people have the innate desire to start a new article about a person they know. That's what they came for, and that creates problems. How can we divert them to a direction that's less risky but still grabs their attention. That's an important question to answer in the shorter term. [Relates to new editors research: people have a mission. Most starts on wiki are NOT the imagined "I slowly get into…" but people often have an idea of what they want to do. This is often not "glorious", a very active editor I know started with a track list for a music album.]

SJ: Supporting small groups who interact with each other. Right now everyone interacts with the same amorphous community. Finding ways to have enclaves that have fast interactions with each other but slower ones with the rest of the network. We see that between wikis, but also worth thinking about within large wikis.

TheDJ: Sort of have that with wiki projects.

SJ: Looking at Wiki projects that have succeeded is a great example. A good wikiproject is a delight, the other members are there to help you interact with policy problems etc. But it takes a while to set up.

DanA: An important thing to talk about is that all tech is an abstraction on a common goal. Some people get right away that wikitext is a means to the end of making knowledge available, but some people don't. Identify how many people don't, and humanize it. My favorite social networking experience was when StackOverflow published reach. "Your questions have been seen by 1M people". Made me very happy. That kind of humanization is critical in our project, and there's not enough of it.

TheDJ: Let's try to get one thing that we could do in 1-2 years that would move us in this direction.
DanA: That ^^ we could do now, very quickly.

JanD: Getting microsurveys working with qualitative data could help us get a better understanding of a diverse community. The other thing is actually putting our assumptions together with traceability, citing our own studies on wiki. Making known where our assumptions come from.

Tim: Enable NewUserMessage. When someone creates an account, you can give them an automated message. Enabled on 50 WMF wikis, not on enwiki. Was just looking through the new user creation log, the only messages new users are getting are from ClueBot. In 2002 when I created an account I got a message within 10 minutes from Mav149, assumed it was a bot but it was a person, was nice and a point of contact, supported my admin application the next year. It's the simplest thing and we're not even doing that.

James: The reason we don't turn New User Message on is because it's terrible. We have data saying new users don't understand it's a talk page, because they don't understand what talk pages are. So all they see is the system yelling at them with a bunch of links to policies. Users don't understand talk pages until their 3rd or 4th interactions. They just ignore new user messages.

Max: New users claim they didn't notice new messages even when we had the orange bar of doom. Maybe when people receive a new message we should show them a modal dialog telling them. [JanD:Could try, sounds like a usability question, not too difficult]


DanA: Haven't seen data, but this is an example of something useful. What James says doesn't apply to Teahouse. The intuition that Teahouse is successful doesn't jibe with newcomer welcomes not working. Doesn't mean the research is wrong, but should dive in to find out why it doesn't work. Iterating on New User Message to get it closer in feeling to the Teahouse.
James [unspoken]: Well, yeah, Teahouse is a totally different thing, this is an apples to swords comparison. :-)
Dan [unspoken]: The similarity I see is in the chronologic experience of newcomers. They're both essentially welcome messages. Maybe the answer is that we need to turn New User Message into Teahouse, and that New User Message doesn't work at all, but my meta point was that when we have similar things with opposite outcomes we should drill down more.

Leila: Going back to what kind of thing we can do. If you look at the timespan of 3 years, there's this project about contributor diversity. 40% of my time. Goal is to understand some of the potential causes for people not joining the project or leaving early. Specifically its impact on minorities in the project. Designing frameworks that help women stay longer, focusing specifically on the issue of confidence. If we can build early confidence, they'll stay a lot longer. Don't know if this applies to Wikipedia, but we need to test. These cycles of research and experimentation need to happen a lot faster. With the resources we have, we can't move faster.

Adam B: It seems people assume that if we give a hook to potential new editors, they'll join. But I wonder from a tech standpoint, does the existing tech stack encourage our volunteer community to help curate that experience? Or does it make more sense for highly organized groups like WMF and WMDE to organize highly curated experiences?

New contributors editing

TheDJ: Just adding it will not cause more people to suddenly start editing. Everything has become so complex, can we help people find their way faster and better so they become more optimal contributors. So the amount and quality of their contributions is the multiplier, rather than actually having more contributors. The complexity of what we've built and the social complexity holds back the contributors we already have. Instead of finding more editors, maybe we need to fix their ability to contribute to the best of their ability. If you don't have to go through the experience of writing an article and having it deleted, that's a month not wasted. [janD: "Everything has become so complex" What is that referring to? Technical, social complexity?]

JanD: New editors research found that people have a mission, and it usually isn't fixing spelling mistakes. That's a double bind. New editors research says it's difficult, legitimate peripheral participation says it's necessary. There's a myth that people start with a phase of very small edits. There's a page where people write where they started. DerHexer started by putting in an album he liked [Hope I get it right, read here:]. More cases like that could help us get a better grip on how that beginning actually works for people.

Timo: Re NewUserMessage, one way that it can be improved is to show it after you've created your account and before you enter the wiki, not on the talk page. Make it not too long and complicated. What I like about Dropbox is an animated introduction where you click Next twice and you're ready to go, and it tells you how to find it back. Instead of NewUserMessage we could have a GuidedTour every time something happens, saying "this happened, this is how it works". "Here's the messaging center". nlwiki has a mentoring program, uses NewUserMessage but then you can sign up to have a mentor assigned. That was useful, did that for a while, users ask questions, Teahouse-style but on your own talk page. Tpt wrote something about microtasks, Wikihow does this too. Categorize microtasks and offer them to people. Provide a way to make edits without being on the edit page. Timo: Dutch Wikipedia has a mentoring program similar to what Tim describes, but linked from an automated new user message. (microtasks) could also be automated through something like WikiHow's Community Dashboard with dedicated workflows/interfaces for micro tasks that the user is both authorised to do and speculated to be capable of (more types of tasks exposed with age). +1

TheDJ: You can personalize that to get to Dan's idea.

Timo: We've seen guided tours in VE, but still too much to grasp. There are so many things you need to understand for what you want to do.

CScott: Concrete suggestion for Teahouse, which is real-time collaboration. Would be a great thing to integrate into the Teahouse experience. You have an experienced mentor who can sit with you and help you with your first edit. Real-time collab would help a lot with on-boarding. Some pushback from experienced editors saying they're not going to be interested, but we're doing it right now. In a mentorship situation, it might be even more relevant. I've also talked a lot about fork and merge support. If a common new editor experience is getting stuff immediately reverted, then instead of that getting it pushed into a fork and working with a mentor to fix it so it can get merged, that would be a better social experience for them than feeling their contribution just got thrown away.

[JanD: [unspoken] The need for personal interaction is supported by] [JanD: [unspoken] I'd be careful with forks… a very software-y idea that might not transfer well to wiki]

  • [cscott] We already have the Draft namespace. We already have folks forking into their user space. It's just improving this support (esp merge tools!) and making it less ad-hoc.
    • SJ: Design so that the easiest way to make the first dozen edits, or your first new page, is in a way that doesn't trigger reversion/deletion. (imagine flagged revs that show up for you in your personal history, but are treated specially w.r.t. visibility, peer review) Don't make it easy for newcomers to edit in ways that are often contradictory. Any version of fork-merge would make this straightforward.

Zainan Victor: +1 to Teahouse, in addition to having a mentor, having a class of newcomers for you to discuss with. A class of peers who don't have to reach as high a level of skill, people who suck as much as you, whom you can talk to. That will be a fun and strongly bonding social experience, and will also boost their self-confidence. Having a mentor can still crush your confidence because that person is too good.

Another issue is that we mostly call contributors editors, which sets too high a bar. WP is an encyclopedia. If you think about the traditional way of editing an encyclopedia, there are stages. People who nominate topics, collect info, people who compose and start a draft, people who edit and revise. The biggest reason we don't have enough contributors is the high bar: we're requiring everyone to be able to do everything. In addition to CScott's git-fork-merge idea, I'd suggest having different stages in the collaborative editing workflow. You can start a stub of an article but that's still a high bar, I'd be scared to create a stub and possibly have it deleted, it's depressing to a newcomer. A lot of times people have ideas that the existing contributor base could not even think about. We can have people nominate topics, do collection. You only need a desktop when you need to carefully fine-tune what you write. But just submitting knowledge like images doesn't. We could throw that into a bucket and put it in an article later, or let whoever has good drafting skills work on it. For example, I'm bad at English grammar but I can contribute other topics or pictures or facts or references. Other people are good with wikitext. I'm good with programming, but infoboxes are scary. If we can have a staged workflow and let people contribute with their specialized skills or resources, that'd make it easier for people to contribute, and grow our contributor base.

  • [cscott response to Victor] the fork/merge model would give new editors a longer more deliberate process to prepare/improve their first edits.
  • Niklas [after the fact]: The word editor in English has meanings besides "making changes". This is not the case in all languages, for example in Finnish.

Tpt [unspoken]: A system where local communities could provide to the new users microtasks to start. Not something based on ML or something, just a stack filled by the community +1 It could also be tasks on sister projects: Wikisource proofreading is much easier for on-boarding and wikitext learning than writing articles.

10:40 [30min] Editing beyond the desktop

Micro contributions

low-context quick actions suitable for e.g. mobile devices.

DJ: How to make micro-contributions, how to do editing on mobile to begin with, lots of ideas suggested, now voice systems are coming into the picture, directly quoting from Wikidata. We know we need to change a lot about how we contribute because writing a full article like we do on desktop is not going to scale to other areas. Microcontributions come up most often, for example, image suggestions, or queues where people flag offensive images, bad spots in articles, or like echoing Timo, just nominating something for deletion is a contribution. This currently requires lots of complex steps but they could be just nominations that are added to a queue and reviewed. Split/move subsets of these queues into discussion.

Aaron: Just wanted to offer the observation that, studying wiki processes, it becomes clear that flag/check/process, things that aren't writing an article from scratch, are the vast majority of work that people are doing. Sometimes talking about microcontributions we feel like we're talking about new types of contributions to attract new users, but really it's the ice under the water of the iceberg, it's the majority of the work.

Jan D:

  1. [response to Aaron] The double bind I see is: It seems to be great to improve e.g. watchlist. BUT also people find it super important and have social practices, maybe even wrapped around a bad UI. Currently I think we lack the social structure and theories to tackle this.
  2. Second point, some of these things would work really well for Wikidata. BUT: Wikidata currently is hard to use on mobile, and I think we also lack easy to use ways for people writing new tools on top of WD.

Aaron: [direct response?] trying to figure out how to get a machine to help trim down a backlog of work, wonder if microcontributions fit into that, two things obvious corollary... too fast :) Microcontributions might be low quality and require secondary review. Maybe these folks are not experienced editors or can't see a bigger picture and apply the same judgement of people editing wikipedia. Wonder how we could apply ...

Ariel: Imagining something in the future. User pulls up an article and they're like "oh that's wrong" or "oh I know something there that's missing". Maybe you can click the button and it asks you what's wrong, you fill in a form and it's added to a queue. It's a quick way for someone to get started. We can ask what their sources are, the information would become more useful. Then people sorting through these contributions could verify and merge it into articles.

DJ: One thing we should consider is that the community has pushed back and said now we're getting so much information, so many new queues of work that it's overwhelming. So algorithmic and automated support to get through these queues is critical

[JanD, unspoken: Relates to too much data, too little sense. (I think it is a Weick quote, but I can't find it)]

    • S: Design workflows and work queues that connect new users with one another. Let streams of contribution & contributors support/review themselves.

Darren: Would like to propose a step further. If you go to Google and search for stuff, there's already some content there, you might not get that click-through. Does the feedback people give get back to us?

Ariel: Why shouldn't it? We can make sure it does.

Subbu: Dumb question. Since you're talking about microcontributions, is there actually a mapping of all the ways people can contribute on WP? It sounds like there are a lot of different kinds of things people do on a wiki. Is there a mapping/classification?

DanA: No. We have a big long aspirational project for it.

TheDJ: Someone told me about a paper that had a long list for these things, written 4-5 years ago.

James: When we talk about wikis, and especially advanced ones like English, German, and Dutch Wikipedias, we haven't given them any tools at all. We've given them categories and templates, and they built stuff with that. Bad because users have to cope, but good because it's so freeform that people have been able to build their own workflows and deal with their issues. They can change it with one edit, immediately change their deletion workflow or new user welcoming process. People get lost, it's terrible for users, but it's great for the community at large because they can change their mind overnight. As soon as we build them a tool, the only way it can really scale is if we build that process into the wiki. We're not designing the process, we're building a workflow management tool from scratch. That's an epic amount of work. People say we'll just add a help bubble, but that locks us into the current process.

TheDJ: Think you're right, we have to deal with it. What's important for microcontributions in the long run is that it's not that opinionated. Phab project tags are everything and nothing and that's very useful. They give you a lot of flexibility and at the same time they provide structure. That's important for microcontribs, should be queues and piles that go into other queues that can be generically consumed by an algorithm or bot or whatever, and then you go and do actions. But the basic concept should be so limited that you don't get stuck with a specific solution to one problem, that you can add new solutions for the same problem. Even if we don't think that's a good idea, but to give the community that option, to figure out what the best deletion process is for enwiki, large wiki where not everyone knows each other, versus a smaller wiki that's understaffed.

[Subbu unspoken] I think James makes an important observation. I think what that points to is that if we are going to build something to assist users, it needs to be a platform / tooling that enables them to build and manage their own workflows vs building specific workflows for them .. i suppose that is what TheDJ is also saying?

Leila: +1 re having a mapping. Tough project. We never really thought carefully about how we can use microcontribs as signals for improving some of the aspects of the content we have. Some readers from Nigeria, what kind of information can we direct them to give us to improve articles. That's one type of microcontribs that's usually left out. 3 years ago we did some experiments with microcontribs, receiving edits from readers who weren't logged in. Needed 5 responses to make a decision [...]. This notion of one account, one edit, doesn't really scale.
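
[Editor's sketch, not from the session] The quorum idea Leila mentions (several reader responses before a decision is made) could look roughly like the following. Only the figure of 5 responses comes from the notes; the class name, the agreement threshold and the one-answer-per-contributor rule are assumptions made for illustration.

```python
from collections import Counter, defaultdict

QUORUM = 5  # the notes mention needing 5 responses; everything else is assumed


class MicroContributionQueue:
    """Hypothetical queue: collects small responses per task and only
    reports a decision once enough contributors agree."""

    def __init__(self, quorum=QUORUM, min_agreement=0.8):
        self.quorum = quorum
        self.min_agreement = min_agreement  # assumed threshold, not from the notes
        self._responses = defaultdict(dict)  # task_id -> {contributor_id: answer}

    def submit(self, task_id, contributor_id, answer):
        # One response per contributor per task; a later answer replaces the earlier one.
        self._responses[task_id][contributor_id] = answer

    def decision(self, task_id):
        """Return the agreed answer, or None if there is no quorum or no clear majority."""
        answers = list(self._responses.get(task_id, {}).values())
        if len(answers) < self.quorum:
            return None
        top_answer, votes = Counter(answers).most_common(1)[0]
        if votes / len(answers) >= self.min_agreement:
            return top_answer
        return None
```

Tasks that never reach a decision would still need a human review path, which ties back to the concern about overwhelming work queues raised earlier.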

TheDJ: Important to pick the right things, especially as we start out, otherwise you run into those problems. As you say, a microcontribution isn't editing one sentence in an article. It can also be a social action in the community, nominating someone for the next level of authorship, like extended confirmed or adminship. Much easier to start with some of those things than it is with building content.

SJ: What about inviting wiki projects to identify microcontributions within their project. Then notify the wikiproject when someone takes up one of those tasks. This also links newcomers getting started to a community interested in reviewing the first work they do *because they care about the content* not because they're patrolling for spam. Different context ---> different tone. +1

Adam: Glad Leila was talking about this, need a notion of quorum on microcontributions. What I want to discuss is the assistant metaphor. AFAICT we don't have the technical capabilities to support the assistant metaphor (Google Assistant, Cortana, Siri). That's a gap, we should probably be thinking about what's our hosting architecture, how do we transform content or give editors the ability to mark up things so that it'll work with those technologies. Notion of collecting audio from users, there was a lot of interest in oral citations in the movement strategy process. AFAICT we lack the capabilities to parse audio so that it will be machine-readable and discoverable. A bit blocked on that stuff right now. There are providers who have these capabilities, some open source projects. Needs to be discoverable so editors can find it and put it into articles.

TheDJ: Have been playing with Alexa stuff lately, find it interesting. Could we do microcontributions with a skill? Some of those things should be possible. What you said about recording audio and oral history, I've looked into those things and a lot of it has to do with what we think information is. Also think that these are not problems that we necessarily need to solve ourselves, but how can we move the platform towards something so we can use this information from another party and incorporate it with our values and our concepts of what information is. How can we expose that info from a 3rd party that we trust and incorporate that into our corpus. Issues with privacy and embedding, what is a revision in an external system, but I think eventually it's going to scale better than needing to do voice to text on an MP3 file. Would like to talk about that in the collaboration discussion after the break. People are interested in experimenting with new forms of content and new forms of authorship. Seems like we can't do that in the next 30 years, which is a shame because it connects with our mission.

[SJ: Looking at the prompt -- Are we playing w/ voice / vr / assistant contribs on some platform? Are there weird hacks out in the wild? That sounds awesome; would test. ]

[cscott, unspoken] A bot which looks at edits to wiki X, finds parallel texts on wiki Y, and suggests an edit to wiki Y would be a reasonable source of micro-contributions. User input would be "yes, apply", "no, not relevant", or "don't apply, needs further editing" (which kicks it to a desktop editor to refine the translation).
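
The three possible answers in cscott's note map naturally onto queues, in the sense TheDJ described earlier. The sketch below is hypothetical: the queue names, the `Answer` enum and the `triage` function are invented for illustration and do not correspond to any existing MediaWiki API.

```python
from collections import deque
from enum import Enum


class Answer(Enum):
    APPLY = "yes, apply"
    IRRELEVANT = "no, not relevant"
    NEEDS_EDITING = "don't apply, needs further editing"


# Hypothetical queue names; each answer sends the suggestion to one of them.
ROUTES = {
    Answer.APPLY: "apply",
    Answer.IRRELEVANT: "discard",
    Answer.NEEDS_EDITING: "desktop_editor",
}


def triage(suggestion, answer, queues):
    """Route one suggested edit into the queue matching the reviewer's answer."""
    target = ROUTES[answer]
    queues[target].append(suggestion)
    return target


queues = {name: deque() for name in ROUTES.values()}
triage("Suggested edit for wiki Y", Answer.NEEDS_EDITING, queues)
```

The reviewer only ever picks one of three answers; anything that needs real editing lands in a separate queue that a desktop editor can work through later.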

[Tpt, unspoken] Could be efficient for patrolling: display or speak the diff and ask to tag it as good or bad. Could be also used to curate set of suggested new statements for Wikidata (e.g. from the Primary Sources tool).

11:10 [10min] Break

11:20 - 12:50 Part 2

11:20 [15min] Distributing the effort

  • of distribution of content
  • of types of content
  • building a web of knowledge, without doing everything ourselves.
  • ORES / machine learning

JanD: Subprojects on Wikidata. There seem to be museums and GLAM institutions that import data. Importing data on scientific documents. Those subcommunities might drag in people who are not primarily interested in the Wikimedia project, but see it as a thing that has their own GLAM/scientist interface to it, and reuse the data from that. Could be an interesting opportunity, already partly happening.

CScott: I went to an annotation conference last year. A lot of interest in these communities in annotating content. For citations we're already collaborating. Museum image data we could potentially reuse. The semantic web could be a valuable target for cross-reference outside of our project, have us be the canonical ground truth for semantic terms.

Mingli: The main result of ML we use, making them public to use by other people. We use WP data to make a classifier for content. Maybe if we have a similar service in WP, we can make that data public for others to use. Publishing the trained model for others to use.

  • Aaron: Generally that's a way of operating in the last 3 years, we publish models, training data, info about how we got those models. Same thing about embeddings [janD unspoken: What are embeddings?! Answer, see below ↓], very recently we published the first version of a click stream embedding that shows relationships between pages in embedded space by tracking the way people click. Now building embeddings from proximity of terms in searches. Working on producing not just one iteration of these but turning them into a productionized process.

JanD: Strategically, there should be shared maps of interested parties similar to this one:

  • [Aaron, unspoken]: Embeddings are hard to describe in non-mathematical terms. But essentially, it's a strategy for compressing information (signal) that is very useful for machine learning and other AI problems. For example, word2vec builds a signal vector based on the proximity of words in a corpus. This meaning embedding lets you do things like "king - man + woman" and arrive at "queen". We use these for topic detection in ORES.
  • [JanD, unspoken]
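
Aaron's "king - man + woman" example can be tried with a toy illustration. The vectors below are hand-made assumptions with three invented dimensions (royalty, male, female); real embeddings such as word2vec are learned from a corpus and have hundreds of dimensions. This is not how ORES is implemented, only a sketch of the vector arithmetic.

```python
import math

# Hand-made toy vectors: (royalty, male, female). Purely illustrative.
VECTORS = {
    "king": (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man": (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "apple": (0.1, 0.1, 0.1),
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c):
    """Return the word closest to a - b + c, skipping the three input words."""
    target = tuple(x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c]))
    candidates = (w for w in VECTORS if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


print(analogy("king", "man", "woman"))  # queen
```

Subtracting "man" removes the male component and adding "woman" adds the female one, so the result lands on the vector for "queen".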

TheDJ: What's being talked about here and what has been expressed in the strategy session is very hard, I think it's almost unachievable with the structure we have as an organization. Yesterday someone suggested something interesting and achievable: instead of distributing the wiki as a BitTorrent distribution, making the exchange of articles between wikis a lot easier, push something to another wiki within the network of wikis, with all the standards of revisions and verification of information, using that to distribute. Using a wider array of stakeholders to create and exchange content, I think that's very much achievable. Us creating standards about what a wiki article does, how you exchange information in a wiki way, how you connect into the network for knowledge wikis.

[JanD, unspoken: Is there an example for something like this being tried? Is it like MS Windows OLE?]

Tpt: Agree, Wikidata is a good example. Let's say life science community wants to upload a billion items, we can't take that. Having a federated system where they can have their own wiki with this data, then move some of it over to Wikidata or have Wikidata link to it.

  • TheDJ: That's also my concern, what are small achievable steps here that make sense in this distributed web of knowledge, rather than saying distributing is good and we just have to figure it out.

CScott: In the 3rd party wikis session, one of the goals was a beautiful install that was already linked up to our wikis, using InstantCommons etc. The idea being that by not providing best practice defaults to MW installs, we're losing influence to guide these projects and how they structure their content. All the talk about global templates and global gadgets is related to this effort as well. Making good tools for other communities, making it easy for them to point to ours, we can put ourselves at the center of the spider web.

Mingli: Was curious why nobody mentioned blockchain. In China people ask about this all the time, why not here?

  • CScott: We had this conversation wrt IPFS as well. One issue is that the blockchain is immutable, while we have requirements around coverage of living persons where things need to be removed sometimes. If your content is immutable, that's a fundamental conflict. We need to make stuff go away sometimes.
  • Tpt: Definition of consensus on WP is between contributors, definition of consensus in blockchain is owners of big computers. Probably not the same people. If we say there's consensus when enough computers agree on the version, you're giving an advantage to a small set of people.

11:35 [30min] Collaboration

TheDJ: One of the things that keeps being identified as hard to work together on. Lots of suggestions for improvements like more structured discussions to make them easier to use and more powerful, collaborative editing like etherpad/Google Docs and then save as a revision (has some social and copyright issues which makes it a bit difficult).


  • Many people come from cultures with oral history (Wikiculture currently being text based)
  • Problems of how to archive … [sorry, notes fell behind]
  • [JanD unspoken: TheDJ: Can you supplement?]

TheDJ: Five things that should include collaborative experience that we should do in the next 5 years.

Matanya: Will focus on structured discussions. I see large potential in its role on the wikis. That it was only developed to 70% is a problem. If we could fix the most important issues: history, search, a watchlist, notification problems, [CScott: readable URLs!] that would be great.

TheDJ: Would this help us much in the next few years (over the current talk pages)?

Matanya: [???] reducing the frustration and higher levels of collaboration answer that question, I think.

CScott: Alternative suggestion: Embedding discussions in the page; that could be like comments (?). The talk page could be taking these discussions and putting them on a part of the talk page, and another part could still be free form [JanD unspoken: Screenshot?]

SJ: Collaborative spreadsheets. I spend more time this year working w/ others on spreadsheets than on wikis. [Now that we're talking about ways to make wikitext editing more aware of structured data, it's really time to solve sheets as well.]

  • [JanD unspoken: Context?]
    • [^^ SJ: Google Sheets is the most useful single collaborative tool I use, and I hate that there's no free alternative. In addition to adding data to plaintext, w/ wikitables, structured data, wikidata, slooooowly allowing data formats to be posted to Commons -- we should also consider a tool that's designed entirely for data, and allowing plaintext inside it.
    • @Jan, my use case: when working with a number of people to a) gather data b) gather citations c) organize lists, esp. ones suitable for a [wiki]table, or d) lay out details related to many parts of a whole: a spreadsheet is the best simple collaboration tool. This is in addition to their original purpose, tabular data where you want to compute some cells based on the contents of others. They are highly customizable: when working collaboratively on anything from analyzing poetry to solving crosswords or visual puzzles.]
    • Ex: Current shared spreadsheets are all vulnerable to trolls and spam. They lack the thin layer of edit-review, diff-review, recentchanges and oversight that allow wikis to be so open.

Leila: structured discussions would be great for research: Harassment identification etc. Cornell & Jigsaw: we are trying to make sense of how to parse data out of discussion & usertalk pages... [ subbu unspoken qn: is there something that can be done from the wikitext parsing perspective to simplify this process of extracting information from a talk page? ] Research is not scalable this way. We need more structure on those pages!

Adam: A couple of things that need to happen before we can go down this path. We need to talk to people. Our Community Liaisons and product managers are talking about this. It is largely a non-technical question.

DJ: What if we would (mentally) throw away all the concepts around talk pages? Starting with the concept. I now would create forums. They are better for discussions than Talkpages. Talkpages are a black hole, too few participants. Discussions die here. Sometimes it is also helpful. A talkpage close to an article is a collaboration space. Sandboxes, metadata… if we consider this and would throw away functionality it could give us a better impression of what needs to be done. The talkpage was an accidental concept not well structured…

Aaron: Banning vandals paper: people don't directly collaborate with each other. When I think about collaborations that scale well I think about indirect communication. Structured discussions work on personal conversations. But a lot of stuff is leaving tokens of information or tasks that others can build upon. This scaled better than direct-communication collaboration.

Roan: Not going to talk about structured discussions. One thing in the next 5 years: collaborate on the same document. There are lots of issues around multi-author'd revisions. Maybe some of the things that don't belong on a talk page can be in in-context comments. Google docs has suggested edits. [JanD unspoken: and all those are not new and since 20y in office applications]

CScott: +100 on collab editing. I've been talking about this for five years now, I think all issues people have raised have been solved or have some amelioration step, I don't think there's anything blocking us except resources. Important for Wikis as a living place for people. There are actual humans behind the wiki. Things like collab editing allow (opt-in) real-time views of the editing in progress, so as you look at the page you can see the people behind it, improving it.

[Tim unspoken]: In response to DJ "what would we do if we had a blank slate": comments attached to a particular point in an article. Like code review comments or Google Doc comments. Show them in VE

  • [James unspoken] – Yeah, but we've talked about this for five years and haven't come up with a way to do it in wikitext neatly. Doing it in visual mode only would split the editing communities which would be bad (especially, a JS-only form of contribution is particularly problematic), but making wikitext unnavigable would be terrible too. Suggestions welcome. :-(
    • [Jan unspoken]: Yeah, comment that marks some text seems to be tricky to do. Tried to implement it myself. On the other hand: A single point comment would work too, could parse to
  • CSA: ^ how to do it: then use those annotations in the "new wikitext editor" as well as Visual Editor -- plus, if you embed them in the talk page you can just ignore the whole "inline" aspect and create/edit comments based only on the talk page-based UX. (Selecting a particular anchor point for a new comment w/o using one of the editors is "fun" but still possible. Responding to an existing anchored comment should be completely straightforward to do from the talk page.)
  • subbu unspoken: the stumbling block here is the requirement of not dissociating comments on edits from where they were .. annotation based solutions cannot guarantee this .. so, this needs a product understanding that comments could move to places where they weren't
  • [cscott] you don't need any guarantee: look at how google docs (eg) handles it. You can have a comment on some text which has since been deleted, the comment is still there at an approximate location with an appropriate snippet of the context-as-it-was. Porting annotations forward is an orthogonal issue that (IMO) can be handled at the core infrastructure level; once that functionality is implemented, every other thing which could use annotations (the translate extension, etc) benefits.
  • subbu: but that is because that is associated with a DOM as far as i know - you are trying to infer position in a text document which is flat. [csa: i'd propose anchoring on the DOM, and using parsoid to translate to a wikitext location as needed. but you can anchor on raw wikitext as well, you just do character-based position updates. if you were anchored at position 6 and positions [3-8] were deleted, your anchor is *displayed* at 3 (or 8). but fundamentally the anchor is "location X in revision Y" and the mechanism to find an appropriate display position in the current revision is orthogonal.]
  • subbu: let us take it offline :) but, tim below is proposing a different thing for text.
  • Tim: re non-js interface: can add links to create a comment to the diff view, just with per-line granularity. Finer granularity would only be offered in JS. Replies are easy enough in non-JS. [csa: +1. making an anchor in non-JS just has to be possible, not necessarily beautiful. You could also open a specific "select anchor" page which made every *word* into a clickable anchor point, if you wanted to go overboard on anchor granularity. to subbu's point, you could either record this anchor as a wikitext position or use parsoid to translate it into an equivalent DOM anchor before storing it.]
  • Tim: re carrying forward annotations across edits, this becomes conceptually easier with diffs. If a diff moves a word, you can move comments attached to that word. If annotated text is deleted, just delete the annotation. Like Google Docs, comments on deleted text are available in the list of all comments (talk page for us). [subbu]: move detection is the problem .. if your diff algo treats it as a delete + add, you lose your comments. [Tim] it becomes a problem of improving the diff algorithm, rather than improving comment attachment logic. [subbu]: understood, that is what i meant earlier about annotations / diff not being able to guarantee that comments are not lost/moved around .. so that is a product qn. if that is an acceptable loss of fidelity for this feature/product.
  • Tim: Does this have a task in phabricator? I see but that is generic. captures my generic problem statement about "stable ids" / "annotations" to attach information to a text format like wikitext.
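
A minimal sketch of the character-based anchor update discussed above, assuming an anchor is stored as a plain character offset into the old revision's text. The function name `carry_anchor` is hypothetical, and Python's standard `difflib` stands in for whatever diff algorithm MediaWiki would actually use.

```python
import difflib


def carry_anchor(pos, old, new):
    """Map character offset `pos` in `old` to a display offset in `new`."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if i1 <= pos < i2:
            if tag == "equal":
                # The anchored text survived: keep its offset within the block.
                return j1 + (pos - i1)
            # The anchored text was deleted or replaced: snap to where the
            # change starts in the new text.
            return j1
    return len(new)  # anchor sat at the very end of the old text


old = "abcdefghijkl"
new = "abcjkl"  # positions 3-8 ("defghi") deleted
print(carry_anchor(6, old, new))   # 3
print(carry_anchor(10, old, new))  # 4
```

This reproduces cscott's example: an anchor at position 6 inside the deleted range 3-8 is displayed at 3. It also illustrates subbu's concern: `difflib` does not detect moves, so moved text shows up as a delete plus an insert and the anchor snaps to the deletion point instead of following the text.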

Subbu: Point to a Q on etherpad in response to Leila's comment about Cornell & Jigsaw. In parsoid, we tried to expose structured information. Given that we have talk pages and all those old revisions that aren't going to be structured discussions, is there something that we can do with regards to the markup we generate so that research and other projects don't have to parse wikitext to get info? [ unspoken: need an understanding of use cases ]

Dan A: Echo SJ. Spreadsheets are really useful for structured discussions --even in text heavy project. It speaks to that the whole world uses that. People need to be able to do things that are more complex than you can easily do on a talk page. Thinking about filter-bubbles and recent US election -- we're not really talking online properly. E.g. tried to get to consensus through a structured way with a UX that was structured for that problem. We can probably get further out of the box than forums. We're talking about 2030. Let's be more aspirational.

[?] like: …

[S: There is also an entire technical subcommunity who love spreadsheets and designing them, dating back decades. If we decided to tackle this problem we'd draw in some of those as new collaborators.]

Ariel: I can see that there might be a queue of people who are ready to hop on real time editing to help new editors (or anyone else running into trouble). I can also see people being very aggressive and acting in a negative way. We could provide a real person to help in real time. "Would you like to chat with someone online right now?" It would be a revolutionary thing.

Jan D: These are also social problems. People build things around the things that are there and culture flows from that. E.g. talk pages have a whole culture around it. It's beautiful that culture is not easy to change. We don't have a good way to move forward. "You've built a great culture around [talk pages], but how do we move forward?" How can we get to a mutual conclusion of moving forward technologically. This is not a fun thing. It feels like we've barged into someone's living room [and told them what to do]. hard to say, "We really need to renovate the room [innovate here] -- how do we do that together?"

TheDJ: Give those tools to people who really need it. Or apply it to areas where we aren't doing anything yet -- start from scratch. Then move to other places.

12:00 [30min] New types of content, new types of projects

  • How to give space for those who want to experiment with new types of content and new types of projects, without having it weigh down the rest of the operation
  • maps/graphs/video [JanD unspoken: tabular data]

TheDJ: This was not in the position paper but obvious in the strategy sessions: they have trouble finding places for their contents. It has to do with oral history and how to share it with everyone. That is accessible to everyone that can be vetted as information… Wikis are very popular in very particular subsects (Lego, Star Trek whatever). There are a lot of things out there we do not do anything with. There is concern that launching new projects/products is so slow that it is next to impossible. Anything outside of Wikivoyage/Wikidata is basically impossible. What are the quick wins? I believe that it should be much easier to create a new wiki. There is currently no way to annotate maps. I need to get consent from a Wikimedia committee to deploy anything. Operations finds it difficult to get it started etc.

JanD: Two things: relatively easy to put content on Commons. Seems to be a standard around doing this. This might be a way to build forward on this existing platform. One difficulty there is a cultural assumption that wikis work like Wikipedia and that it needs to be a wiki. But it might not be possible. E.g. Wikiversity adds educational content on top of a wiki. Why is there no learning management system? Wikis are not a golden hammer that is helpful to hit things with.

Tpt: I agree. We have a lot of sister projects: Wikiversity, Wikibooks, Wikisource... After more than 15 years of creation of wikitext, we still have image descriptions in wikitext. We should first move MediaWiki from an encyclopedia platform to a content platform before supporting any new kind of content.

Jan D: Are there projects that are successful doing these hard things (like sharing oral history)?

Adam B: Heard a podcast about a school where kids interview their grandparents. [ subbu unspoken: i think i heard about it on npr too ] Adam B: The thing I was talking about is Storycorps

Volker: I don't, but I've been in a non-profit 2005-2009 that tried to collect audio and “oral history” on technological topics. Major problems were indexability and inner-content linkability. You have to have a large extra step of work to make that happen.

Jan D: A lot of what we do on the web is deeply ingrained as text-based. Other things? In publications you have a list of Figures. You would never have a list of text there. Text is just standard. Maybe like, e.g., male as a "standard gender" as a cultural concept. Anything else is "the special case". Thinking about the deaf community: they seem to have a non-text culture. There are no widely spread transcript formats. Getting these things in the general text-based culture is super difficult, it is ingrained into the web, but also our general culture. It treats non text: images, designs, etc. as second-class citizens. (Which you already note as a designer working at Wikimedia if you want to show your design). See also:

[SJ: Video + audio, please! Short snips, short recordings. Plenty of widely used toolchains are just focused on these; we simply need to have a place that honors these contributions and features them -- for instance, handling transcoding regardless of submitted format so the stored format is a free codec ;) And working w/ other tech communities that are based in/focused on audio or video]

CScott: There was a session at Wikimania 2016 about beautiful article layout where active editors proposed some cool designs for articles that were focused on an image or video instead of on text. Eg:

Also, my candidate for "new media": Images! The way we lay out images in our articles is basic and clunky. E.g. we could better integrate images into the text with improved semantic data and in different formats. E.g. figuring out the lead image for mobile is crazy complicated. More:

TheDJ: If we can't do this for images, then we have already lost the game.

Ariel: Issue of moving from prototype to production. I understand that with a prototype we do not care about performance. If you want to get resources you need a few people at the Foundation. [?]

Mingli: I just want to add a comment on the lead image in mobile. We have a problem: for some pictures of people we should give the focus to the face.

12:15 [30 min] Everyone take <some> minutes to create your personal top 5 of what we need to take on first, based on what we discussed today

TheDJ: This is complex and we discussed a lot. Thinking about how to wrap up. Usually chaotic. Give everyone a little bit of time to create a list of 5 things that should be done in the next 5 years. Everyone create your own list. My five:

  • Build a queue and ticket/item system for micro-contributions, and first use it to select the page image/banner for Wikivoyage/images for mobile
  • Take the collaborative editing experience and roll it out to e.g. Wikinews where they need immediacy and collaboration, low risk area to try it out. Unavoidable over 15 years, will be the norm for any form of content
  • Annotations, very complex and lots of use cases, not sure what use case to start with; maybe for comments on the text for collaborative feedback instead of HTML comments shouting at each other.
    csa: there's also some nice work on semantic image annotations which could be a first-product candidate:
  • Do more research about what our process currently is (what makes a talk page work) before building a new solution

JanD: All non solution-y

  • Culture: Considering our cultural assumptions around Truth™, Text as standard medium, Wikis as the standard tool for online collaboration → reflective discourse
  • Social: getting a way to fade out/change/move things further with the community, even if it means change → participatory design (?)
  • Management: Let's get better in creating shared understanding: Provide Links to sources, define terms, make graspable representations that people can discuss. We seem to prefer things that are vague rather than wrong in many cases (F. Brooks): Try to reduce the fear of being wrong in the organisation.


  • Workflow primitives. E.g. queues/repositories are useful for creating lists of tasks, revisions/pages that need review, etc. [JanD comment: Amir and me visited Mozilla no such topics… might be extended…]
  • Structured discussions. They are already pretty good but integration is bad. Let's get 'em integrated. [JanD: Integration to what?]
  • Realtime help for newcomers. Click a button. Get some help and emotional support.
  • ML for supporting workflows. We're already doing this but there's going to be a lot more to do in the next 5 years. E.g. sock puppet detection, paid editor detection, etc.
  • Microcontribs for supporting workflows. We're not really doing this. Microcontribs needs some workflow primitives before they will be easy to develop. I think it can fit like ML does.


  • Doing our homework and widening our horizon: Whatever solutions we come up with, take into account our cultural and personal bias and make them as low-hurdle for users regardless of their background/language/physical abilities/younameit – excelling current stack


  • Edits as a first-class citizen object in MW, so that we can attach conversations to them.
  • Inline annotations for discussion/concern about paragraphs/images
  • Real-time collaborative editing
  • Structured discussions
  • Workflow management, empowering communities with tools to run their own workflows (NOT just a top-down workflow they have to use)


  • Microcontribution workflow system working cross projects and UI in mobile, app and experiments with assistants
  • Keep moving MediaWiki from a wikitext platform to a content platform.
  • Improve the structured discussion platform
  • Beginning of wiki federation with Wikibase
  • Move metadata/infobox/categories/... outside of Wikitext


  1. "Ticket system for staged actions", enables users to contribute in ways beyond just "edit" e.g. by proposing authorised actions. The execution of these staged actions are in turn possible micro contributions for administrators.
  2. "Community dashboard", to discover ways and areas to contribute. Leading into dedicated workflows (not full edit workflow).
  3. "Infobox", as first-class primitive across wikis. Allows presentation layer to extract and display differently. Insertable by VisualEditor without pre-knowledge of template names. [ subbu: but why limit it to just infobox? see my typed templates proposal ]
  4. Provide a lightweight way to edit metadata without the full editor (e.g. adding categories). Possibly enabled by storing categories as JSON in MCR (with back-compat interface for unified wikitext).
  5. Live chat.


  1. Live chat. Even structured discussions suck if you just need help and might leave if you don't get it right now.
  2. Structured discussions.
  3. Workflows for the former.
  4. Continue making editing less frightening technically.

C. Scott:

  1. Real-Time Collaboration (eg, teahouse mentoring)
  2. Improved Fork/Merge tools (eg, no-revert-on-first-edit, better edit-conflict resolution, improved draft process, etc)
  3. Inline comments, integrated with existing talk pages
  4. Improved cross-language collaboration (eg talk pages with machine translation where possible), microcontribution of translated edits
  5. 3rd party wikis with "best practices" integration with WMF (global templates, wikidata semantic items, etc) to allow WMF to be the center of the free knowledge web
  6. (sorry!) queues to manage/merge offline/disconnected editors


  • Real-time collaboration, starting in a mentoring or wiki project context, with integrated chat


  1. Secure agreement with Google to provide feedback from embedded Wikipedia content in search results back to Wikipedia. This could be fed into the discussion page and experienced editors could help update the page with the suggested feedback.
  2. Add UI in Wikipedia app map section so people can edit directly from the app. Right now they have to go to the wiki page and try to make sense of wikitext including all kinds of templates.
  3. Implement a UI like how Google Maps asks yes/no questions about specific things, but geared toward knowledge gaps and conflict resolution in Wikipedia. Add this into the Wikipedia app with subtle and infrequent notification prompts soliciting user interaction.
  4. Build a method for any web page to embed a Wikipedia article snippet with intuitive UI enticing user feedback/contribution to improve this article. Just as people can embed a Flickr album or a YouTube video into a post on a forum, enable them to bring Wikipedia to their online "home field", while at the same time encouraging them to contribute.
  5. Using WikiBlame, allow users to thank contributors for individual sections of a wiki page. Use these metrics to show each user the power of their contributions. Also use this data to analyze the potential bias in each article section based on the diversity of contributors.

[random notes at bottom]

List of technologies:

List of products: Microcontributions

  • Select the page header
  • Report vandalism
  • Mix n match
  • Collaborative editor