Wikipedia Education Program/Brainstorming for RFC for rewrite

=General organization=

An initial issue is the basic organization of the software that will provide functionality similar to that of the current Education Program (EP) extension. Here's a possibility:


 * A component for visualizing streams of user edits.
 * A component for defining workflows, user groups and roles, via some sort of on-wiki schema.
 * A component that depends on the first two components and provides a workflow and UX tailored to the needs of the Education Program.

=Possible synergies=

Here are some WMF and MW endeavours that may have some synergies with this work:
 * Flow
 * Analytics and metrics
 * Wikimetrics
 * Limn
 * Proposed Workflow extension
 * Wndialog Wikinews extension
 * For a redesigned UX, the Sidebar template?

=Justification for a rewrite=


 * The current codebase does not use, but should use, ContentHandler.
 * Experience with the current UX has led us to conclude that substantial changes are desirable.
 * The extension is geared specifically to the needs of the Education Program. However, other activities of Wikipedia and its sister projects have similar needs.
 * The class structure and architecture of the current codebase are not ideal.

To address any one of these issues would require rewriting a substantial portion of the codebase. To address all of them through modifications to the codebase would probably be more work than rewriting from scratch.

We should emphasize that we learned a lot from our experience with the current EP extension, so it's an important antecedent of this new work.

=General strategy for a rewrite=

Here's the approach we propose:


 * Create general components that form the basis of related features needed by the Education Program and similar endeavors.
 * Meet the specific needs of the Education Program through a thin layer of customization on top of those general components.
 * Set small goals for minimum viable products that meet the needs of the Education Program and contribute to a new system to be organized as described above.
 * As much as possible, replace parts of the current EP extension gradually with new products as they are completed, while continuing to use the parts of the extension that we don't have replacements for.

Since the current extension will remain in production for some time, we'll have to divide resources between its upkeep and the creation of new software. Work on the current extension should be limited to urgent bugfixes, minor improvements and urgent features that we can port to the new software. We should avoid tasks that involve major changes to the current codebase.

=Research towards a general component for courses, outreach and projects=

This is the component that would provide the basis for replacing most current functionality other than the feed of student edits.

It seems this component might model processes in general (including, but not limited to, relatively straightforward workflows), goals and tasks of various sorts, roles, and associations among all those things and among users, articles and other types of content.
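As a rough illustration of what such a model might involve, here is a minimal Python sketch. Every name in it (`Role`, `Task`, `Process`, `assign`, `users_with_role`) is hypothetical and invented for this example; none of it comes from the current EP extension or from any existing MediaWiki code. It only shows one way that processes, tasks, roles, users and content could be associated with each other.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A named role, e.g. 'instructor' or 'student' (hypothetical names)."""
    name: str


@dataclass
class Task:
    """A unit of work within a process."""
    title: str
    done: bool = False


@dataclass
class Process:
    """A course, project or event, with its tasks, participants and content."""
    name: str
    tasks: list = field(default_factory=list)
    # Maps a username to the set of role names that user holds in this process.
    participants: dict = field(default_factory=dict)
    # Titles of articles or other content associated with this process.
    content: set = field(default_factory=set)

    def assign(self, username, role):
        """Give a user a role in this process; a user may hold several roles."""
        self.participants.setdefault(username, set()).add(role.name)

    def users_with_role(self, role):
        """Return the usernames holding the given role, sorted alphabetically."""
        return sorted(
            user for user, roles in self.participants.items()
            if role.name in roles
        )


# Example: a course with one instructor, two students and one article.
instructor = Role("instructor")
student = Role("student")
course = Process("Example course")
course.tasks.append(Task("Choose an article"))
course.assign("Alice", instructor)
course.assign("Bob", student)
course.assign("Carol", student)
course.content.add("Example article")
```

A real component would presumably store these associations on-wiki rather than in memory, but the same entities and relations would need to be represented in some form.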

If this is the case, then one type of software we could look to for inspiration is business process software. There's a lot of work in this field, including modeling languages and development methodologies. Even though Wikipedia and sister projects are not businesses, some bits of organization theory developed for businesses may be relevant for social movements and volunteer organizations.

Here's a tentative initial reading (or skimming) list:


 * Davis, G. F. et al., eds. (2005) Social Movements and Organization Theory. Cambridge, U.K.: Cambridge University Press.
 * vom Brocke, J. and M. Rosemann, eds. (2010) Handbook on Business Process Management 1: Introduction, Methods, and Information Systems. Berlin: Springer.
 * Becker, J. et al., eds. (2003) Process Management: A Guide for the Design of Business Processes. Berlin: Springer.
 * ter Hofstede, A. H. M. et al., eds. (2010) Modern Business Process Automation: YAWL and its Support Environment. Berlin: Springer.
 * Jeston, J. and J. Nelis (2006) Business Process Management: Practical Guidelines to Successful Implementations. Amsterdam: Elsevier.
 * Weske, M. (2007) Business Process Management: Concepts, Languages, Architectures. Berlin: Springer.
 * Smith, H. and P. Fingar (2003) Business Process Management: The Third Wave. Meghan-Kiffer Press.
 * Rummler, G. A. and A. P. Brache (2013) Improving Performance: How to Manage the White Space on the Organization Chart. Third Edition. San Francisco: Jossey-Bass.

=Notes on a visualization and edits feed component=

This is the component that would provide the basis for replacing the feed of student edits.

There are synergies with Wikimetrics, which also helps analyze the activities of cohorts of users. If we consider the possibility of analyses deeper than those available in the current EP extension, there are also synergies with Limn, which provides easy setup of data visualizations.

What might this component look like?


 * It might be, or eventually morph into, a general on-wiki visualization system with on-wiki configuration options for visualizing user edits, other user data, log entries, data about articles, data about files, or analyses of article content or of any other kind of content.
 * It might be able to overlay graphs of data about user activity (like bytes added, pages created, thanks given, posts to Flow or talk pages, mentions of users) with text and UI elements (including summaries of edits or edit sessions, links to give thanks, buttons for inline diffs).
 * It could provide several ways of defining cohorts, such as CSV files, username entry via a GUI, or the courses, projects or events that users have been associated with.
 * It should provide ways of viewing user activity after a course, project or event has finished, and of comparing the results of such endeavors.
 * It could be configurable by manually editing structured definitions and via a GUI.
 * It should provide a means of displaying visualizations and feeds without exposing users to a smörgåsbord of configuration options (perhaps via templates?).
 * It should interface with Wikimetrics and other standard WMF tools.
 * It should carefully load-balance and queue processing so as not to overload the cluster.
 * It could provide ways of re-using and sharing visualization and processing definitions, including partial definitions (i.e., a definition for just one aspect of a visualization or data transformation).
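Two of the ideas above, defining a cohort from a CSV file and re-using partial definitions, can be sketched briefly in Python. This is only an illustration under stated assumptions: the `username` column name, the function names `cohort_from_csv` and `merge_definitions`, and the keys used in the sample definitions are all made up for this example and do not correspond to Wikimetrics, Limn or any other existing tool.

```python
import csv
import io


def cohort_from_csv(text):
    """Build a cohort (list of unique usernames) from CSV text.

    Assumes a header row with a 'username' column; that column name is
    an assumption of this sketch. Blank and duplicate names are skipped.
    """
    reader = csv.DictReader(io.StringIO(text))
    seen = set()
    cohort = []
    for row in reader:
        name = (row.get("username") or "").strip()
        if name and name not in seen:
            seen.add(name)
            cohort.append(name)
    return cohort


def merge_definitions(base, partial):
    """Return a new definition with a partial definition layered on a base.

    Nested dictionaries are merged recursively; any other value in
    'partial' replaces the corresponding value in 'base'. Neither input
    is modified.
    """
    merged = dict(base)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_definitions(merged[key], value)
        else:
            merged[key] = value
    return merged


# Example: a shared base visualization, with only the interval overridden.
base_definition = {
    "metric": "bytes_added",
    "chart": {"type": "line", "interval": "day"},
}
weekly_override = {"chart": {"interval": "week"}}
weekly_definition = merge_definitions(base_definition, weekly_override)
cohort = cohort_from_csv("username\nAlice\nBob\nAlice\n")
```

Layering a partial definition over a base in this way would let a shared chart definition be kept in one place while individual courses or projects override only the aspects that differ.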

Stuff to check out:

 * Mike Bostock's proposal for encapsulating reusable charts
 * Rickshaw, a toolkit for interactive time series graphs
 * Vega, a declarative format for visualization designs