Analytics/Visualization, Reporting & Applications/GlobalDevFeedback

Laundry list of notes from my first full graph creation experience
Note: I was creating this graph: http://global-dev.wmflabs.org/graphs/ar_wp
 * Automatic sorting would be great (e.g., I was making a stacked line chart, and I want the smallest to go on the bottom, but I had to do it all by eye/hand for ~20 rows)
 * Coloring of lines: if you add too many fields into the graph, the color swatch does not reloop, it just starts making the lines/segments black! (note: if I clicked the color swatch, it would then apply a color, but if I did not, it would display as black)
 * Automatic saving: yes, I learned the hard way that this wasn't like Google Docs :)
 * Confirmation of successful "save": it would also be great to get a little confirmation at the top if my saving attempts are successful
 * Significant figures: I can't find a way to change the number of significant figures being displayed (e.g., to show just "300" instead of "300.0", since the number of editors is always a whole number)
 * Similarly, once the numbers get above 1000, they start to round ("1.2K"). Is there a way to make this optional, and/or eliminate it? For the smaller numbers our languages are working with, the exact number matters a lot (e.g., if the number of active editors is 1549 in July vs 1551 in August, which currently would just report as "1.5K" and "1.6K").

Overall, I'm really glad I have something to log feedback about - thank you for building this :) Jwild (talk) 21:05, 16 August 2012 (UTC)
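
The sorting, coloring, and number-formatting requests in the list above can be sketched in a few lines. This is only an illustration of the requested behavior, not Limn's actual code: the palette values and the function names `order_for_stacking`, `assign_colors`, and `format_count` are all assumptions made for this example.

```python
from itertools import cycle

# Hypothetical palette; Limn's real swatch colors are not known here.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def order_for_stacking(series):
    """Return series names sorted by total value, smallest first.

    The first name in the result is the one drawn at the bottom of the stack.
    """
    return sorted(series, key=lambda name: sum(series[name]))


def assign_colors(names, palette=PALETTE):
    """Cycle through the palette so extra series reuse colors
    instead of falling back to black."""
    return dict(zip(names, cycle(palette)))


def format_count(value, abbreviate=False):
    """Format a count as an exact whole number by default.

    Abbreviation ("1.5K") is opt-in rather than automatic.
    """
    if abbreviate and abs(value) >= 1000:
        return f"{value / 1000:.1f}K"
    return str(round(value))
```

With abbreviation off, 1549 and 1551 stay distinguishable as "1549" and "1551"; with it on, they collapse to "1.5K" and "1.6K", which is the loss of precision described above.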

Some future visualization suggestions

 * 1) MAPS! We need to be able to look at saturation of editors by geography
 * 2) Bar charts (clustered, stacked)

Global Development Limn Use Cases
In order for Limn to replace current ad-hoc workflows for data analysis and visualization for Global Development projects, it needs to support the most common and important activities that those projects perform with data. To make sure Limn does what we need it to do, rather than trying to make it do everything, we will attempt to create strong, representative use cases and actionable recommendations for the Limn team.

Please feel free to contribute the following:
 * additional use cases
 * additional examples (links, scenarios, or thumbnail images) for existing use cases
 * edits to the existing use cases to improve their clarity, or make them more actionable
 * comments on the talk page

UC #1: Annotating visualizations
Currently, it is not possible to annotate a chart or graph directly in Limn. The ability to annotate charts is especially important when the charts are being used to communicate research findings to a broader audience, such as in a project status report or a presentation slide deck.
 * Examples
 * See this presentation on Arabic Wikipedia and this presentation on Portuguese Wikipedia for examples of Limn visualizations that have been exported to PNG and annotated in a presentation software tool. Supporting direct annotation of Limn charts would make it easier for project teams to have presentations and discussions around their data without exporting it from Limn.
 * Recommendations
 * Provide a palette of basic chart annotations such as callout (text) boxes, arrows, as well as brackets (i.e. {, [), highlight boxes and highlight shading for emphasis. Additional options may be added as the need arises.
 * Comments

UC #2: Grouping related data
Currently, it is not possible to visually 'group' similar data. Since Limn will frequently be used to compare and contrast cohorts of editors under different conditions (different edit counts, different wikis, etc.), it would be useful to let users show how different cohorts are related or different.
 * Examples
 * I want to visualize the number of scholarships (broken out into partial and full scholarships) across WMF, WMFUK and WMFR. I would like to be able to easily compare the relative proportion of full scholarships, and the total amount of money disbursed across the three different chapters.
 * Recommendations
 * Limn should support visualization of categorical data with stacked and/or clustered bar charts.
 * Comments
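
To make the stacked bar chart recommendation for UC #2 concrete, here is a minimal sketch of the data preparation such a chart needs. The group names and counts are placeholders invented for illustration, not real scholarship figures, and `stack_segments` and `share_of` are hypothetical helpers rather than part of Limn.

```python
# Placeholder data for illustration only; not real scholarship figures.
scholarships = {
    "Chapter A": {"full": 10, "partial": 30},
    "Chapter B": {"full": 5, "partial": 5},
}


def stack_segments(counts, categories):
    """Return (category, bottom, top) tuples for one stacked bar.

    Categories are stacked in the order given, so every bar uses
    the same ordering and the segments can be compared across groups.
    """
    segments = []
    bottom = 0
    for category in categories:
        height = counts.get(category, 0)
        segments.append((category, bottom, bottom + height))
        bottom += height
    return segments


def share_of(counts, category):
    """Return the proportion of one category within a group (0.0 if the group is empty)."""
    total = sum(counts.values())
    return counts.get(category, 0) / total if total else 0.0


for group, counts in scholarships.items():
    print(group, stack_segments(counts, ["full", "partial"]), share_of(counts, "full"))
```

A clustered bar chart would use the same per-group counts but place the categories side by side instead of accumulating the `bottom` offset.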

UC #3: Including rich contextual metadata and links to related resources
Limn currently supports creation of one caption per visualization. While this is a very useful feature, it is often desirable to make more textual information available to help readers interpret a visualization. Two types of information that can be especially useful for interpreting visualizations are contextual metadata (information about individual variables, axes or sample categories) and external links. For example, in the "Active Editors by Global North/South" chart on the Global Dev Dashboard, it would be useful to be able to easily find out which countries are included in the Global South sample. In this case, a short description of the sampling criteria could be made available as a tooltip when the user hovers over the words "Global South" in the key (contextual metadata), and a link to a wiki page containing a full list of Global South countries could be provided in the graph's caption.
 * Examples
 * Recommendations
 * In order to avoid cluttering the interface, provide a mechanism for adding short descriptive metadata in a way that is only visible when a user wants to see it. Tooltips are a common technique for providing such contextual information, but other techniques (accordion menus, lightboxes) may also be used.
 * To allow Limn visualizations to be connected with other information resources (such as study documentation pages), allow users to easily add hyperlinks to the chart caption field.
 * Comments

UC #4: Saving a version of a report card visualization
The ability to create Report Cards is one of the most powerful features of Limn. However, these data change over time. If the report card is set to provide a moving 'window' on a continuous dataset (such as edits over time), earlier data might even be pushed off the visualization completely as time marches on. What if project members want to capture the state of a visualization in a report card at a particular point in time in order to refer to it later? Currently, it seems as though exporting that visualization to an image format is the best way to accomplish this. But that renders the visualization static and severs its link to the source data, making it less useful as an analytic tool.
 * Examples
 * Many Global Development projects generate benchmark visualizations: graphs of a particular dataset at a particular time (like the beginning of the project, or for inclusion in an incremental status report). In such cases, it is useful to be able to maintain a stable, interactive visualization of that dataset separate from the regularly-updated version that exists within the report card.
 * Recommendations
 * Limn already appears to have some built-in version control mechanisms, and allows for the creation of one-off visualizations. Building a user-facing feature into the Report Card interface that allowed a visualization to be frozen and made available under a separate, unique URL along with its underlying data file would facilitate the use of Limn visualizations in maintaining a rich, interactive record of project benchmarks that could be re-visited and explored at a later date.
 * Comments
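
One possible shape for the "frozen" visualization recommended in UC #4 is sketched below. It assumes a chart can be described by a spec dictionary plus its data rows; the function `freeze_visualization`, the field names, and the `example.org` base URL are all hypothetical and do not reflect how Limn stores or versions graphs.

```python
import copy
import hashlib
import json


def freeze_visualization(spec, rows, base_url="https://example.org/snapshots/"):
    """Capture a chart spec together with its data under a stable identifier.

    The identifier is derived from the content, so freezing the same
    chart and data twice yields the same URL.
    """
    # Deep-copy so later updates to the live report card cannot alter the snapshot.
    payload = {"spec": copy.deepcopy(spec), "rows": copy.deepcopy(rows)}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return {"id": digest, "url": base_url + digest, "payload": payload}


# Hypothetical usage: the live data keeps changing, the snapshot does not.
live_rows = [["2012-07", 1549], ["2012-08", 1551]]
snapshot = freeze_visualization({"title": "Active editors"}, live_rows)
live_rows.append(["2012-09", 1600])
print(snapshot["url"], len(snapshot["payload"]["rows"]))
```

Because the snapshot keeps the data rows and not just a rendered image, it can still be explored interactively, which is the advantage over a PNG export described above.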

UC #5: Discussing an individual visualization
Currently, it is not possible to comment on a particular visualization in Limn. If a researcher wants to capture feedback or wider input on the data, they can export the visualization and upload it to another source (such as a wiki page). However, discussions around data can be less productive if the data isn't readily at hand. Therefore, it could be useful to provide a mechanism to allow users to comment on a visualization within Limn itself.
 * Examples
 * The comment thread on Wikimedia blog posts (like this one!) allows viewers to leave comments while looking at source data. Allowing similar threaded discussion around specific visualizations (in a report card or a benchmark chart) could lead to more focused discussions, grounded in empirical data rather than opinion.
 * A member of E3 is looking at a visualization of edit sessions by Teahouse visitors, generated by a member of a Global Dev project. They want to know if the Global Dev researchers used the same edit session criteria that they themselves do, to facilitate comparison with their own findings. An embedded commenting system would make it easier to track questions like this, and their answers.
 * Recommendations
 * More discussion about the purpose of Limn visualizations is needed before the value of this use case can be assessed. Some of the obvious trade-offs and dependencies are:
 * Should anyone be allowed to comment?
 * Should comments only be allowed on 'frozen' data samples (to avoid losing the context of the comment when a visualization is updated)?
 * How well will this scale to a large number of comments?
 * Comments