Extension talk:Graph/Plans

From mediawiki.org

Replacement for piecharts

Hi. I've created a replacement for piecharts. It doesn't require any JS and it should be accessible (as in a11y), so maybe even better than Vega graphs/charts. I've published my module on English and Polish Wikipedia. Feel free to copy it to other wikis.

As a side note: If you want to translate examples then ChatGPT should be able to help... or at least it worked well for PL→EN translation (very few changes were needed).

See also more info and examples on the Village pump (miscellaneous).

Cheers, Nux (talk) 17:12, 6 January 2024 (UTC)

Awesome work! I've used a different pie chart for now and yours looks way better. Is it possible to make it a donut chart as well? Rasputin 93 (talk) 17:20, 6 January 2024 (UTC)Reply
Thanks. Not sure if I would be able to do donuts in a stable way with CSS (it would be much easier with SVG, but that is sadly not possible on MediaWiki). Do you have an example of where you would need a donut instead of a pie? Nux (talk) 23:05, 6 January 2024 (UTC)
I currently use this one:
https://rammwiki.net/wiki/Module:Graph
https://rammwiki.net/wiki/Template:Graph:PieChart
I just prefer the donut look when charting the songs played within a setlist by how many come from each release.
Like here: https://rammwiki.net/wiki/05.08.2023_(concert)#Setlist
Also, is there a way to show only values without percentages? Rasputin 93 (talk) 04:06, 7 January 2024 (UTC)Reply
The automatic scaling is great. A promising sign for piechart supremacy. Thanks Nux :) Sj (talk) 17:17, 24 January 2024 (UTC)Reply
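For readers wondering how a pie chart can be drawn without JS or SVG, here is a minimal sketch. It assumes the CSS `conic-gradient()` function is used as a background; this is only one possible technique and is not necessarily how the module discussed above works. The function names are hypothetical. Raw values are scaled to percentages first, so they do not need to add up to 100.

```python
def pie_slices(values):
    """Scale raw values to percentages of their total."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return [100.0 * v / total for v in values]


def conic_gradient(values, colors):
    """Build a CSS conic-gradient() value with one hard-edged stop pair per slice."""
    if len(values) != len(colors):
        raise ValueError("need exactly one color per value")
    stops = []
    start = 0.0
    for pct, color in zip(pie_slices(values), colors):
        end = start + pct
        # "color start% end%" paints a solid wedge between the two angles
        stops.append(f"{color} {start:.2f}% {end:.2f}%")
        start = end
    return "conic-gradient(" + ", ".join(stops) + ")"


print(conic_gradient([1, 1, 2], ["red", "green", "blue"]))
```

The resulting string would go into a `background` declaration on an element with `border-radius: 50%`. Whether a given wiki's style sanitizer accepts it is a separate question.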

OSM Location map using CSS

I was inspired by @Nux's use of CSS graphics to see if that could be a solution for OSM Location map. (A 'rat deserting the sinking ship'?) The result is a non-Vega version which is now at the 'close to complete' stage at https://en.wikipedia.org/wiki/Template:OSM_Location_map/sandbox . It jumps through some very ungainly hoops, as it uses the Maplink overlay, but only seems to work if an en:overlay template also adds an invisible square. By re-using the Mercator calculations developed to get Vega 5 working, it can add inline CSS graphics and text instructions on top of the map. (Betraying my ignorance, I had no idea CSS could be used like this.) So far as I can tell, it appears to have a lower performance hit than Vega did.

There is a selection of examples at https://en.wikipedia.org/wiki/Template:OSM_Location_map/examples which also showcase some new features not possible with the old graph template. Any thoughts on the stability, performance, sustainability, portability and 'security safety' of this approach would be welcome. So far it only does 10 map-items. I am doing a few more compatibility/bug-find tests with existing map examples, and all being well will then ramp it up to the original 60 and go live in the next few days. RobinLeicester (talk) 12:13, 6 March 2024 (UTC)
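The kind of Mercator calculation mentioned above can be sketched as follows. This is an illustration only, assuming the standard Web Mercator projection with 256 px tiles; the function names and the frame size are hypothetical, and the template's actual arithmetic may differ.

```python
import math


def mercator_pixel(lat_deg, lon_deg, zoom, tile_size=256):
    """Project WGS84 latitude/longitude to Web Mercator world pixels at a zoom level."""
    scale = tile_size * 2 ** zoom
    x = (lon_deg + 180.0) / 360.0 * scale
    lat_rad = math.radians(lat_deg)
    # asinh(tan(lat)) is the Mercator y in radians; flip so y grows downwards
    y = (1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * scale
    return x, y


def css_offset(lat, lon, center_lat, center_lon, zoom, width, height):
    """Return (left, top) in px for a marker inside a width x height map frame
    whose centre shows (center_lat, center_lon)."""
    cx, cy = mercator_pixel(center_lat, center_lon, zoom)
    px, py = mercator_pixel(lat, lon, zoom)
    return px - cx + width / 2, py - cy + height / 2


left, top = css_offset(52.70, -1.13, 52.63, -1.13, 12, 400, 300)
print(f"position:absolute; left:{left:.1f}px; top:{top:.1f}px")
```

A point north of the map centre gets a `top` value smaller than half the frame height, because pixel y grows downwards.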

Nice work, though certain things sometimes aren't centered and the top-right button has a weird blue link overlaid with it. In the case of the photo-panel demo, going fullscreen takes you to null island. Aaron Liu (talk) 12:42, 6 March 2024 (UTC)Reply

Signpost article summarizing the developments of the last 11 months

I wrote an article for the Signpost trying to provide an overview of the discussions and attempted solutions for this issue since it occurred in April 2023, and to give a sense of where things might stand currently: w:Wikipedia:Wikipedia Signpost/2024-03-29/Technology report

Regards, HaeB (talk) 06:46, 31 March 2024 (UTC)

Update April 2024 (discussion)

Hello everyone -- I posted an update with a proposal for moving forward with graphs here on the project page. I posted on the project page (instead of the talk page) so that it can be marked for translation to other languages. This is a thread to discuss, so please join in the discussion!

Also pinging some of the people who have been active in the discussions on graphs so far (but everyone else is also encouraged to speak up!) @Sj @TheDJ @Levivich @Bawolff @Theklan @Aaron Liu @Snævar @Nux @HaeB @Iniquity @Strainu @John Broughton MMiller (WMF) (talk) 23:04, 10 April 2024 (UTC)Reply

I very much like the proposed approach - the initial efforts will generate at least 80% of the value (the less complex graphs that the vast majority of editors might create, or have created) with the proverbial 20% of the effort that would be required to offer a solution that covers a much wider range of functionality. I definitely agree that "the larger topic of interactive content is worthy of separate, continued conversations moving forward". John Broughton (talk) 01:34, 11 April 2024 (UTC)Reply
Thanks for the update. The proposed approach of server-rendering the content and serving it as a static image sounds similar to how SVGs, LilyPond and LaTeX are handled today. I am wondering whether this essentially boils down to implementing phab:T334372? I'm no expert with SVG but I believe that's the format usually used for static graphs. Adding the ability to server-render SVG markup provided as an inline parser tag (or as a Commons data page) and then providing Lua libraries to easily author modules for turning graph data to SVG would make for a very extensible solution. SD0001 (talk) 04:50, 11 April 2024 (UTC)Reply
@SD0001 That’s an interesting idea, thanks for sharing that task, and yes I believe we would be rendering the static graphs as SVGs. My understanding is that there are still security concerns regarding inlining SVGs, but I see some workarounds are proposed in the comments. With this new graph extension, our goal is to have an editor-friendly graph definition interface that can be easily shared across projects. Another advantage would be that the code would be testable and reviewable. Essentially, if you wanted to add a new visualization type, you could do that by contributing to the extension rather than writing a new Lua module. Does that approach also seem extensible enough? CCiufo-WMF (talk) 23:10, 17 April 2024 (UTC)Reply
It should be noted that there is nothing preventing testing/reviewing in lua. It is somewhat less likely due to technical and political factors, but plenty of on wiki modules have test cases and get changes reviewed. Similarly, the original graph extension was developed by WMF, but i wouldn't say the test/review process was all that it should have been. I agree that sharing of lua between wikis is a major issue with on-wiki templates. Bawolff (talk) 19:06, 18 April 2024 (UTC)Reply
Oppose. It solves the issue neither in time nor in purpose. I'm not going to say that this is disappointing, because I didn't expect too much. But definitely a bad move (not even a solution). Theklan (talk) 06:10, 11 April 2024 (UTC)
Thanks for the update. I'm going to trust the process that got you to this conclusion and focus on the practicalities:
  • what I'll miss if not implemented: several data series on a single graph. Beyond that, the basic graphs you describe are OK for known usages on my wiki.
  • what I would like to know: a timeline for re-activating the feature.
Strainu (talk) 09:34, 11 April 2024 (UTC)
If I read it correctly, the proposal dismisses the possibility of having graphs from Wikidata data. In Basque Wikipedia we have around 35K articles loading interactive graphs from population data that comes from Wikidata. Theklan (talk) 12:24, 11 April 2024 (UTC)
For population data, Lua + inline data input for graphs (which is promised) should be okay. What’s lost is the ability to use SPARQL, but that’s not needed for population data: hu:Modul:Népességdiagram reads data from Wikidata and displays a graph from it (previously using Extension:Graph, now using CSS trickery) without a single line of SPARQL, only using Wikidata’s Lua interface. —Tacsipacsi (talk) 01:28, 12 April 2024 (UTC)Reply
I appreciate the update. Static images are the worst solution, as you cannot hover to see hidden labels (it's not possible to have a compact image with all labels in large charts).
Not sure where you got this piece of information:
"and we tried wrapping the Vega canvas in a sandboxed iframe (which caused significant performance issues)"
This is not true. Caching is possible (even in Chrome). It's not trivial, but also not that hard. If you would dismiss iframe for accessibility reasons I would understand, but also static images are not accessible at all (not that you can really make a generic solution to make graphs accessible)...
Please talk to devs, I think there is some misunderstanding around actual iframe problems. Nux (talk) 11:02, 11 April 2024 (UTC)Reply
I want to clarify that static images are the worst long-term solution. Wikipedia should, long term, be more interactive, not less. Static images would be OK as a temporary workaround. Nux (talk) 11:08, 11 April 2024 (UTC)
With iframes we have been there, done that, as per phab:T169027. Read it and move on. Snævar (talk) 12:08, 11 April 2024 (UTC)Reply
@Snævar I know that task, but this is not the correct one. Also a lot has changed since 2017. In 2017 loading jQuery from common CDN still made sense and that was before Spectre...
But even though caching changed a lot - as I said - loading in an iframe works fine.
I tested this myself, and as long as you set up the iframe correctly it is both fast and secure. Nux (talk) 08:50, 12 April 2024 (UTC)
I think it is a bit ambitious. Whether this will work will really only be known once appropriate graphics software is found. Personally I would put a caveat on all of this, pending said software, just to avoid setting expectations that may or may not be met. I like that the proposal admits this will take months; it is the truth anyway.
I have heard some people being excited for having more functionality than has been used for Graph, but in the end, this is fine for WMF wikis. Snævar (talk) 12:07, 11 April 2024 (UTC)Reply
I suggest visiting Our World In Data's climate change page to see why this approach is not good now, and would have been out of date even five years ago. The future goes in another direction; spending another year on a patch that will add little value is a waste of time, money and effort. Theklan (talk) 12:31, 11 April 2024 (UTC)
  1. Since many past graphs won't survive the transition, and we are now committing to definitely, positively, not resurrecting existing Graph tooling: please render static snapshots of all of the past/current graphs.
  2. If starting from scratch: implementing the full OWID library seems a good idea. They have an active community of development, practice, and knowledge suitable for our projects; they have a best-in-class approach to making data visible alongside a visual; we have an active community of crossover use trying keenly to incorporate that knowledge.
  3. As interim ways to explore future interactivity: enable task T303853 to see what happens; other wikis could have a server-rendered implementation of OWID, with a link that takes you to a Toolforge exploration of that same graph to get a custom interactive view of the data, and a way from that tool to render a new server-side image + template text to embed that. Sj (talk) 14:03, 11 April 2024 (UTC)
    We have deployed the OWID gadget at Basque Wikipedia, and the result is impressive. We have added in one day interactive graphs to 24 articles (eu:Kategoria:Our World in Data grafikoak dituzten artikuluak). This is a really powerful software piece, and having the ability to reuse it with other data sources would be a huge step forward. Now, we need to figure out how to translate the software itself (and also the data pieces). Theklan (talk) 19:26, 17 April 2024 (UTC)Reply
    Thanks for the suggestions @Sj, I’ve tried to address some of these points in my general comment. Regarding what to do about existing graphs in the meantime, one of the options we’ve considered is to render static snapshots. One unknown here is what to do if the graph definitions are changed, since we wouldn’t be able to update the rendered image. I’m also not sure we’d be able to render all the existing graphs as static images. Assuming we were able to though, would you expect the images to be inserted alongside the existing graph definitions instead of replacing them? What do you think should happen with the existing error messages communicating that graphs are unavailable? CCiufo-WMF (talk) 23:17, 17 April 2024 (UTC)Reply
Hello CCiufo, I expect the static render would be from the last point in time that had a graph definition, as of the render. Not updated after that. I would expect the static images to replace the graph definitions, with the full image description (on Commons) and possibly the caption (where appropriate) linking to the replacing diff. That would highlight in diff view the graph definition that was used. The existing error messages are an eyesore and should go away once the graph definition is replaced by an image. Sj (talk) 03:36, 18 April 2024 (UTC)
Thanks @Sj, I appreciate the clarification and suggestions. I’ll include a section about this option when I update Extension:Graph/Plans, after we’ve had a chance to evaluate the technical implications in more detail. I’m assuming different wikis may want to handle this in different ways and even within a given wiki, I think getting consensus about doing this type of mass-update will be tricky and might distract from focusing on the development of the replacement extension. Do you get the sense that there already is consensus to do this on some projects? CCiufo-WMF (talk) 20:54, 25 April 2024 (UTC)Reply
[The following is a bit rambly and incoherent. Sorry] The most recent update is a bit vague to be honest. It would perhaps be helpful to include some anti-goals (what do we not want to do), as well as maybe some persona-style use case discussion. Then again perhaps this is not the right venue for that. Ultimately though I feel like I'm pretty unclear on what the solution will look like just based on this. It's also still a bit unclear to me what subset of the "graph" problem we are trying to solve. Maybe giving some concrete examples of use cases would help. Some questions to think about:
  • This will be static images only:
    • I assume we aren't going to just resurrect graphoid and call it a day. While I personally agree with that decision, i think the rationale for not doing that should be fleshed out.
    • It feels like we are talking about interactivity as a binary: it's either just an image, or we do full-blown Turing-complete scriptability. I think it would be useful here to distinguish levels of interactivity and the various use cases they have:
      • Just an <img> tag, like an uploaded file.
      • Links that can be clicked on (e.g. Easytimeline)
      • Links, tooltips and hover effects (Essentially what you can do with wikitext using CSS pseudo-selectors :hover, :active, etc. HTML title attribute)
      • declarative animations. Like CSS animations (@keyframes and friends) or SMIL used in SVG
      • Full blown scripting.
    • It seems like we are going with just an <img> tag. I'm not sure that is sufficient. It seems like one of the main selling points of old graphs was the ability to do hover effects: mouse over the bar in the bar graph and it shows more data about that data point, and that sort of thing. I've also seen drill-down effects be very effectively used on graphs elsewhere on the internet (e.g. flame graphs)
  • What actually is the value proposition of graphs (in general)? The way I see it, it's a combination of the following things:
    • Easy to edit histories integrated like normal page edits so users can easily track changes
    • Automatic Ingestion of dynamic data sources (e.g. page views. WDQS). [This however is problematic in its own way]
    • semi-interactive displays that look nicer than static uploaded images [Even if not fully interactive, things like tooltips and highlighting the line you are hovering on]
    • Separation of presentation & formatting concerns (You can put the data in the data namespace on commons, and have the code separate. Unlike an uploaded svg file, where editing it might be complicated because everything is mixed together).
  • I can't help but wonder - perhaps there is no need to do anything special here. Why not just whitelist in wikitext a subset of useful safe SVG tags, allow template styles to do CSS animations [already allowed], allow lua to format data sources, and call it a day? (Like what SD0001 is suggesting) We are basically already there, and it seems like this would already go beyond what is being proposed here. The only additional things to do would be to make JsonConfig data namespace suck less, and optionally allow lua to fetch dynamic data sources (if we decided we actually wanted that).
Bawolff (talk) 20:47, 11 April 2024 (UTC)
To allow more interactivity with CSS we should solve this: task T360725 -Theklan (talk) 06:05, 12 April 2024 (UTC)Reply
While that would be nice, i don't really think that's critical for graph-like usecases you would expect to find on wikipedia. Bawolff (talk) 19:13, 12 April 2024 (UTC)Reply
You are right, it's not critical, but having an up-to-date CSS would be interesting if we want to achieve some kind of modern view, which this proposal dismisses. Theklan (talk) 19:22, 12 April 2024 (UTC)
@Bawolff I like the way you’ve broken down the different problems the legacy extension + templates were trying to solve, and highlighted the nuance of interactivity. I think we can borrow a lot of that to better communicate what we’re trying to achieve with this new extension. I’ve attempted to clarify things in my general comment, but when I update Extension:Graph/Plans, would you mind if I pinged you to see if the goals, scope, and audience are clear?
Regarding SVGs + Lua, please see my reply to SD0001. In short: there are advantages to going the extension route, but I’m happy to continue discussing this. FWIW, I don’t think the proposal to create a new extension excludes the possibility of phab:T334372 happening in the future. Maybe a combination of both will cover a wider net of use cases over time. CCiufo-WMF (talk) 23:15, 17 April 2024 (UTC)Reply
Yes, feel free to ping me whenever you want. I agree that the two approaches are orthogonal and we can potentially pursue both. I personally think things where we put as much as possible into the hands of the user work best. There are always a lot of barriers once stuff is being done in Gerrit. Sometimes that is necessary, but i think the lesson of Lua is that giving users the low-level tools directly and ways to abstract on top of them is very powerful. We just get so much more creativity when users are allowed to experiment freely. Bawolff (talk) 18:52, 18 April 2024 (UTC)
I second John Broughton's overall optimism about the proposal. I've been missing the graphs over at enwiki this past year, and the proposed solution would replace most of what I found most editors (even if clearly not all) to use the Graph module for. Tserton (talk) 10:34, 13 April 2024 (UTC)Reply

Hey everyone, thanks for taking the time to read the proposal and continuing to provide your thoughts here. As Marshall mentioned, I’ll be leading this effort and I’m eager to work with you all to help shape the future of graphs in Wikimedia projects. I’m glad to hear some of you believe we’re on the right track with this.

As a first step, I’ve distilled some of the common questions I’m seeing in the conversation so far and have provided answers to them below. For some of the more specific questions, I’ll be replying directly.

Why only images? What about interactive graphs?

The primary motivation for sticking with static visualizations is to get to a working graphs solution as soon as possible. We want to be realistic about what we can deliver in the next fiscal year and we are confident that serving images will be performant and secure.

I know that not all existing use cases will be covered by this solution at first. @Bawolff has done a great job explaining that “interactivity” is a spectrum, and for now we are targeting the simple side. That’s why we plan on designing a system that leaves the door open for adding what we’ve been calling “light interactivity”. This could include things such as hovering over charts and maps.

Why not enable use of inline SVGs or some other visualization library directly?

We intentionally want to provide our own graph definition interface (i.e. how you would actually specify a graph in wikitext), for a few reasons:

  1. We learned from our security team when trying to re-enable the legacy extension that the ability to directly access the underlying visualization library can be fundamentally insecure, so we won’t be repeating this pattern going forward.
  2. Creating our own graph definition interface at the extension level will provide stability and freedom to upgrade or even change the underlying library in the future without having to worry about breaking existing graphs. For instance, if we were to start out using Library A, but then years later discover a major problem with Library A, we would want to be able to switch over to Library B without having to rebuild the whole extension. This was a challenge when we tried to upgrade from Vega 2 to Vega 5. The syntax had changed and therefore required a migration of the graph definitions too.
  3. Changes to graph definitions and supported graph types then become observable and testable through standard software development processes in a way not possible currently with templates and modules. Essentially, if you wanted to add a new visualization type, you could do that by contributing code to the extension where knowledgeable volunteer developers or WMF staff could review it. This invites those with the necessary technical expertise to extend functionality in a safe and scalable way.
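The second point above, keeping graph definitions independent of the rendering library, can be illustrated with a small sketch. Everything in it is hypothetical: the `GraphDefinition` fields, the two toy backends standing in for "Library A" and "Library B", and the `render` function are assumptions made for illustration and not the extension's actual design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GraphDefinition:
    """Library-neutral description of a graph, as an editor might specify it."""
    kind: str
    labels: List[str]
    values: List[float]


SUPPORTED_KINDS = {"bar"}


def validate(graph: GraphDefinition) -> None:
    """Reject definitions the interface does not support, before any library sees them."""
    if graph.kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported graph kind: {graph.kind}")
    if len(graph.labels) != len(graph.values):
        raise ValueError("labels and values must have the same length")


def backend_a(graph: GraphDefinition) -> str:
    """Toy stand-in for 'Library A': draws each bar as a <rect>."""
    bars = "".join(
        f'<rect x="{i * 20}" y="{100 - v}" width="15" height="{v}"/>'
        for i, v in enumerate(graph.values)
    )
    return f'<svg xmlns="http://www.w3.org/2000/svg">{bars}</svg>'


def backend_b(graph: GraphDefinition) -> str:
    """Toy stand-in for 'Library B': draws each bar as a thick <line>."""
    bars = "".join(
        f'<line x1="{i * 20}" y1="100" x2="{i * 20}" y2="{100 - v}" stroke="black" stroke-width="15"/>'
        for i, v in enumerate(graph.values)
    )
    return f'<svg xmlns="http://www.w3.org/2000/svg">{bars}</svg>'


BACKENDS: Dict[str, Callable[[GraphDefinition], str]] = {"a": backend_a, "b": backend_b}


def render(graph: GraphDefinition, backend: str = "a") -> str:
    """Validate the definition, then hand it to whichever backend is configured."""
    validate(graph)
    return BACKENDS[backend](graph)


population = GraphDefinition(kind="bar", labels=["2000", "2010"], values=[10, 40])
print(render(population, backend="a"))
print(render(population, backend="b"))
```

The same `population` definition renders under either backend, so swapping the library would not require editing stored definitions. That is the property the Vega 2 to Vega 5 upgrade lacked.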

What will this actually look like? Which visualization types are we thinking about?

We’re thinking that it would be most important to address the use cases previously covered by Graph:Chart and related templates like Graph:Lines, but without the ability to make MediaWiki API calls or SPARQL queries to Wikidata Query Service for now. The reason I haven’t specifically called out which types of visualizations we want to pursue first is because this is a key area we’re looking to get your input on. Without looking at every single graph definition, it’s hard to know for sure what would be most useful. We’ve been looking at template usage across all projects to help identify important use cases, but are there other factors you think we should be considering? Are these the most important graphs for readers? For editors?

For some use cases we don’t think we’re likely to support, like rendering pageview data, could it make sense to use existing tools like https://pageviews.wmcloud.org/pageviews/ instead? Let me know what you think!

What is the plan?

The rough plan is as follows:

  1. Before the new fiscal year starts in July 2024, we hope to select a visualization library, finalize staffing, and work with you all to identify the initial graph types we want to support and define what the interface could look like. This will all be included in a refreshed version of Extension:Graph/Plans.
  2. When the team starts work in July, set up the infrastructure needed to render a single graph type. In the interest of transparency: it’s hard to estimate how long this work will take before we have a clearer idea of what the system architecture will be, but we will keep the project page updated with our best estimates.
  3. Pick a graph type and prototype it with community members to finalize the graph definition interface.
  4. Once the graph definition interface and design are settled, test it and make it available in production.
  5. Repeat steps 3-4 for subsequent graph types identified in step 1.

At some point we’ll also decommission the legacy graph extension, but I’m not sure when it would make sense to do that yet. What’s also missing in this plan is at what point existing graph definitions are migrated to use the new extension. We really can’t do this part without your help, even if we find ways to automate some of it. I’m thinking it makes sense to do it iteratively as each new visualization type becomes available (step 4), instead of a mass migration effort at the end. Let me know if you have suggestions about this.

I’ll share more information about where we are in the process and communicate any decisions we make as early as I can. Like I mentioned in the rough plan above, we’ll be seeking your input specifically on which visualization types are most important to start with and what the graph definition interface should look like. For now, I’d like to continue the conversation here. Thanks again for taking the time to help us think through this! CCiufo-WMF (talk) 23:06, 17 April 2024 (UTC)Reply

So, as I understand, your answer is that you don't care about what we discussed above, and you will be doing what you thought first. Why are you and the team asking for feedback if the purpose of the feedback is to dismiss it? Don't waste our time, if there's no point on doing that. Theklan (talk) 09:06, 18 April 2024 (UTC)Reply
Disagree. Talking to the volunteers is always a good thing, even if WMF chooses to go in a different direction. –Novem Linguae (talk) 09:30, 18 April 2024 (UTC)Reply
I appreciate providing a concrete plan for how the work will be done going forward. I do hope that as part of this, some time will be spent gathering user stories for more ordinary Wikipedia editors. I'm a bit worried that people participating in this discussion might not be representative of Wikipedians writing articles (This is always true, but i feel like it might be even more true in the Graph discussions than most discussions, where a lot of people have an image of an ideal future in their mind of interactive content that might be disconnected from the needs of the moment. Myself especially included in that). p.s. As a small nitpick, I am unsure what you mean by "decommissioning" the legacy graph extension. It has already been decommissioned for about a year now. Bawolff (talk) 19:01, 18 April 2024 (UTC)
You raise an important point about who we design for. This is always a challenge when we build things at WMF and I’m going to be explicit about the audiences we’re focusing on for this new extension as part of outlining the project scope in more detail on Extension:Graph/Plans. As you’ve pointed out already, keeping focused allows us to avoid building for such a wide range of users and use cases that we end up not solving for any of them at all. It would be great to chat with more editors who’ve used the legacy extension (either directly or through templates) but aren’t active in these discussions. If you have ideas about the best way to identify and engage such editors, I’m all ears!
Re: “decommissioning”, I meant formally deprecating it and pointing people to the new extension or other alternatives. I think there’s a lot of documentation / phab task cleanup needed here to communicate that the legacy extension is not coming back. CCiufo-WMF (talk) 20:56, 25 April 2024 (UTC)Reply
Hmm... Well, if there were a server-side rendered SVG with labels shown on hover (and tap), that might actually provide a decent user experience. I think stacked graphs might be something worth doing first. They were the hardest to replace (IIRC I degraded most of them to line charts, an unfortunate loss for readers). Stacked graphs are also probably a good starting point to solve some problems with how to define series and how to display a compound data point on hover. Nux (talk) 21:20, 18 April 2024 (UTC)
^timeline charts, not line charts. Nux (talk) 16:57, 19 April 2024 (UTC)
I can’t speak to exactly what SVG capabilities will be available yet, though that would be ideal. I think it’ll depend on the underlying visualization library we use to render the graphs. Regarding “stacked graphs”, do you mean a stacked bar chart, like this? CCiufo-WMF (talk) 20:57, 25 April 2024 (UTC)Reply
Yes, or stacked line charts, which should mostly have the same problems of displaying labels. Some utils draw a vertical line to better show which points are highlighted and then show labels for all of them. Nux (talk) 22:39, 25 April 2024 (UTC)Reply
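To make the "compound data point" problem concrete, here is a minimal sketch of how stacked series could be laid out and how one hover label could cover every series at the same x position. The data layout (a dict of series name to values) and the function names are assumptions for illustration, not a proposed graph definition format.

```python
def stack_series(series):
    """Turn {name: [values]} into {name: [(bottom, top), ...]} stacked in insertion order."""
    if not series:
        return {}
    length = len(next(iter(series.values())))
    running = [0.0] * length  # current top of the stack at each x position
    bands = {}
    for name, values in series.items():
        if len(values) != length:
            raise ValueError(f"series {name!r} has a different length")
        band = []
        for i, v in enumerate(values):
            band.append((running[i], running[i] + v))
            running[i] += v
        bands[name] = band
    return bands


def hover_label(series, x_labels, index):
    """Build one label listing every series value at the hovered x position, plus the total."""
    parts = [f"{name}: {values[index]}" for name, values in series.items()]
    total = sum(values[index] for values in series.values())
    return f"{x_labels[index]} | " + ", ".join(parts) + f" | total: {total}"


data = {"A": [1, 2], "B": [3, 4]}
print(stack_series(data)["B"])
print(hover_label(data, ["2020", "2021"], 1))
```

Each band's bottom is the previous band's top, which is what a renderer needs to draw either stacked bars or stacked areas. The label shows all series at once, matching the vertical-line style of tooltip described above.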
On the subject of page views. I do not think the toolforge tool is a suitable replacement. That said, i don't think its that important a usecase. However since we already have stuff to get the data on wiki, i think we should just expose it to lua. I filed phab:T362937 for that. At some point, we may want to replace the ?action=info code in extension:PageViewInfo to use the new system whenever it exists. Bawolff (talk) 21:59, 18 April 2024 (UTC)Reply
I agree that porting Extension:PageViewInfo to the new system would make sense instead of introducing a graph type specific to pageview info, for the reasons you mention. CCiufo-WMF (talk) 20:58, 25 April 2024 (UTC)Reply
Could there be a stopgap measure whilst the new extension is developed? Such as:
  • Displaying the information in a table (simple, only use for low data simple graphs such as bar, line or pie, have a cap on number of "rows", maybe 20-30, this could also be an expando)
  • Implement a solution to have a user click on a link which takes them to an external/internal service which will graph the information on a separate page. There are open source services for this which wouldn't take *too* long to first audit security and code wise, then set up either internally or externally hosted. A simple example showing it's technically possible: Quickchart.io, which just takes the information in the URL and gives you a chart.
  • I'm sure this has probably already been discussed, but could you statically render the existing graphs offline (to mitigate security issues) then show them on the pages with a note "This graph has not changed since 19 April 2023 and is unable to be changed due to security issues, so information might be outdated". And just bar any changes to existing graphs. I'm not sure how technically involved this would be and if it would be worth it versus how long the new extension will take. For example, if it's going to take 1 year and this stopgap would take 1 month, it'd be worth it, but if this stopgap would take 6 months and the dev time would be 1 year, it wouldn't in my opinion.
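The first option in the list above, showing the data as a capped table, could look roughly like the following sketch. The function name, the default cap of 30 rows and the choice to raise an error past the cap are assumptions for illustration; the output uses ordinary wikitable markup, with the `mw-collapsible` classes for the "expando" variant.

```python
def to_wikitable(headers, rows, max_rows=30, collapsed=False):
    """Render small graph data as wikitext table markup, refusing oversized inputs."""
    if len(rows) > max_rows:
        raise ValueError(f"{len(rows)} rows exceeds the cap of {max_rows}")
    css = "wikitable mw-collapsible mw-collapsed" if collapsed else "wikitable"
    lines = [f'{{| class="{css}"', "! " + " !! ".join(headers)]
    for row in rows:
        lines.append("|-")
        lines.append("| " + " || ".join(str(cell) for cell in row))
    lines.append("|}")
    return "\n".join(lines)


print(to_wikitable(["Year", "Population"], [[2000, 100], [2010, 120]]))
```

In practice this logic would more likely live in an on-wiki Lua module than in Python; the sketch only shows the shape of the transformation.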
Building a whole new extension is quite obviously going to take some time, not starting until July, and as of today it has been 1 year since Graph was disabled. Information is not going to be available for potentially years after it last was. I feel as if there should be some way to at least see the underlying data whilst the new extension is developed, even if not visually. MarkiPoli (talk) 07:27, 19 April 2024 (UTC)Reply
The community could probably do a table of the data. Like I covered in #Stats, around half of the pages that use graphs on en.wp (your main wiki) are using Module:Graph, so it should go there. It is something you should discuss on en.wp.
Making a tool that displays old graphs on wmflabs.org is technically possible, but we are not going to show old graphs on Wikipedia. There is too much information on the security risks of the old graph system (Vega 2); it is just out of the question. A user could create this tool on wmflabs.org.
Taking time from devs to do a stopgap is going to delay it: looking at other WMF projects, by at least 3 months, possibly 6. I am primarily looking at small and medium sized Wikipedias, some of which have limited technical knowledge. They are not going to put up a stopgap, but just wait for the main one, and waiting longer does not make sense to them. Snævar (talk) 08:52, 19 April 2024 (UTC)
We’re supportive of finding intermediate/stopgap solutions where there’s a clear need and buy-in to actually implement them. I provided some more context in my reply to Sj, but yes delaying work on the new graph extension is one of the concerns. CCiufo-WMF (talk) 21:00, 25 April 2024 (UTC)Reply
I would say the order of graph types is Lines, stacked bars, pie. My section at #Stats explains why. Probably normal bars need to go in-between lines and stacked bars, since stacked bars depend on bars. Any precise timeline of how long it takes the community to convert graphs is dependent on the graph definition. As a refresher, I did say moving from Vega2 to Vega5 would take 3 months, and this move is probably going to take at least that, likely more. Snævar (talk) 08:58, 19 April 2024 (UTC)Reply
Those stats might be a bit misleading. It's been a year and communities have probably decided to move on. I know plwiki did.
  • I created the piechart module which might even be better than Vega in some cases and I think all piecharts on plwiki are now replaced. Ported this to enwiki too.
  • Timeline can act as a replacement for bar-charts. Those are static images so they get crowded quite fast... But it is possible and I've done that for population charts, I'm guessing enwiki did too.
  • Line-charts can be replaced with bar-charts if you can remove some data points (e.g. show 5 year periods instead of showing progress every year).
  • Stacking: you could do that with timeline, but the calculations are messy; maybe some communities found ways to craft that. I think I just separated most of the stacked charts on plwiki.
I also remember removing some charts because there were just too many data points and the charts wouldn't be readable as a static image... So yeah, I think those stats are probably off by quite a lot. They can be helpful, but don't read them too literally. Nux (talk) 16:16, 19 April 2024 (UTC)Reply
(sarcasm)Right(sarcasm ends). You have not seen phab:T137291, where EasyTimeline would be removed. It is up to each community what they do; just because one does one thing does not mean that all of them need to follow suit. EasyTimeline handles big numbers poorly; they need to be scaled for it to work at all. Both D3 and OWID are more feature rich than EasyTimeline, or I should say Ploticus, the software behind it. Ploticus is not being developed and EasyTimeline gets minimal attention, just so it is known. I am not looking for suggestions for solutions at all. You are welcome to update those stats, I will not do it. Snævar (talk) 23:29, 19 April 2024 (UTC)Reply
Yes, timeline was worse, that I actually agree with 😉. On plwiki we did migrate to Graph... And then migrated back. I'm just saying most uses of timeline were probably Graph/Vega uses a year ago. Hence those stats need to be taken with a grain of salt. Nux (talk) 11:39, 20 April 2024 (UTC)Reply
Thanks for the suggestions on which graph types to start with @Snævar, that’s really helpful! Regarding converting legacy graphs, can you elaborate more on what your concerns are about why it would take longer? Do you mean that if the new definition is quite different from Vega’s, it’ll just take longer because each graph will need more attention to rework? CCiufo-WMF (talk) 21:01, 25 April 2024 (UTC)Reply
It just feels like nothing has moved on in the year since the vuln was found. I mean, from what I read, the idea for most of that time was to update Graphs to use a newer version of Vega but now it's "replace Graphs"? Sounds like scope creep to me. Eilidhmax (talk) 13:31, 19 April 2024 (UTC)Reply
Essentially they looked into it, and came to the conclusion that updating to a newer version of vega would not satisfactorily fix the problem. Bawolff (talk) 20:26, 21 April 2024 (UTC)Reply
  • As someone who works professionally in data visualisation, I'd be very sad to lose the Vega interface, which (via inspiration from Wilkinson's grammar of graphics and Wickham's R implementation ggplot2) is the product of a great deal of thought and refinement by statisticians. Without at all meaning to denigrate the WMF tech team (it's just a difficult problem), I don't think that they can realistically develop anything that is anywhere near as good in-house. The update mentions rendering server-side would avoid known or substantial security risks, such as those in the legacy Graphs extension – so why not just use server-side Vega? Joe Roe (talk) 16:04, 21 April 2024 (UTC)Reply
    Statisticians by and large aren't the people editing articles, and theirs is not the only use case a graph extension needs to meet to be good. Vega could be great for data professionals and still a bad fit for us. I think this is demonstrated by how, for all Vega's power, essentially none of that was used in all the years it was actually enabled. Almost all previous uses were of one or two basic types. If the power of Vega was actually useful to Wikipedia's use case, someone presumably would have used it in the last decade. The fact nobody did suggests it was not useful in practice. Bawolff (talk) 20:24, 21 April 2024 (UTC)Reply
    We did at Basque Wikipedia using interactive climate graphs. So someone used it. The fact that we didn't use it more is also related to having Vega 2 instead of Vega 5, with higher capacities. That said, the sentence "could be great for data professionals and still a bad fit for us" needs a definition of "us". Because, as far as I know, "us" should be the central repository of free knowledge. Theklan (talk) 06:16, 22 April 2024 (UTC)Reply
    Sure, that is a reasonable definition of "us", although I would probably just simplify "us" to mean "Wikipedia" (sorry sister projects). I think it's good to define use cases by what we really need, since if we try to do everything we end up not being able to do anything well. Bawolff (talk) 06:28, 22 April 2024 (UTC)Reply
    There are two problems with that definition. First, we are not "Wikipedia". Second, the discussion here is not about what we need, but about what we currently can do, which is the worst scenario for strategic thinking. Our needs can't be constrained by what our engineers could solve within a fiscal year. Theklan (talk) 07:22, 22 April 2024 (UTC)Reply
    The two aren't disconnected though. The riskiest part of any software project is a disconnect between what users need and what developers think users want. It's one of the main reasons software projects fail across the industry. The lack of a coherent vision around graphs between different stakeholders makes this an incredibly risky project. The risk reduces the amount of things that the foundation can do in a fiscal year. We could do a lot more if we all figured out what we actually want/need. Bawolff (talk) 19:14, 22 April 2024 (UTC)Reply
    The main risk is having something unusable for one full year, and then bringing back something that doesn't solve the issue and makes it even more difficult to improve things in the future. It is going backward, not just trying to be where we were. The idea of doing things that "fit in a fiscal year" is also the worst possible way of thinking. We can't afford a Foundation that only thinks about things that are solvable in a 9-month window. That's completely destructive, contrary to any strategic thinking, and the way projects die. Theklan (talk) 06:40, 23 April 2024 (UTC)Reply
    I'm not sure WMF said they were stopping at the fiscal year - they just wanted to have a first version out by that time and thought that would be a good milestone to plan to. Presumably after that point they would reassess, see if they are on track or need to change direction, and if further development is warranted. Planning development effort for like the next 5 years is generally a bad plan. It's hard to predict the future and it's better to plan in short intervals (Agile!) so you can adjust to changing circumstances. That doesn't mean once the plan ends you are done. Regardless, timeboxing is a very common risk mitigation strategy to deal with uncertainty in software development amidst competing goals and would totally be reasonable here. I think you have the cause and effect backwards here. Limiting the planning to 9 months isn't the risk - it is what one does when nobody can agree on what should be done and the foundation needs to limit the risk of an open-ended project that might never complete and might never make anyone happy. Smaller projects limit the risk of going too far down the wrong road before course correction can take place. Like you said earlier - it has been a year since graphs have been removed. In that year all anyone has done on the community side is talk past each other on what is needed (see the wikimedia-l mailing list for example). It would be different if there was widespread agreement among the community about what is needed along with evidence that such a solution would be applicable to a large number of articles (say >1% overall or >5% of featured articles). But that is not what happened. I suspect that makes it hard to justify doing a really large multi-year project like you want. Bawolff (talk) 15:11, 23 April 2024 (UTC)Reply
    Users could talk among themselves and reach an agreement among themselves. Most of them can be asked the question of what is useful, but not what software to use. That is the problem: the questions in September or August were becoming increasingly software related, so there was no answer. The percentage of graphs on English Wikipedia would not reach 1% of all articles. On English Wikipedia, there are 18k pages with Vega graphs, some of which are on talk pages. If I were asked whether I care about a feature that affects less than 1% of articles, I would say no. So any statistical post I have made is intentionally in numbers, not percentages, because that is a better selling (convincing) point. Percentage of wikis with more than 10 graphs would also be a high number.
    In terms of finding those counts, we do not have the tools for it. Even with this graph feature, which is on less than one percent of articles, I was close to hitting the limits of Global search (global-search.toolforge.org). Really, the only way to get results for all graphs is to visit the search of every project, and there are hundreds of them. So I am forced to take a subset of graphs and make stats out of those. Snævar (talk) 18:47, 23 April 2024 (UTC)Reply
    A proper search could search for these, in the existing Vega graphs:
    line graph (either linear or point type):
    scales[0].type = linear
    scales[1].type = linear
    scales[0].type = point
    scales[1].type = point
    bar chart:
    marks[0].type = rect
    pie chart:
    marks[0].type = arc Snævar (talk) 05:37, 26 April 2024 (UTC)Reply
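
The criteria listed above could be sketched roughly as follows, assuming each graph definition has already been extracted from the page as Vega JSON. The function name `classify_vega_spec` is made up for illustration, and two choices here are assumptions rather than part of the list: the mark checks run before the scale check (a bar chart may also have a linear scale), and "line graph" is read as "both of the first two scales are linear or point".

```python
import json


def classify_vega_spec(spec):
    """Guess the chart type of a parsed Vega spec (a dict).

    Returns 'pie', 'bar', 'line', or 'unknown'. Hypothetical helper:
    it only looks at marks[0].type and the first two scale types.
    """
    marks = spec.get("marks") or []
    scales = spec.get("scales") or []

    # Mark-based rules first (assumption: they are more specific).
    first_mark = marks[0].get("type") if marks else None
    if first_mark == "arc":
        return "pie"
    if first_mark == "rect":
        return "bar"

    # Line graph: assumed to mean both of the first two scales
    # are of type 'linear' or 'point'.
    scale_types = [s.get("type") for s in scales[:2]]
    if len(scale_types) == 2 and all(t in ("linear", "point") for t in scale_types):
        return "line"

    return "unknown"


if __name__ == "__main__":
    # Made-up example spec, not taken from any real page.
    raw = '{"scales": [{"type": "linear"}, {"type": "linear"}], "marks": [{"type": "line"}]}'
    print(classify_vega_spec(json.loads(raw)))
```

Counting the returned labels over a sample of definitions would give per-type numbers; collecting the definitions from every wiki is the part that is still missing.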
Oppose The way to solve this is not to reinvent the wheel and build a new visualization tool. There is zero chance that a self-developed tool can be as good as existing tools that are well-maintained and have been developed for years. This proposal is actually kind of concerning for me, because it goes against any good practices in software development and gives me even less hope (if possible at this point) that the WMF knows what they are doing --Ita140188 (talk) 06:29, 25 April 2024 (UTC)Reply