User talk:Johan (WMF)

Wishlist
Are you keeping a list of ideas for future wishlist discussions? Here's one from me that I believe is new:

I want a script or tool that tells me who wrote most of the content on a given revision of a page. I'm interested in figuring out who actually wrote visible sentences and paragraphs of content, not who formatted citations or inserted infoboxes (both of which can add a lot of bytes). It also needs to count actually creating the content, rather than, e.g., undoing page blankings (which show a large positive byte count but don't actually add any new content).

Unlike w:en:Wikipedia:WikiTrust, I don't care who wrote which specific words, although presumably you would need a similar mechanism to determine the contributions. Instead, I want a list of contributors from the most to the least, or perhaps a percentage for the first handful. This would be useful for compliance with the BY aspect of the license (if you copy it to some forms of media, you need to name the five most significant contributors) and also for statistical work on contributions (allowing us to separate "who wrote the most content" from "who did the most formatting or reverting"). Whatamidoing (WMF) (talk) 19:20, 11 January 2016 (UTC)
 * Not really, but since you've asked me twice now, I should probably get the hint and set one up.
 * I think this could build on research that User:EpochFail is looking/has looked at. /Johan (WMF) (talk) 07:46, 12 January 2016 (UTC)
 * Authorship is surprisingly difficult to track, since it requires substantial computation for large pages. However, I have performed the necessary computations on XML dumps using batch-style large-scale computing systems (e.g. en:Hadoop).  Once I finish my analysis work (see m:R:Measuring value-added), the next step is to try to implement it as a live system that synchronizes with a wiki via recent changes.  In the meantime, there are systems that generate stats like this on demand.  See http://people.aifb.kit.edu/ffl//whovisual/.  You'll be in for a long wait if you try to generate the authorship for the current version of en:Anarchism, but it's worth testing out on pages with less history.  --EpochFail (talk) 14:00, 12 January 2016 (UTC)
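To make the request above concrete, here is a minimal sketch of one way to rank contributors by how much of their text survives in the latest revision. It is not the WikiTrust algorithm, nor the method used in the dump analysis or the linked tool; it is an illustrative assumption built on Python's standard `difflib`. The function names `attribute_tokens` and `rank_authors` and the `(author, text)` input format are hypothetical. It counts every whitespace-separated token, so it does not yet separate prose from citations or infobox markup, and it only recognises reverts that restore an earlier revision exactly.

```python
from collections import Counter
from difflib import SequenceMatcher


def attribute_tokens(revisions):
    """Return [(token, author), ...] for the last revision.

    `revisions` is a chronological list of (author, text) pairs
    (a hypothetical input format chosen for this sketch).
    """
    seen = {}      # revision text -> attribution, used to recognise exact reverts
    current = []   # attribution of the previous revision
    for author, text in revisions:
        if text in seen:
            # Exact revert to an earlier state: restore the old attribution
            # instead of crediting the reverting editor.
            current = seen[text]
            continue
        tokens = text.split()
        old_tokens = [tok for tok, _ in current]
        matcher = SequenceMatcher(a=old_tokens, b=tokens, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged tokens keep their original author.
                updated.extend(current[i1:i2])
            elif tag in ("insert", "replace"):
                # New tokens are credited to the author of this revision.
                updated.extend((tok, author) for tok in tokens[j1:j2])
            # "delete": removed tokens simply drop out.
        current = updated
        seen[text] = current
    return current


def rank_authors(revisions):
    """Return [(author, share), ...] sorted from most to least surviving tokens."""
    counts = Counter(author for _, author in attribute_tokens(revisions))
    total = sum(counts.values())
    if total == 0:
        return []
    return [(author, n / total) for author, n in counts.most_common()]


history = [
    ("Alice", "the cat sat on the mat"),
    ("Vandal", ""),                         # page blanking
    ("Bob", "the cat sat on the mat"),      # revert of the blanking
    ("Carol", "the cat sat on the red mat today"),
]
print(rank_authors(history))
```

In this toy history Bob's revert earns no credit, because the restored text matches an earlier revision exactly. Diffing every revision pair is also where the cost for long page histories comes from.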

Away for a couple of weeks
I won't be around for a couple of weeks. I'll be back on March 26. If you want to reach me, just write here and I'll reply once I'm back. If you can't wait, you can contact the other liaisons. If it's about Tech News, write on m:Talk:Tech/News. /Johan (WMF) (talk) 20:56, 9 March 2018 (UTC)

News for the book editor ?
Hi,

Do you have news about the "Chromium based" book editor ? Simon Villeneuve (talk) 15:50, 5 April 2018 (UTC)
 * Hi! In short: It didn't work as we had hoped, which has caused unfortunate delays. I had hoped to be able to have a proper update last Monday; the new deadline is "hopefully this week but next at the latest". The reason the news drags out is that we have to make sure we don't say the wrong things about other people. (: /Johan (WMF) (talk) 15:52, 5 April 2018 (UTC)
 * Ok. Thank you ! Simon Villeneuve (talk) 16:24, 5 April 2018 (UTC)
 * Simon Villeneuve: The page has now been updated with what we see as the way forward, and a short explanation of what hasn't worked for us with the original plan. /Johan (WMF) (talk) 17:21, 9 April 2018 (UTC)
 * Ok. I'm really disappointed. I have been asking everywhere for years now for a good book renderer for the wikis of the Foundation. I'm a teacher, and a big part of our credibility problem in schools is the lack of good tools to easily create good printed documents. A lot of school teachers like to have paper in hand, and a pure player can't expect to have a good place in the classroom. I had been waiting since the beginning of the year to provide a .pdf of my French school book about Wikipedia in Education, and now I must tell my publishing house that we have waited for nothing. I also try to give decent printed lecture notes to my students in astronomy, but the .pdf of it has been like sh... for many years now. I understand it's not really your fault, but I wanted to let you know my frustration. Simon Villeneuve (talk) 17:38, 9 April 2018 (UTC)
 * Simon Villeneuve: I get the frustration. The Foundation is constantly understaffed for what the communities want us to do (and what we want to do for the communities), so plenty of things take a very long time (or never succeed – sometimes you work on something only to realise you've hit a wall). There's this constant need to prioritize among the things we'd like to do, which means that a function that is important to a certain group but not used by very many is difficult to spend as much time on as we'd need to fix it well. It's frustrating for us as well. Not that this helps you in any way, of course. /Johan (WMF) (talk) 17:47, 9 April 2018 (UTC)

Thank you for the honest update, I can understand the frustration of the WMF staff. However, I don't understand why you don't want to promote mediawiki2latex as an alternative. This is what I get with

File:Livre.pdf

If you know a little bit of LaTeX, you can easily remove unnecessary stuff and modify the layout and placement of figures according to your needs.--Debenben (talk) 13:25, 10 April 2018 (UTC)
 * Hi and thank you for your work, but this version is far worse than [//fr.wikibooks.org/w/index.php?title=Sp%C3%A9cial:ElectronPdf&page=Wikip%C3%A9dia+en+%C3%A9ducation%2FTexte+entier&action=show-download-screen the inappropriate one] already given by the current tool. It is 421 pages long (compared to ~130), there are a lot of footnotes on every page, the images appear everywhere and/or take up too much space, the book cover is placed after the table of contents, etc. Simon Villeneuve (talk) 13:46, 10 April 2018 (UTC)
 * The credit should go to the authors, I only discovered it a few weeks ago. The advantage is that you get the LaTeX source code. I just used the default options. For someone who knows a bit of LaTeX, making it fit on 50 pages by removing all footnotes, credits and licence information, decreasing font sizes or switching to two columns, and printing all headings in comic-sans takes around 10 seconds of work. Everything that is not disabling or changing something globally, like resizing and rearranging images individually, would be a lot of work, but at least it is possible, so I could fix every issue myself. Also, for articles with mathematical equations, the other pdf-tool is pretty useless.--Debenben (talk) 17:01, 10 April 2018 (UTC)