Talk:Requests for comment/Opt-in site registration during installation

Wiki description
This information would be good to have in core, and to output on the main page. That would make Google and Bing display that description when the main page appears in search results (for example, when searching for the name of the wiki). It should be a short sentence, though, different from a hypothetical (mediawiki:aboutpage). This was done on Wikia years ago, where wikis can edit it at MediaWiki:Description. --Ciencia Al Poder (talk) 09:45, 24 September 2013 (UTC)

Data sent
I don't understand the proposed data. Why send data that we can easily retrieve via the API? In essence, the only thing we need is the API endpoint URL (or index.php for older wikis and wikis where the API is disabled, but those are probably both hopeless), plus some other unstructured data not in the API (such as a description, and for many wikis the copyright status) and perhaps some server environment information (mentioned e.g. in wikitech-l/2013-October/072185.html). --Nemo 00:04, 2 October 2013 (UTC)
 * The reason to offer to send the data to a central collection point is so that we can find out about new wikis without having to discover them. This would also allow us to collect information on wikis that are not publicly accessible if the wiki creators allow it. — ☠ MarkAHershberger ☢ (talk) ☣ 02:43, 24 April 2014 (UTC)
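
For readers unfamiliar with the API route Nemo mentions, here is a minimal sketch of how a collector might turn a reported endpoint into a siteinfo query. The function name and the example URL are hypothetical. The query parameters (`action=query`, `meta=siteinfo`, `siprop`) are the standard MediaWiki API ones, but the choice of `general|statistics` is only an assumption about what a collector would want.

```python
from urllib.parse import urlencode


def siteinfo_url(api_endpoint: str) -> str:
    """Build a siteinfo query URL for a wiki's api.php endpoint.

    `api_endpoint` is whatever URL the wiki reported; nothing is fetched here.
    """
    params = {
        "action": "query",
        "meta": "siteinfo",
        # Assumed selection: general site data plus page/user/edit counts.
        "siprop": "general|statistics",
        "format": "json",
    }
    return api_endpoint + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
url = siteinfo_url("https://wiki.example.org/w/api.php")
```

This only covers publicly reachable wikis. As noted above, a wiki that cannot be polled would still have to send its data itself.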

Publicity of the data sent
Generally, I think we need this data desperately in order to improve MediaWiki for 3rd parties. Just a few remarks: --Mglaser (talk) 21:21, 14 October 2013 (UTC)
 * Some (if not many) of the wikis will be inhouse. They can send a ping to a WMF server, but cannot be polled for data (re: Nemo, "Data sent" ;) ).
 * Inhouse vs. public might be a piece of information we want to poll (could be done automatically via IP (?) or be checked). I think this information is very valuable.
 * I know (at least one) very security sensitive administrator. He might be willing to send anonymous stats, but wouldn't want them to be publicly available. So I'd prefer a two-step opt in: send data and allow the data to be public.
 * As Mark sketched out, this is tightly tied to the question of wiki spam.
 * How about continuous statistics, like number of pages, users, edits?

I second this. I administrate a very large corporate wiki (for internal documentation) and I am sure the privacy issue will come up. I'd love to be able to configure this information in various ways (what is sent, how, etc.)

--Daniel Renfro 16:59, 30 May 2014 (UTC)
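
The two-step opt-in suggested above could be modelled roughly as follows. This is only a sketch: the `Consent` class, its field names, and the three visibility labels are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Consent:
    send_data: bool = False     # step 1: report to the collection point at all
    allow_public: bool = False  # step 2: allow the reported data to be published


def visibility(consent: Consent) -> str:
    """Return how a wiki's report should be handled under its consent choices."""
    if not consent.send_data:
        # Nothing is sent, so the second choice is irrelevant.
        return "not_sent"
    return "public" if consent.allow_public else "private"
```

Under this sketch, an in-house wiki could choose `Consent(send_data=True, allow_public=False)` and still contribute to statistics without appearing in any public listing.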

Two comments from Dan Garry
This looks to be a nice way to provide some statistics about who's using MediaWiki, and where it's being used. I have two concerns, though.


 * 1) There are potential privacy implications to this depending on the nature of the information that is sent. The current resolution, namely making the person agree to send the information, does not absolve us of the responsibility of handling that data properly. Third parties are sending us data, and we should treat that seriously. In the present implementation it looks like this is handled though, as there is not really any private information sent. A declaration of what we'll use the data for would be nice.
 * 2) "Send data on your wiki to Wikimedia?" has the potential to be taken the wrong way, like we're some central authority on wikis or something. I suspect that this problem can mostly be solved by the wording of the text presented to the user, and I can potentially provide some support for refining the text in my role as product manager for platform. Please feel free to get in touch if you'd like my help.

Thanks!

--Dan Garry, Wikimedia Foundation (talk) 19:17, 15 October 2013 (UTC)
 * Thanks for the offer of help. I will probably ping you on policy, etc. — ☠ MarkAHershberger ☢ (talk) ☣ 02:54, 24 April 2014 (UTC)

Some comments
1) I don't like the idea of all this data being sent at installation. Installation is not a final indicator of the wiki's configuration. Lots of configuration is done after the installation process. So all that data sent – which is rather redundant given it's available from siteinfo anyways – on installation could be completely wrong. I would prefer to see minimal information sent in the ping (just stuff like api url) and then have the ping server fetch the information itself from the wiki's API after a delay. And the information being introduced here that isn't currently part of siteinfo should be added to siteinfo instead of depending on the ping for it.

2) I also like the bit about the Tagline. From what I understand MediaWiki:Tagline is not a tagline like the ones used in WordPress and other websites but a string intended to identify the wiki when the page is printed. That is starkly different from what the attempt here is to treat the tagline as and "raise the visibility" of.

3) Also when implementing the backend I'd like to see it served from a *.mediawiki.org domain name.

Daniel Friesen (Dantman) (talk) 07:15, 10 November 2013 (UTC)
 * 1 — agreed, there should be opportunities to “update” the data if nothing else.  Perhaps there could be a way to say “oh, you have this information — tagline? — that you didn't have before.  Send it now?”
 * 2 — The thanks here go to Jamie of Wikiapiary since he proposed the tagline and said it would be useful.
 * 3 — The core of the backend would be hosted on a WMF-affiliated, probably *.mw.o domain name.
 * — ☠ MarkAHershberger ☢ (talk) ☣ 02:52, 24 April 2014 (UTC)
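
The "minimal ping, then fetch after a delay" idea from point 1 might look like this on the collecting side. It is a sketch under stated assumptions: the `api_url` key, the seven-day delay, and both function names are hypothetical, and no storage or network code is shown.

```python
from datetime import datetime, timedelta, timezone

# Assumed delay giving the administrator time to finish configuration.
FETCH_DELAY = timedelta(days=7)


def schedule_fetch(ping: dict, received_at: datetime) -> dict:
    """Turn a minimal install-time ping into a deferred fetch job."""
    return {
        "api_url": ping["api_url"],
        "fetch_after": received_at + FETCH_DELAY,
    }


def is_due(job: dict, now: datetime) -> bool:
    """True once the delay has elapsed and the wiki's API can be queried."""
    return now >= job["fetch_after"]


# Hypothetical ping carrying only the API endpoint.
received = datetime(2014, 4, 24, tzinfo=timezone.utc)
job = schedule_fetch({"api_url": "https://wiki.example.org/w/api.php"}, received)
```

Re-queuing the same job periodically would also address the "update" case raised in the reply, since later fetches would pick up configuration added after installation.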