Talk:Requests for comment/Extension registration

Drawbacks
Besides migration costs, what are the (potential) drawbacks of implementing this proposal? --MZMcBride (talk) 04:38, 24 May 2014 (UTC)
 * Parsing JSON is probably slower than loading a PHP file. We can mitigate this by caching the parsed result.
 * You can't use things like namespace constants in a JSON file. While this doesn't actually break anything (we can just use the actual number), it will hinder readability.
 * There are probably more, but that's what I can think of right now. Legoktm (talk) 04:50, 24 May 2014 (UTC)

It looks like this has been scheduled for discussion soon:

It'll be 1900 UTC http://www.timeanddate.com/worldclock/fixedtime.html?iso=20140530T1900 this Friday 30 May, in #wikimedia-office.

I'd like to see this RFC bolstered with a "Migration costs" (sub)section which would include how many people/hours it would take to do this conversion and other considerations such as the namespace constants mentioned above. You don't need to worry about who would do the migration/conversion work, just estimate about how much work in terms of developer hours are needed to do such a conversion. I guess this might vary between Wikimedia extensions (a hundred or so) and MediaWiki extensions (several thousand). There's still a bit of confusion for me in this area; for example, about whether the old system and the new system can co-exist. I think a section in the RFC for migration costs would be helpful in evaluating this.

I'd also like to see Composer addressed more directly in the RFC itself, particularly if implementing this extension registration RFC will or could impact future work in that direction.

Lastly, I think it's fairly important to figure out what the performance implications of this are and whether they can be mitigated, preferably before discussion and architecture review. My concern is that any discussion of this will immediately get blocked on a discussion of whether this is even feasible to do on huge sites, so it'd be good to get benchmarking about what this would mean for app startup(?) time and any other useful/relevant metrics. Perhaps ping Ori or Chris S. about this? --MZMcBride (talk) 13:51, 25 May 2014 (UTC)

Information conflicts with Composer
Considering we are almost definitely moving in the direction of Composer-managed extensions, it would be a good idea to not duplicate information in the composer.json file. Therefore, the "version", "url", "authors", and "autoload" keys should be removed. Since the composer.json file is also in JSON, it should be trivial to process extensions since it will be in the same folder. Parent5446 (talk) 04:47, 24 May 2014 (UTC)
 * I was under the impression that that RfC was still under discussion and no decision had been made yet. Regardless, I'd like to not have this RfC depend on that one. Aside from autoloading, all of the other keys are trivial and duplicating them really isn't a huge deal. Legoktm (talk) 07:22, 25 May 2014 (UTC)

Btw, the JS library ecosystem already has this issue. Many JS libraries that can run in the browser either support use inside node.js or use node based build tools like Grunt. So many of those projects have both a node/npm package.json and a bower.json for Bower containing the same duplicate information we're talking about. They seem to be doing fine. Daniel Friesen (Dantman) (talk) 07:40, 25 May 2014 (UTC)

As long as the registration is an opt-in and doesn't interfere with the way how Composer works with extensions, I'm more or less indifferent on whether MW needs such functionality. Personally, I can't see the benefit for moving the registration from php to json file but when you argue with "other keys are trivial and duplicating them really isn't a huge deal" where information has to be maintained in two places I find that rather questionable. Using "the same duplicate information ... They seem to be doing fine" as an argument as to why it needs to be this way is also not really satisfactory (DRY comes to mind, which might be fine for WMF supported extensions but I doubt this is entirely true for extensions provided by volunteer developers). --MWJames (talk) 14:53, 25 May 2014 (UTC)

Schema for info.json
I'd recommend writing a JSON schema for the info.json file, so that it is explicit moving forward what type of information will be in it. (It will make approving the RFC a lot easier.) Parent5446 (talk) 04:48, 24 May 2014 (UTC)

Caching of setup information
I would recommend doing some performance testing on this and see what impact this has for a large number of extensions. It may be worthwhile to "cache" the extension information by making a PHP file containing equivalent code. (In other words, have MediaWiki core take all the info.json files and dynamically generate a PHP file that would perform the equivalent configuration.)

Since the cache file will not change often, it can be held in the opcache and will be the equivalent performance of the current situation (maybe even slightly faster since it's including one file rather than one per extension.) Parent5446 (talk) 04:53, 24 May 2014 (UTC)

Some thoughts

 * info.json? That doesn't sound right, too generic and indescriptive. Why not something like extension.json?
 * What about when extensions aren't in $IP/extensions/? We allow extensions to be included from anywhere and we'll probably want to move even further in this direction by fixing extension assets to be per-path.
 * We still support PHP 5.3, "Add Extension name to list" -> "Set config var in LocalSettings.php" -> "Actually load the extension data containing global var info" looks like a register_globals vulnerability.
 * Where is description/descriptionmsg? Also it would be nice if we could physically code in a "If descriptionmsg is used the i18n message must' be defined within the extension itself and it must be using JSON i18n, not .php files". That requirement would allow a web based extension manager to display information about an extension without actually loading the extension or running any PHP from it.
 * The extension type is also missing.
 * The MassMessage/ in the module's remoteExtPath is redundant and also means the extension dirname is still restricted. Since this is now part of a context sensitive file it would be nice if the path settings were implicit/optional and the base path to the extension was automatically handled.
 * There should be a way to declare the minimum version of MediaWiki needed.

When I played around with ideas like these I went for something along the lines of this instead:

Of course I never picked specific names, MWExtension, load, loadFromPath, etc... are kind of just pulled out of a hat. But the method call meant the config load order and not being able to specify path issue never showed up. Daniel Friesen (Dantman) (talk) 07:54, 24 May 2014 (UTC)


 * I don't really care about the name, I just picked something simple. extension.json would also work.
 * Part of this RfC would be making that a requirement (it's mentioned in the "Loading extensions" section).
 * Sigh. It might be. I'll look into it more.
 * I didn't include all the keys, but this would also be in the JSON file. Given that all extensions were already migrated to JSON messages, I'd expect that if they support this new method of loading, they would also have converted their messages.
 * Same as above.
 * You're right, it is. I just copied that part from the PHP file, but I think we can simplify that now.
 * I didn't really want to address dependency management in this RfC, but it should be doable.


 * The method call works if you're using a LocalSettings.php, but what if your settings are stored in a database? How would you specify the call then? I think just requiring extensions to be in one place simplifies a lot of things - the web extension manager shouldn't need to look through a bunch of different directories to find extensions, they should just be in one place. Legoktm (talk) 07:41, 25 May 2014 (UTC)


 * The examples I gave were just for when extensions "are" installed using LocalSettings.php. The assets path for an extension is managed by the extension loading system in it so the assetsPath there is not an extension setting but a statement to the extension loading system when an extension is loaded from a custom path that doesn't have an implied asset path. If you activated extensions using an extension manager (web or cli) neither the settings nor the method call would be in LocalSettings.php.
 * I don't like the idea of requiring all extensions to always be in $IP/extensions/. The web extension manager isn't relevant – making it only install into and manage stuff in $IP/extensions/ is fine. But we have an assortment of wiki farms of various sizes and complex installations. These will stick with LocalSettings.php and we shouldn't force them to load things from $IP/extensions/ if they have a reason to load them from somewhere else. This requirement may also run counter to the attempts we've been trying to make to support package based installations of MediaWiki a bit better.
 * Daniel Friesen (Dantman) (talk) 09:08, 25 May 2014 (UTC)

I think that returning a PHP array would be more appropriate. Dan-nl (talk) 04:07, 28 May 2014 (UTC)
 * File name:
 * File path:
 * I like the concept of keeping all extensions within the  directory.
 * Loading extensions:
 * the load method would rely on the /Extensions/ path in order to pickup the extension.config.php
 * Set site specific options:
 * the load method would rely on the /Extensions/ path in order to pickup the extension.config.php
 * Set site specific options:


 * We're trying to get away from PHP in some places, not replace one PHP file with another. I'm not sure about everything the RFC wants to add but there is a valid use for moving to json for various bits of this metadata, web based (or even cli based) extension management. Moving things like names, descriptions, and the i18n descriptions depend on to JSON means that an extension manager can read this information without executing – potentially dangerous or even broken – PHP and and display that information to a user that wants to activate the extension plus this information can even be extracted by something that distributes extensions to allow extensions to be fetched and installed.
 * i agree that for extension metadata that might be used for package management, JSON seems like a good choice. i don’t understand yet why extension configuration such as setting up hooks, jobs, etc would be better managed as a JSON -- as far as i know this information will need to be placed into a PHP array eventually, so why not provide it as an array to begin with?
 * a JSON can also contain malicious code
 * information provided as JSON can also be broken
 * Dan-nl (talk) 08:46, 29 May 2014 (UTC)
 * JSON can only be malicious if it refers to something in PHP and you willingly run it, there are no side effects to simply reading the metadata.
 * If you want to know what the description set in a PHP file is you have to execute the PHP and then read the description it set. In the meantime this PHP can do absolutely anything it wants: Write to the filesystem, modify the database, run shell commands, send spam emails, or even inject a persistent virus/malware like piece of code into core. But if it's JSON you just parse the JSON and read the description, nothing happens that you didn't decide to do.
 * Likewise broken PHP and broken JSON are two different things. Broken PHP you can get a fatal PHP error or a partially complete corrupt run of the code. Broken JSON you get a simple syntax error saying the JSON can't be parsed and you tell the user the extension is invalid.
 * Some might like the cleaner format of .json over PHP but the point of JSON in the malice and broken extension topic is not about the security of installed extensions, it's about being able to read data for extensions that aren't installed, and maybe haven't even been downloaded to the server.
 * While most of my points apply to things like the name and description, we do list information like hooks used, permissions created, settings defined, etc... on our Extension pages, being in JSON would allow this information to be automatically reported.
 * Daniel Friesen (Dantman) (talk) 17:39, 29 May 2014 (UTC)


 * /extensions/ not /Extensions/
 * Even if you like putting extensions in $IP/extensions/ it is not something we should be hard requiring. Doing so precludes various people other than you who do have a valid reason for installing extensions in another directory. There are a bunch of potential reasons, for example, trying to avoid quirks when using an unmodified git checkout of core to manage MediaWiki; packaging Extensions up so they can be installed as packages into a package based version of MW; and for developers working on a checkout of mediawiki/core.git while using a separate checkout of mediawiki/extensions.git to manage the extension tree.
 * i think you’re right, we should provide flexibility of location, but i believe the default should be set as /extensions/ with the option of providing an alternative, if necessary, in a config or param as you show above.
 * i find placing extensions in directories other than /extensions/ is non-intuitive and we should stick with the default as much as possible.
 * i also think it would be good to hash out the other scenarios you mention and list out solutions tried that didn’t work. maybe, one day someone might overcome the hurdle and a different directory won’t be necessary.
 * Dan-nl (talk) 08:46, 29 May 2014 (UTC)
 * The scenarios aren't something that will ever just go away or be "overcome". Nesting an unrelated git repo inside of another git repo (or even adding symbolic links) while developing them will always have side effects or an undesired cd hierarchy that will have developers checking out core and extensions into different directories. And there will always be package repositories that want to package extension files in a different directory than the one that happens to be $IP/extensions/. Daniel Friesen (Dantman) (talk) 17:39, 29 May 2014 (UTC)


 * Going from  to   sounds like a step back, not forwards, it's even harder to write into LocalSettings.php.
 * i agree. i like the extension class idea you mention above and prefer the concept of  where the array is an option array.
 * Dan-nl (talk) 08:46, 29 May 2014 (UTC)


 * A method of loading site specific options like that isn't going to cut it. There are a bunch of different ways of doing config for wiki farms, and ['site-options'] would only work in a few cases. A few methods that I can think of:
 * Small wiki farms setting config based on a few conditionals in LocalSettings.php.
 * Mid-sized farms with a bunch of config in settings files, potentially broken into multiple files and perhaps using something like $wgConfig.
 * Massive dynamic wiki farms, storing wiki config in the database in a custom format for potentially hundreds, thousands, or more wikis.
 * it would be good to see detailed scenarios in order to understand them better.
 * how would a JSON config work in those scenarios that need logic in order to alter the config or to be spread across several files?
 * wouldn’t a PHP config array be easier to alter and spread out across several files?
 * a PHP config array could be stored in a db just as easily as JSON.
 * Dan-nl (talk) 08:46, 29 May 2014 (UTC)
 * Are you perhaps misunderstanding the intent of the "config" object in the example .json file? The config inside the .json example only defines the default values of configuration variables, in place of the  variables currently being set in the extension's PHP file. The RFC doesn't change the way extension config is stored at all. Daniel Friesen (Dantman) (talk) 17:39, 29 May 2014 (UTC)

Daniel Friesen (Dantman) (talk) 07:07, 28 May 2014 (UTC)

Maybe it would be good to breakdown the RFC into further sections: extension metadata extension configuration extension location Dan-nl (talk) 08:46, 29 May 2014 (UTC)
 * what is it?
 * what format should it be in?
 * how to load it?
 * file name?
 * what is it?
 * what format should it be in?
 * how to load it?
 * file name?
 * default location?
 * provide alternative or not?

Extenson points
I think it is fine to have info such as extension name and resource modules in a json file. However having extension points, such as hooks, special pages, etc, in there is problematic. It forces static context and is very unfriendly towards code analysis tools. --Jeroen De Dauw (talk) 15:57, 27 May 2014 (UTC)

Namespace RFC related
See also Requests for comment/Associated namespaces which has a "namespace registry" component which might alleviate some of the concerns over namespaces...or complicate them. --MarkTraceur (talk) 15:59, 27 May 2014 (UTC)