Talk:Requests for comment/Extension registration

From mediawiki.org
Latest comment: 9 years ago by Sharihareswara (WMF) in topic current status

Drawbacks[edit]

Besides migration costs, what are the (potential) drawbacks of implementing this proposal? --MZMcBride (talk) 04:38, 24 May 2014 (UTC)Reply

  • Parsing JSON is probably slower than loading a PHP file. We can mitigate this by caching the parsed result.
  • You can't use things like namespace constants in a JSON file. While this doesn't actually break anything (we can just use the actual number), it will hinder readability.
There are probably more, but that's what I can think of right now. Legoktm (talk) 04:50, 24 May 2014 (UTC)Reply

It looks like this has been scheduled for discussion soon:


It'll be 1900 UTC http://www.timeanddate.com/worldclock/fixedtime.html?iso=20140530T1900 this Friday 30 May, in #wikimedia-office.


I'd like to see this RFC bolstered with a "Migration costs" (sub)section which would include how many people/hours it would take to do this conversion and other considerations such as the namespace constants mentioned above. You don't need to worry about who would do the migration/conversion work, just estimate about how much work in terms of developer hours are needed to do such a conversion. I guess this might vary between Wikimedia extensions (a hundred or so) and MediaWiki extensions (several thousand). There's still a bit of confusion for me in this area; for example, about whether the old system and the new system can co-exist. I think a section in the RFC for migration costs would be helpful in evaluating this.

I'd also like to see Composer addressed more directly in the RFC itself, particularly if implementing this extension registration RFC will or could impact future work in that direction.

Lastly, I think it's fairly important to figure out what the performance implications of this are and whether they can be mitigated, preferably before discussion and architecture review. My concern is that any discussion of this will immediately get blocked on a discussion of whether this is even feasible to do on huge sites, so it'd be good to get benchmarking about what this would mean for app startup(?) time and any other useful/relevant metrics. Perhaps ping Ori or Chris S. about this? --MZMcBride (talk) 13:51, 25 May 2014 (UTC)Reply

Information conflicts with Composer[edit]

Considering we are almost definitely moving in the direction of Composer-managed extensions, it would be a good idea to not duplicate information in the composer.json file. Therefore, the "version", "url", "authors", and "autoload" keys should be removed. Since the composer.json file is also in JSON, it should be trivial to process extensions since it will be in the same folder. Parent5446 (talk) 04:47, 24 May 2014 (UTC)Reply

I was under the impression that that RfC was still under discussion and no decision had been made yet. Regardless, I'd like to not have this RfC depend on that one. Aside from autoloading, all of the other keys are trivial and duplicating them really isn't a huge deal. Legoktm (talk) 07:22, 25 May 2014 (UTC)Reply

Btw, the JS library ecosystem already has this issue. Many JS libraries that can run in the browser either support use inside node.js or use node based build tools like Grunt. So many of those projects have both a node/npm package.json and a bower.json for Bower containing the same duplicate information we're talking about. They seem to be doing fine. Daniel Friesen (Dantman) (talk) 07:40, 25 May 2014 (UTC)Reply

As long as the registration is an opt-in and doesn't interfere with the way how Composer works with extensions, I'm more or less indifferent on whether MW needs such functionality. Personally, I can't see the benefit for moving the registration from php to json file but when you argue with "other keys are trivial and duplicating them really isn't a huge deal" where information has to be maintained in two places I find that rather questionable. Using "the same duplicate information ... They seem to be doing fine" as an argument as to why it needs to be this way is also not really satisfactory (DRY comes to mind, which might be fine for WMF supported extensions but I doubt this is entirely true for extensions provided by volunteer developers). --MWJames (talk) 14:53, 25 May 2014 (UTC)Reply

Well, I mean, how often do you change your extension's url? Or it's name? I don't really see what WMF/volunteers has to do with it. But yes, you can consider the new registration process to be "opt-in" (it's fully backwards-compatible) and AFAICT it shouldn't affect composer at all. Legoktm (talk) 08:42, 4 June 2014 (UTC)Reply

Schema for info.json[edit]

I'd recommend writing a JSON schema for the info.json file, so that it is explicit moving forward what type of information will be in it. (It will make approving the RFC a lot easier.) Parent5446 (talk) 04:48, 24 May 2014 (UTC)Reply

Parent5446, I created a draft at Requests for comment/Extension registration/Schema. How does it look? (I've never created a JSON schema before) Legoktm (talk) 17:47, 2 June 2014 (UTC)Reply
It's a good start. Will need to be refined a lot more before it is final, of course. Parent5446 (talk) 10:49, 5 June 2014 (UTC)Reply

Caching of setup information[edit]

I would recommend doing some performance testing on this and see what impact this has for a large number of extensions. It may be worthwhile to "cache" the extension information by making a PHP file containing equivalent code. (In other words, have MediaWiki core take all the info.json files and dynamically generate a PHP file that would perform the equivalent configuration.)

Since the cache file will not change often, it can be held in the opcache and will be the equivalent performance of the current situation (maybe even slightly faster since it's including one file rather than one per extension.) Parent5446 (talk) 04:53, 24 May 2014 (UTC)Reply

Some thoughts[edit]

  • info.json? That doesn't sound right, too generic and indescriptive. Why not something like extension.json?
  • What about when extensions aren't in $IP/extensions/? We allow extensions to be included from anywhere and we'll probably want to move even further in this direction by fixing extension assets to be per-path.
  • We still support PHP 5.3, "Add Extension name to list" -> "Set config var in LocalSettings.php" -> "Actually load the extension data containing global var info" looks like a register_globals vulnerability.
  • Where is description/descriptionmsg? Also it would be nice if we could physically code in a "If descriptionmsg is used the i18n message must' be defined within the extension itself and it must be using JSON i18n, not .php files". That requirement would allow a web based extension manager to display information about an extension without actually loading the extension or running any PHP from it.
  • The extension type is also missing.
  • The MassMessage/ in the module's remoteExtPath is redundant and also means the extension dirname is still restricted. Since this is now part of a context sensitive file it would be nice if the path settings were implicit/optional and the base path to the extension was automatically handled.
  • There should be a way to declare the minimum version of MediaWiki needed.

When I played around with ideas like these I went for something along the lines of this instead:

MWExtension::load( 'MassMessage' );
MWExtension::loadFromPath( '/path/to/MassMessage', array( 'assetsPath' => '//sub.example.com/extensions/MassMessage' ) );

Of course I never picked specific names, MWExtension, load, loadFromPath, etc... are kind of just pulled out of a hat. But the method call meant the config load order and not being able to specify path issue never showed up. Daniel Friesen (Dantman) (talk) 07:54, 24 May 2014 (UTC)Reply

  • I don't really care about the name, I just picked something simple. extension.json would also work.
  • Part of this RfC would be making that a requirement (it's mentioned in the "Loading extensions" section).
  • Sigh. It might be. I'll look into it more.
  • I didn't include all the keys, but this would also be in the JSON file. Given that all extensions were already migrated to JSON messages, I'd expect that if they support this new method of loading, they would also have converted their messages.
  • Same as above.
  • You're right, it is. I just copied that part from the PHP file, but I think we can simplify that now.
  • I didn't really want to address dependency management in this RfC, but it should be doable.
The method call works if you're using a LocalSettings.php, but what if your settings are stored in a database? How would you specify the call then? I think just requiring extensions to be in one place simplifies a lot of things - the web extension manager shouldn't need to look through a bunch of different directories to find extensions, they should just be in one place. Legoktm (talk) 07:41, 25 May 2014 (UTC)Reply
The examples I gave were just for when extensions "are" installed using LocalSettings.php. The assets path for an extension is managed by the extension loading system in it so the assetsPath there is not an extension setting but a statement to the extension loading system when an extension is loaded from a custom path that doesn't have an implied asset path. If you activated extensions using an extension manager (web or cli) neither the settings nor the method call would be in LocalSettings.php.
I don't like the idea of requiring all extensions to always be in $IP/extensions/. The web extension manager isn't relevant – making it only install into and manage stuff in $IP/extensions/ is fine. But we have an assortment of wiki farms of various sizes and complex installations. These will stick with LocalSettings.php and we shouldn't force them to load things from $IP/extensions/ if they have a reason to load them from somewhere else. This requirement may also run counter to the attempts we've been trying to make to support package based installations of MediaWiki a bit better.
Daniel Friesen (Dantman) (talk) 09:08, 25 May 2014 (UTC)Reply

I think that returning a PHP array would be more appropriate.

  • File name:
    • extension.config.php
  • File path:
    • I like the concept of keeping all extensions within the /Extensions/ directory.
  • Loading extensions:
    • $wgEnabledExtensions['ThisExtension']['config'] = include $IP . '/Extensions/ThisExtension/extension.config.php';
    • MWExtension::load( 'ThisExtension' ); the load method would rely on the /Extensions/ path in order to pickup the extension.config.php
  • Set site specific options:
    • $wgEnabledExtensions['ThisExtension']['site-options'] = array();
    • MWExtension::load( 'ThisExtension', array() );

Dan-nl (talk) 04:07, 28 May 2014 (UTC)Reply

  • We're trying to get away from PHP in some places, not replace one PHP file with another. I'm not sure about everything the RFC wants to add but there is a valid use for moving to json for various bits of this metadata, web based (or even cli based) extension management. Moving things like names, descriptions, and the i18n descriptions depend on to JSON means that an extension manager can read this information without executing – potentially dangerous or even broken – PHP and and display that information to a user that wants to activate the extension plus this information can even be extracted by something that distributes extensions to allow extensions to be fetched and installed.
  • i agree that for extension metadata that might be used for package management, JSON seems like a good choice. i don’t understand yet why extension configuration such as setting up hooks, jobs, etc would be better managed as a JSON -- as far as i know this information will need to be placed into a PHP array eventually, so why not provide it as an array to begin with?
    • a JSON can also contain malicious code
    • information provided as JSON can also be broken
  • Dan-nl (talk) 08:46, 29 May 2014 (UTC)Reply
  • JSON can only be malicious if it refers to something in PHP and you willingly run it, there are no side effects to simply reading the metadata.
  • If you want to know what the description set in a PHP file is you have to execute the PHP and then read the description it set. In the meantime this PHP can do absolutely anything it wants: Write to the filesystem, modify the database, run shell commands, send spam emails, or even inject a persistent virus/malware like piece of code into core. But if it's JSON you just parse the JSON and read the description, nothing happens that you didn't decide to do.
  • Likewise broken PHP and broken JSON are two different things. Broken PHP you can get a fatal PHP error or a partially complete corrupt run of the code. Broken JSON you get a simple syntax error saying the JSON can't be parsed and you tell the user the extension is invalid.
  • Some might like the cleaner format of .json over PHP but the point of JSON in the malice and broken extension topic is not about the security of installed extensions, it's about being able to read data for extensions that aren't installed, and maybe haven't even been downloaded to the server.
  • While most of my points apply to things like the name and description, we do list information like hooks used, permissions created, settings defined, etc... on our Extension pages, being in JSON would allow this information to be automatically reported.
Daniel Friesen (Dantman) (talk) 17:39, 29 May 2014 (UTC)Reply
  • /extensions/ not /Extensions/
  • Even if you like putting extensions in $IP/extensions/ it is not something we should be hard requiring. Doing so precludes various people other than you who do have a valid reason for installing extensions in another directory. There are a bunch of potential reasons, for example, trying to avoid quirks when using an unmodified git checkout of core to manage MediaWiki; packaging Extensions up so they can be installed as packages into a package based version of MW; and for developers working on a checkout of mediawiki/core.git while using a separate checkout of mediawiki/extensions.git to manage the extension tree.
  • i think you’re right, we should provide flexibility of location, but i believe the default should be set as /extensions/ with the option of providing an alternative, if necessary, in a config or param as you show above.
  • i find placing extensions in directories other than /extensions/ is non-intuitive and we should stick with the default as much as possible.
  • i also think it would be good to hash out the other scenarios you mention and list out solutions tried that didn’t work. maybe, one day someone might overcome the hurdle and a different directory won’t be necessary.
  • Dan-nl (talk) 08:46, 29 May 2014 (UTC)Reply
  • The scenarios aren't something that will ever just go away or be "overcome". Nesting an unrelated git repo inside of another git repo (or even adding symbolic links) while developing them will always have side effects or an undesired cd hierarchy that will have developers checking out core and extensions into different directories. And there will always be package repositories that want to package extension files in a different directory than the one that happens to be $IP/extensions/. Daniel Friesen (Dantman) (talk) 17:39, 29 May 2014 (UTC)Reply
  • Going from require_once( "$IP/extensions/ThisExtension/ThisExtension.php" ); to $wgEnabledExtensions['ThisExtension']['config'] = require( "$IP/extensions/ThisExtension/extension.config.php" ); sounds like a step back, not forwards, it's even harder to write into LocalSettings.php.
  • i agree. i like the extension class idea you mention above and prefer the concept of MWExtension::load( 'ThisExtension', array() ); where the array is an option array.
  • Dan-nl (talk) 08:46, 29 May 2014 (UTC)Reply
  • A method of loading site specific options like that isn't going to cut it. There are a bunch of different ways of doing config for wiki farms, and ['site-options'] would only work in a few cases. A few methods that I can think of:
    • Small wiki farms setting config based on a few conditionals in LocalSettings.php.
    • Mid-sized farms with a bunch of config in settings files, potentially broken into multiple files and perhaps using something like $wgConfig.
    • Massive dynamic wiki farms, storing wiki config in the database in a custom format for potentially hundreds, thousands, or more wikis.
  • it would be good to see detailed scenarios in order to understand them better.
  • how would a JSON config work in those scenarios that need logic in order to alter the config or to be spread across several files?
  • wouldn’t a PHP config array be easier to alter and spread out across several files?
  • Are you perhaps misunderstanding the intent of the "config" object in the example .json file? The config inside the .json example only defines the default values of configuration variables, in place of the $wg... = ...; variables currently being set in the extension's PHP file. The RFC doesn't change the way extension config is stored at all. Daniel Friesen (Dantman) (talk) 17:39, 29 May 2014 (UTC)Reply

Daniel Friesen (Dantman) (talk) 07:07, 28 May 2014 (UTC)Reply

Maybe it would be good to breakdown the RFC into further sections:
extension metadata

  • what is it?
  • what format should it be in?
  • how to load it?
  • file name?

extension configuration

  • what is it?
  • what format should it be in?
  • how to load it?
  • file name?

extension location

  • default location?
  • provide alternative or not?

Dan-nl (talk) 08:46, 29 May 2014 (UTC)Reply

Extension points[edit]

I think it is fine to have info such as extension name and resource modules in a json file. However having extension points, such as hooks, special pages, etc, in there is problematic. It forces static context and is very unfriendly towards code analysis tools. --Jeroen De Dauw (talk) 15:57, 27 May 2014 (UTC)Reply

Learning from wordpress plugins[edit]

The performance "cost" and potential bugs of an extension are much much higher then the negative impact of few global variables. some of the benefits of wordpress attitude toward plugins are versions compatibility maps between plugin version and mediawiki version. end users feedback and star rating. separation between installing a plugin and enabling it, without os shell access, and more. that eco system seems more mature and a good reference to learn from and adapt whats relevant, worth spending there some time. just dealing with registration and global variables might be a too little step, if we are making a change lets make it a bit more ambitious. a very clear and mandatory version numbering guidelines would be great at least for new plugins. http://wordpress.org/plugins/

It's not really the negative impact of global variables. It's that we want to move away from global variables so we can store configuration in a database, giving us the ability to enable/disable extensions from a simple webui. Adding things like ratings and version numbers can easily be done outside of this RfC, and aren't really related. Part of this change is to avoid breaking any backwards-compatability, so I'm not sure why we'd want to be more ambitious and intentionally break stuff. (At least in the initial migration.) Legoktm (talk) 08:34, 4 June 2014 (UTC)Reply

Namespace RFC related[edit]

See also Requests for comment/Associated namespaces which has a "namespace registry" component which might alleviate some of the concerns over namespaces...or complicate them. --MarkTraceur (talk) 15:59, 27 May 2014 (UTC)Reply

Database change support (also update.php?)[edit]

As some extensions have database changes (ex: Extension:TitleKey or Extension:AbuseFilter), Database schema update is described in each extension's hooks.php by inheritance maintenance scripts. I think we can manage in json file.

See also: Manual:Hooks/LoadExtensionSchemaUpdates

--devunt (talk) 07:05, 1 June 2014 (UTC)Reply

Sounds like a reasonable idea. For now though, I think we should keep it as a normal hook handler, and be open to in the future turning it into a property in the JSON file. Legoktm (talk) 08:37, 4 June 2014 (UTC)Reply

IRC meeting[edit]

Logs from the meeting are at Architecture meetings/RFC review 2014-05-30. If you want a summary, I sent an email to wikitech-l about it, and what needs to be discussed further. Legoktm (talk) 08:32, 4 June 2014 (UTC)Reply

Resource loader modules in JSON[edit]

For Wikibase having to register all resource loader modules in one JSON file would not work. We currently have one modules definition file per directory, and there are a lot. --Adrian Lang (WMDE) (talk) 08:46, 4 June 2014 (UTC)Reply

I think this is valid concern, we don't want to end up with a gigantic json file that becomes annoying to edit (like the old Ext.i18n.php files). We could have something in extension.json that would let you point to other *.json files that need to be read as well? Legoktm (talk) 06:38, 2 August 2014 (UTC)Reply

current status[edit]

Kunal just said in IRC:

I have a few things I need to fix up in the extension registration RfC after discussing them with ^d last week, it would be good to have an IRC hour (or half) for it after that

Sharihareswara (WMF) (talk) 21:27, 12 August 2014 (UTC)Reply