User:Roan Kattouw (WMF)/ResourceLoader submodules

This is an idea for how we could support submodules for large modules that expose many small things with dependencies between them. This is commonly the case for libraries (like OOUI). Right now, our only option is to subdivide these modules into smaller modules, but that also increases the size of the startup module by increasing the number of modules. We'd like to be able to do fine-grained tree-shaking of large libraries, but in the current system that would require creating too many modules.

The basic idea of this proposal is as follows:


 * Let modules depend on individual files from another module (if the depended-on module is a package module)
 * Allow files in package modules to express dependencies on each other, and on other modules
 * Simplify/consolidate this information in the manifest, so that only module-level information is exposed to the client
 * The client doesn't know exactly what parts of modules it's asking for, but it passes enough information so that the server can figure it out

Per-file dependencies for package modules
Let files in package modules define dependencies for each file. These dependencies could be other files in the same module (internal), or other modules (external). This information would not be exposed directly in the module manifest in the startup module: internal dependencies are omitted completely, and external dependencies are consolidated at the module level. The module definition above expresses an internal dependency ( depends on  ) and several external dependencies. In the startup manifest, this will be simplified to say that  depends on   and.
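The consolidation step described here can be sketched as follows. This is only an illustration under assumptions: the file names, module names and the `internal`/`external` keys are hypothetical and are not part of any existing ResourceLoader format.

```python
# Hypothetical per-file dependency data for one package module.
# "internal" names other files in the same module; "external" names other modules.
FILES = {
    "a.js": {"internal": ["b.js"], "external": ["modX"]},
    "b.js": {"internal": [], "external": ["modY", "modX"]},
    "c.js": {"internal": [], "external": ["modZ"]},
}


def manifest_dependencies(files):
    """Collapse per-file dependencies into one module-level list.

    Internal dependencies are omitted entirely; external dependencies are
    merged and de-duplicated, keeping first-seen order.
    """
    merged = []
    for info in files.values():
        for dep in info["external"]:
            if dep not in merged:
                merged.append(dep)
    return merged


print(manifest_dependencies(FILES))
```

The startup manifest would then carry only the merged list for the module, with no trace of which file needs which dependency.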

Allow modules to depend on files from other modules
Using the  syntax, also used above for internal dependencies, modules can depend on files from other modules, and then load these using   In the startup manifest, this is simplified to say that   depends on   and. It will also say that the dependency on  is a full dependency and the dependency on   is a partial dependency, without saying exactly which file(s) it depends on.
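A rough sketch of how a module's dependency list might be reduced to full/partial flags for the manifest. The representation is an assumption made for illustration: a plain string stands for a dependency on a whole module and a `(module, file)` tuple stands for a dependency on a single file. The actual syntax is whatever the proposal settles on.

```python
def summarize(deps):
    """Reduce a dependency list to {module: "full" | "partial"}.

    deps holds either module names (whole-module dependency) or
    (module, file) tuples (file-level dependency). The file names are
    discarded; a full dependency on a module overrides a partial one.
    """
    summary = {}
    for dep in deps:
        if isinstance(dep, tuple):
            module, _file = dep
            # Only mark as partial if no full dependency was recorded.
            summary.setdefault(module, "partial")
        else:
            summary[dep] = "full"
    return summary


# Hypothetical module depending on all of "modA" and two files of "lib".
print(summarize(["modA", ("lib", "x.js"), ("lib", "y.js")]))
```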

How the client deals with partial dependencies
When the client is asked to load, it sees that it has a full dependency on   and a partial dependency on  , and that   in turn depends on   and. Assuming none of these modules have been loaded yet, the client sends a request to the server indicating it wants all of,  ,   and  , and part of. It doesn't know what part it needs, but it indicates that  needs to be loaded partially. The server can figure out which parts are needed by looking at which files within  are depended on by the modules that are being requested.

Note that there is an inefficiency here:  is requested even though we won't need it, because we aren't going to load the part of   that it depends on, but the client doesn't know that.
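The client-side resolution, including the inefficiency noted above, could look roughly like this. The manifest shape and all module names are hypothetical; the sketch only assumes that the manifest records each dependency as full or partial.

```python
# Hypothetical startup manifest: module -> {dependency: "full" | "partial"}.
# Suppose "dep2" is only needed by a file of "lib" that "app" does not use;
# the client cannot know that, so it requests "dep2" anyway.
MANIFEST = {
    "app": {"util": "full", "lib": "partial"},
    "util": {},
    "lib": {"dep1": "full", "dep2": "full"},
    "dep1": {},
    "dep2": {},
}


def build_request(manifest, target, loaded=frozenset()):
    """Return (full, partial): the module sets the client would ask for."""
    full, partial = set(), set()
    stack = [(target, "full")]
    while stack:
        name, mode = stack.pop()
        if name in loaded or name in full:
            continue
        if mode == "partial" and name in partial:
            continue
        if mode == "full":
            full.add(name)
            partial.discard(name)
        else:
            partial.add(name)
        # All module-level dependencies are followed, even for a
        # partially-loaded module.
        stack.extend(manifest[name].items())
    return full, partial


full, partial = build_request(MANIFEST, "app")
print(sorted(full), sorted(partial))
```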

How the server resolves partial dependencies
The server gets a request asking for all of,  ,   and  , and part of. The server determines which parts of  are needed by looking at which files in   the fully-loaded modules depend on, then resolving internal dependencies. It finds that  depends on , which in turn depends on. It responds with the full contents of the fully-requested modules, and the partial contents of  (only   and , but not  ).
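A minimal sketch of the server-side resolution, assuming the server has the unconsolidated file-level data available. The data structures and names below are hypothetical stand-ins for whatever the module definitions would provide.

```python
# Hypothetical internal dependencies between the files of module "lib".
LIB_INTERNAL = {"a.js": ["b.js"], "b.js": [], "c.js": []}

# Hypothetical file-level dependencies of other modules, as (module, file) pairs.
FILE_DEPS = {"app": [("lib", "a.js")], "util": []}


def resolve_partial(partial_module, full_modules, file_deps, internal):
    """List the files of partial_module needed by the fully-requested modules."""
    # Start from files that the fully-requested modules depend on directly.
    queue = [
        file
        for module in sorted(full_modules)
        for dep_module, file in file_deps.get(module, [])
        if dep_module == partial_module
    ]
    needed = []
    # Then follow internal dependencies until nothing new turns up.
    while queue:
        file = queue.pop(0)
        if file in needed:
            continue
        needed.append(file)
        queue.extend(internal[file])
    return needed


print(resolve_partial("lib", {"app", "util", "dep1", "dep2"}, FILE_DEPS, LIB_INTERNAL))
```

In this made-up example the response would contain `a.js` and `b.js` from the partially-requested module, but not `c.js`.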

How the client manages state for partially-loaded modules
The client receives a partial response for, which is flagged as such. It makes these files available for loading with, but it doesn't mark the module as fully loaded. If, later, a module is loaded that also has a partial dependency on, the client will follow the same protocol and let the server figure out which files to send, which might duplicate files it already has. If this happens, the client will simply ignore the files in the response that it has already loaded. If, later, the full  module is asked to be loaded in its entirety (or a module is loaded that has a full dependency on  ), the client will ask the server for the entire module, and again ignore the files it already has.
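The state handling described here might be sketched as below. The class and its method are hypothetical and do not correspond to existing client code; they only illustrate ignoring duplicate files and marking a module complete on a full response.

```python
class ModuleState:
    """Client-side bookkeeping for one module that may be partially loaded."""

    def __init__(self):
        self.files = {}        # file name -> content received so far
        self.complete = False  # True once a full response has arrived

    def receive(self, files, partial):
        """Store files from a response and return the names newly added.

        Files the client already has are ignored. The module is marked
        complete only when the response is not flagged as partial.
        """
        added = []
        for name, content in files.items():
            if name not in self.files:
                self.files[name] = content
                added.append(name)
        if not partial:
            self.complete = True
        return added


state = ModuleState()
state.receive({"a.js": "...", "b.js": "..."}, partial=True)
# A second partial response overlaps with the first; only c.js is new.
print(state.receive({"b.js": "...", "c.js": "..."}, partial=True))
```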

Inefficiencies
This proposal has two main inefficiencies. First, and more important, when a module is loaded partially, all of its external dependencies are loaded too, even the ones that aren't needed for the files that are being loaded. This is difficult to avoid with this architecture, and it might be an issue if there are many unnecessary dependencies that are loaded this way or if the unnecessary dependencies are large. I don't think there's a good way of dealing with this other than splitting the module.

Second, if the same module is partially loaded twice, to satisfy different dependencies in different requests, some of its files could be downloaded twice. I don't think this will be much of an issue, because it is likely to be infrequent (the same module being partially loaded on two separate occasions on the same page won't happen often) and the impact is likely to be low (few files double-loaded each time). If there is a large "core" part of the module that almost all files depend on, breaking that out into a separate module could address that.

Examples for where this could be used

 * OOUI icon packs: these consist of fully independent parts (individual icons) with no internal or external dependencies. All OOUI icons could be put in one big module, and each module that uses them would specify exactly which icons it needs
 * OOUI itself: each widget could, in principle, be exposed separately
 * mediawiki.widgets.*: There is a  module with relatively unrelated widgets, and there are 16 more   modules (and 5   modules) that contain individual widgets. These could potentially be consolidated into one omnibus   module.

Open questions

 * Should modules have to opt into letting other modules depend on their files, or should it be allowed for all package modules? If a module allows other modules to load its files, should all its files be exposed, or only a limited list of files that it specifies?
 * How do we support CSS files? We'd need this for OOUI (widgets come with styles) and for icons (which are only CSS). In the code for  file support, we do have an internal content type   that allows CSS to be bundled with a (JS) package file, but we don't allow this to be used in the module definition (yet). This may be as simple as allowing   (and  ) as a package file extension that maps to a   with an empty script part, then having JS files express internal dependencies on the CSS files they need.
 * Would we need to allow direct loading of individual files? Icon packs are often loaded directly through, the   modules are too, and in some cases we may want to consolidate many modules into a single module but still be able to load parts of it directly through  .