Extension:UnitsFormatter

What can this extension do?
This extension:


 * allows article authors to use a simple markup to enter quantities. Quantities can be entered in a single unit; authors don't need to worry about conversions.
 * allows users to select what style of units these quantities are displayed in. Users can select between metric and non-metric units, or both; for non-metric, they can select between British and US units.

For example, this text talks about fuel consumption, with units formatted as per Wikipedia standards: "" This rather fiddly markup produces (unconditionally):
 * My car burns 7.2 U.S. gallons (27.3 l) of gas every 100 kilometres (62 mi).

Using the UnitsFormatter extension, we can use this simpler input, which doesn't require the author to do any conversions: "" (The " " after " " in this example fixes km as the primary unit for the distance.) This markup can produce any of the following outputs, or others, depending on user preferences:
 * My car burns 7.2 US gallons (27.3 l) of gas every 100 kilometres (62 mi).
 * My car burns 6.0 British gallons of gas every 100 kilometres.
 * My car burns 27.3 litres (6.0 Br gal) of gas every 100 kilometres (62 mi).

All the required markup, including non-breaking spaces and wikilinks, is generated. Furthermore, the displayed unit names and definition links are configurable in the MediaWiki: namespace. So for US-centric wikis, without changing all of the articles which use these units, the above can become:
 * My car burns 7.2 gallons (27.3 l) of gas every 100 kilometres (62 mi).

What it can't do
This is not a unit converter. If that's what you want, do it some other way. For example, this apparently obvious markup: "" will, depending on user preferences, produce something like:
 * 0.8 British gallons equals 0.832674 British gallons

Here, the user has elected to see British units, so both inputs are converted to British gallons. The values differ only because the inputs had different numbers of decimals. Not very useful! Just drop the tags, and simply write: ""

Wiki users
Wiki users select the style of units they prefer to see on the "preferences" screen. Each  tag in any article is then formatted for that user according to their preferences.

Currently, the GUI for user configuration is very crude; this is because an extension can only add simple checkboxes to the user's preferences screens. In future, this could be improved with a preferences tab dedicated to units.

The following checkboxes are added to the "Misc" preferences tab:


 * Prefer the article's units: if checked, the units stated in the article are used as the primary units.
 * Prefer metric units: if checked, the primary units are metric; otherwise, non-metric.
 * For non-metric, prefer British over US units: if checked, uses British instead of U.S. gallons, etc.
 * Show both metric and non-metric units: if checked, shows dual units; like "3 meters (10 ft)".
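
The way these four checkboxes could combine can be sketched as follows. This is only an illustration of the descriptions above, not the extension's actual code: the class, the function, and the system names "metric", "us" and "british" are hypothetical, and the handling of "Prefer the article's units" together with the British option is an assumption.

```python
from dataclasses import dataclass


@dataclass
class UnitPrefs:
    """Hypothetical container mirroring the four checkboxes."""
    prefer_input: bool = False    # "Prefer the article's units"
    prefer_metric: bool = True    # "Prefer metric units"
    prefer_british: bool = False  # "For non-metric, prefer British over US units"
    dual_units: bool = False      # "Show both metric and non-metric units"


def choose_systems(prefs, input_system):
    """Return (primary, secondary); secondary is None unless dual units are on."""
    non_metric = "british" if prefs.prefer_british else "us"
    if prefs.prefer_input:
        primary = input_system        # assumption: the article's system wins outright
    elif prefs.prefer_metric:
        primary = "metric"
    else:
        primary = non_metric
    if not prefs.dual_units:
        return primary, None
    # With dual units, the secondary unit comes from the other side of the
    # metric / non-metric divide.
    secondary = non_metric if primary == "metric" else "metric"
    return primary, secondary


if __name__ == "__main__":
    prefs = UnitPrefs(prefer_metric=True, prefer_british=True, dual_units=True)
    print(choose_systems(prefs, "us"))
```

With metric preferred, British selected and dual units on, a quantity entered in US gallons would be shown in litres first and British gallons second, as in the "27.3 litres (6.0 Br gal)" output shown earlier.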

Article authors
For article authors, it's just a case of using the  tag to represent units. This tag has the following syntax:

   value   unit   

The parameters are:
 *  : the numerical value of the quantity to be displayed. Must be in decimal notation.
 *  : the ID of the unit; eg.,  .  Must be one of the unit identifiers listed in the units table.  Appending a " " character to the unit name causes that unit to be used as the primary unit in the output regardless of user preferences; if the user has selected dual units, they will still be displayed.
 *  : (optional) The number of decimal digits to display; defaults to the number of decimals in  . The number of digits is interpreted with regard to the units supplied as input, and is additionally scaled appropriately for the conversion in force; see "Rounding and scaling" below.
 *  : (optional) The preferred realm for this conversion; this is a hint that, for example, we are talking about nautical as opposed to land measurements. See "Realms" below.
 *  : (optional) An alternate unit to display as well as the user-selected units. For example, specify   to ensure that a temperature is displayed in Kelvin as well as Celsius and/or Fahrenheit.

Examples:

Article authors can, of course, state measures in any supported units. However, it is recommended that the most natural units for the subject be used; for example, an article about U.S. vehicle standards should use U.S. gallons. Users who select the "prefer the article's units" mode will then see units relevant to the subject first, followed by any other units they have selected.

Rounding and scaling
Sometimes a unit conversion has the potential to result in loss of information due to mismatched unit magnitudes. For example, an input of 3 mm, if converted with no regard for precision, would produce the output "0 inches (3 mm)": with no decimal places, the result in inches is zero.

To avoid this, we calculate the order-of-magnitude change involved in a conversion, and adjust the number of decimals upwards or downwards accordingly; likewise, we round digits before the point if appropriate. So, for example:
 * 3 mm converts to 0.1 inches, not 0 inches (avoiding loss of precision).
 * 1.000 acre converts to 4,046.9 m² (if you force conversion to m²), not 4,046.856 m² (avoiding spurious precision).
 * 1.000 square mile converts to 2,590,000 m², not 2,589,988 m² (avoiding spurious precision).
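
One way to adjust decimals by the order of magnitude of a conversion is sketched below. It is an illustration only: the rule used here (shift the decimal count by the rounded base-10 logarithm of the conversion factor) and the function names are assumptions, not the extension's documented algorithm. It reproduces the 3 mm and square mile results above, but gives 4,050 m² rather than 4,046.9 m² for the acre, so the extension's real rule evidently differs in detail.

```python
import math


def adjusted_decimals(input_decimals, factor):
    """Shift the decimal count by the conversion's order of magnitude.

    Assumed rule: a factor near 0.01 adds two decimals, a factor near
    1000 removes three (possibly going negative).
    """
    shift = round(math.log10(abs(factor)))
    return input_decimals - shift


def convert(value, input_decimals, factor):
    """Convert value by factor and format it with adjusted precision."""
    decimals = adjusted_decimals(input_decimals, factor)
    converted = value * factor
    if decimals >= 0:
        return f"{converted:,.{decimals}f}"
    # Negative decimals mean rounding digits before the decimal point.
    return f"{round(converted, decimals):,.0f}"


if __name__ == "__main__":
    print(convert(3, 0, 1 / 25.4))         # 3 mm in inches
    print(convert(1.000, 3, 2589988.11))   # 1.000 square mile in square metres
```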

The   parameter to the  tag is interpreted with respect to the input units, and is adjusted for converted units. Hence,  will display in metric as "1.219 m".

Article authors must ensure that the results are useful in all units; hence, it is strongly recommended that authors set the "Show both metric and non-metric units" user preference (see above).

Realms
Some units are associated with particular realms of interest. For example, miles per hour and knots are both non-metric units of speed of roughly similar magnitude, but knots are the ones most commonly used in nautical topics.

Certain units are tagged with their realm of interest, as shown in the units table. Currently, we have the following realms:
 * : subject is nautical; use customary nautical units where appropriate.
 * : the measure relates to quantities of oil (as in crude oil trading); use oil units where appropriate.

An article author can specify the realm of interest using the   parameter; if units with the indicated realm exist, they will take precedence. If no realm is specified, and the input units have an associated realm, that realm will be used, which makes life even easier if the input units are already the preferred ones for the subject. Note that this only selects between units of a given locale; none of this overrides the user's preference for which locale to display.
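
The precedence rules just described can be sketched roughly as follows. The unit rows, the unit IDs, the realm name "nautical" and the function names are all hypothetical stand-ins for the real units table; only the order of precedence is taken from the text.

```python
# Hypothetical excerpt of a units table: (unit id, quantity, locale, realm).
UNITS = [
    ("km/h", "speed", "metric", None),
    ("mph", "speed", "us", None),
    ("kt", "speed", "us", "nautical"),
]


def realm_of(unit_id):
    """Look up the realm tagged on a unit, or None if it has none."""
    for uid, _quantity, _locale, realm in UNITS:
        if uid == unit_id:
            return realm
    return None


def pick_unit(quantity, locale, realm=None, input_unit=None):
    """Pick the display unit for a quantity within an already chosen locale."""
    # With no explicit realm, fall back to the realm of the input unit.
    if realm is None and input_unit is not None:
        realm = realm_of(input_unit)
    candidates = [u for u in UNITS if u[1] == quantity and u[2] == locale]
    # Units tagged with the requested realm take precedence.
    for uid, _q, _l, unit_realm in candidates:
        if realm is not None and unit_realm == realm:
            return uid
    # Otherwise use a general-purpose unit for that locale.
    for uid, _q, _l, unit_realm in candidates:
        if unit_realm is None:
            return uid
    return candidates[0][0] if candidates else None


if __name__ == "__main__":
    print(pick_unit("speed", "us"))
    print(pick_unit("speed", "us", realm="nautical"))
    print(pick_unit("speed", "metric", realm="nautical"))
```

Note that the locale is an input to the lookup, not something the realm can change: asking for the nautical realm in the metric locale still yields a metric unit.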

Some examples will clarify this. For these examples, the user's preferences have selected metric as the primary units, and non-metric as secondary:

Time
Some units, such as for time, don't have metric/Imperial alternatives; the unit of time is a second in all systems. (A minute is an alternative measure of time, but not in terms of metric v. Imperial.) Hence,   simply displays as "5.00 seconds". The only benefit of using  here is that it will generate the right markup, in terms of inserting a non-breaking space, and making the wikilink to the unit definition the first time it is used.

Units
A table of known units can be produced with the  tag, which is just used thus:

This displays the configuration data, including Wiki messages, for all units.

The table of units currently supported can be viewed at Extension:UnitsFormatter/Unit table.

The following wiki system messages are used. These are given default values, but can be overridden by editing them in the MediaWiki: namespace.

These messages configure the preferences GUI:


 * tog-preferinput: the label for the 'Prefer the article's units' preferences checkbox.
 * tog-prefermetric: the label for the 'Prefer metric units' preferences checkbox.
 * tog-preferbritish: the label for the 'Prefer British over US units' preferences checkbox.
 * tog-dualunits: the label for the 'Show both metric and non-metric' preferences checkbox.

This message controls how links to unit definition articles are built:


 * unit-link-prefix: the interwiki prefix which is prepended to all links. This can be left blank, in which case the link "metre", for example, will attempt to link to the local article metre. Most wikis will set this to their interwiki prefix for Wikipedia, eg. " ", in which case the link for "metre" points to Wikipedia's metre article.

Unit Messages
The names and abbreviations displayed for units are configured using wiki system messages. Each unit U has four messages associated with it:


 * unit-U-name: the singular name of the unit.
 * unit-U-names: the plural name of the unit.
 * unit-U-ab: the abbreviation for the unit.
 * unit-U-link: a wikilink to a page providing a definition of the unit.

These messages can be configured to taste by editing the corresponding pages in the MediaWiki: namespace. For example, for the unit "ft", we have:

The link is just the Wiki page name. The content of the system message 'unit-link-prefix' (see above) will be prepended to this; eg. " ".
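
As a small illustration of the naming scheme, the message keys for a unit and its final link target could be derived like this. The function names are hypothetical, and the prefix "wp:" in the example is a made-up placeholder, not a value taken from the extension.

```python
def unit_message_keys(unit_id):
    """Return the four system message names associated with a unit."""
    return {
        "name": f"unit-{unit_id}-name",    # singular name
        "names": f"unit-{unit_id}-names",  # plural name
        "ab": f"unit-{unit_id}-ab",        # abbreviation
        "link": f"unit-{unit_id}-link",    # page defining the unit
    }


def unit_link_target(link_prefix, page_name):
    """Prepend the content of 'unit-link-prefix' to a unit's link page name."""
    return f"{link_prefix}{page_name}"


if __name__ == "__main__":
    print(unit_message_keys("ft"))
    print(unit_link_target("", "metre"))     # blank prefix: local article
    print(unit_link_target("wp:", "metre"))  # hypothetical interwiki prefix
```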

Unit Data
The units themselves are configured in the code, in the  table; the unit names are set up in the   table. Take care to keep both tables in step. The comments in the code explain the formats of the tables.

Other languages
I (JohanTheGhost) apologise for making this extension available only in English. However, adding other languages should be pretty easy.


 * The  table sets up the unit names, abbreviations, and wikilinks; it has one sub-table per language, so just add new sub-tables for new languages.
 * The  table sets up other messages used by the extension; again, just add new sub-tables for new languages.

Compound units
Compound units (eg. metres/second) are currently configured as individual fixed units; ie. there is no support for combining units in a general way. This has been prototyped to some extent, but it gets very complex, and we end up with huge amounts of code just to format units.

Also, it raises new problems of unit equivalence. For example, metres naturally convert to yards; but metres/second should convert to feet/second, which is by far the more common non-metric measure of speeds. There's really no way that a generalised compound unit algorithm could know this, unless rules are programmed for specific unit combinations; and if we're going to do that, we may as well just program the compound units as simple units, which is what we've done.

Another problem would be with unit names. A nautical speed would need to be entered as 'nm/hr', in order to convert to 'km/hr'; this would display as 'nm/hr', or 'nautical mile per hour', where 'kt' or 'knot' is far more appropriate. Again, individual rules could be coded for this, but again we may as well just enter the compound units explicitly, as we have done.

Caching
This extension diminishes the effectiveness of the page cache. To allow different users to see pages with different presentations according to user preferences, pages are cached by name and by a hash of the user's preferences. This extension adds preferences, and hence will increase the number of different versions of each page in the cache, thus requiring more pages to be re-rendered from source.

The effect of this could be reduced by reducing the number of user options.
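
To make the cost concrete, the sketch below builds a cache key from a page name plus a hash of the unit preferences and counts how many distinct keys four on/off options can produce. The key format, the hash choice and the option names (the tog- message names without their prefix) are assumptions for illustration; this is not MediaWiki's actual cache key scheme.

```python
import hashlib
from itertools import product

# Hypothetical option names, derived from the tog-* messages.
PREF_NAMES = ("preferinput", "prefermetric", "preferbritish", "dualunits")


def cache_key(page_name, prefs):
    """Build a cache key from the page name and a hash of the preferences."""
    canonical = ",".join(f"{name}={int(prefs[name])}" for name in sorted(prefs))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{page_name}:{digest}"


def variant_count(page_name):
    """Count the distinct cache keys across every combination of options."""
    keys = {
        cache_key(page_name, dict(zip(PREF_NAMES, combo)))
        for combo in product((False, True), repeat=len(PREF_NAMES))
    }
    return len(keys)


if __name__ == "__main__":
    print(variant_count("Fuel economy"))
```

Each additional on/off option doubles the number of possible cached variants of a page, which is why trimming options reduces the impact.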

User preferences GUI
As mentioned above, the preferences GUI is very crude. A better GUI would use a dedicated "Units" tab with drop-down lists to select primary and secondary units, or a similar arrangement.

Right now there is no hook to allow this kind of GUI; so, for now, we stick with the simple checkboxes. If this extension becomes more mainstream, a better GUI should be devised, probably supported by a new hook in the preferences screen.