Tablesorter/Improved parsers

From mediawiki.org

In December 2015 I (MatthiasDD) started working on tasks T29745 (References in column affect sorting) and T46818 (jquery.tablesorter should sort plain year digits as date). Later I took on some other tasks connected with correct parser detection.

I now propose changing the parser detection of sortable tables for all types as follows:

  1. References (class='reference') are removed from the sort value.
  2. Text before the sort value is allowed.
  3. Text after the sort value is allowed.
  4. A plain number of 1-4 digits can be detected as number, date, or isoDate, depending on the other content in the column.

So "about 1870[1]" would be detected as a year without the use of templates.

Some additional improvements in parser detection are described below for each parser.
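The four rules above can be sketched on plain text as follows. This is only an illustration: the real tablesorter works on DOM cells and removes elements with class='reference', whereas this sketch assumes the references appear as bracketed markers such as "[1]". The helper names stripReferences and extractPlainYear are hypothetical.

```javascript
// Hypothetical helper: drop footnote markers such as "[1]" or "[Ref 2]".
// Assumption: references show up as bracketed text, not as DOM elements.
function stripReferences(text) {
  return text.replace(/\[[^\]]*\]/g, '').trim();
}

// Hypothetical helper: find a plain number of 1-4 digits, allowing
// arbitrary text before and after it. Returns null if there is none.
function extractPlainYear(text) {
  const match = /(?:^|[^\d])(\d{1,4})(?!\d)/.exec(stripReferences(text));
  return match ? Number(match[1]) : null;
}

console.log(extractPlainYear('about 1870[1]'));
```

Whether such a value is then treated as number, date, or isoDate would still depend on the rest of the column, which this sketch does not model.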

Test script

To test without changing the actual tablesorter.js, you can add the following to your custom JavaScript:

mw.loader.load("//de.wikipedia.org/w/index.php?title=Benutzer:MatthiasDD/ts_test.js&action=raw&ctype=text/javascript");

After the page loads, the current parsers are used.

Ctrl + right click on a table header activates the current tablesorter parsers and switches the table into my diagnosis mode.
A plain right click on a table header activates the newly proposed parsers described here. It also switches the table into my diagnosis mode and shows all sort values of that row in a message box.

Diagnosis mode
Colors the background of each cell according to the detected parser and writes a title attribute for each cell naming the parser used. If the detected parser and the used parser differ, the detected parser is shown in brackets, with the used parser on the second line.

Details for each parser

IPAddress

Adds support for the IP/CIDR format (solves phab:T36475).

IPAddress 1 | IPAddress 2
45.238.27.109/32 | 111.255.333.444
45.238.27.109/8[2][3] | 1.202.203.204
45.238.27.109 [4] | 1.022.033.044
usual 204.1.132.158/24[remark 1] | 1.2.3.4
a: 204.38.0.0/24 | 1.202.203.4
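One way to derive a sortable key from cells like these is to zero-pad the octets and append the prefix length. This is a sketch only; the function name ipSortKey and the choice to treat a missing prefix as "00" are assumptions, not the behaviour of the actual parser.

```javascript
// Hypothetical sort key for "a.b.c.d" or "a.b.c.d/nn", with text allowed
// before and after the address. Octets are not range-checked here.
function ipSortKey(text) {
  const m = /(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?:\/(\d{1,2}))?/.exec(text);
  if (!m) {
    return null;
  }
  const octets = m.slice(1, 5).map((octet) => octet.padStart(3, '0'));
  // Assumption: an address without a CIDR suffix sorts as prefix "00".
  const prefix = (m[5] === undefined ? '' : m[5]).padStart(2, '0');
  return octets.join('.') + '/' + prefix;
}

console.log(ipSortKey('a: 204.38.0.0/24'));
console.log(ipSortKey('45.238.27.109/8[2][3]'));
```

Because every part has a fixed width, a plain string comparison of the keys orders the addresses numerically.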

currency

Because text before and after the sort value is allowed, currency values are now handled by the number parser. Any other currency characters or text are possible. This parser can be deleted.

url

This parser has not worked since 2011-04-14 because the RegExp was /^(https?|ftp|file):\/\/$/. The $ means the input must end with ://, which is never the end of a URL. I have changed this, but I would say this parser can be deleted. See T47161 (Kill all non-trivial parsers in $.tablesorter).
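The effect of the trailing $ can be checked directly. The first pattern is the one quoted above; the second simply drops the $ and is only an assumed fix for illustration, since the actual change is not spelled out here.

```javascript
// Pattern quoted in the text: only matches strings that END with "://".
const oldUrlPattern = /^(https?|ftp|file):\/\/$/;

// Assumed correction for illustration: match the scheme prefix only.
const relaxedUrlPattern = /^(https?|ftp|file):\/\//;

const sample = 'https://www.mediawiki.org/';
console.log(oldUrlPattern.test(sample));     // false
console.log(oldUrlPattern.test('https://')); // true
console.log(relaxedUrlPattern.test(sample)); // true
```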

isoDate

Effective since MediaWiki 1.30/wmf.14 (Gerrit change 287449).
  • A time without Z was parsed as local time; that was wrong, and it is now parsed as UTC.
  • Years 0...99 are now handled.
  • Short forms are possible: YYYY and YYYY-MM; and, only with data-sort-type="isoDate": YYYYMM and YYYYMMDD.

"set" in a column header means that the parser is set with data-sort-type="isoDate".
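The first two points can be sketched as below. This is not the code from the Gerrit change; the function name isoToUtcMillis is hypothetical, and the sketch covers only the dashed forms (YYYY, YYYY-MM, YYYY-MM-DD with an optional hh:mm[:ss] and an optional Z or ±hh:mm offset), not the compact forms or fractional seconds.

```javascript
// Hypothetical sketch: parse a dashed ISO-like date as UTC unless an
// explicit offset is given. Returns milliseconds since the epoch, or NaN.
function isoToUtcMillis(text) {
  const pattern = /^([+-]?\d{1,4})(?:-(\d{2})(?:-(\d{2}))?)?(?:[T ](\d{2}):(\d{2})(?::(\d{2}))?)?(Z|[+-]\d{2}:\d{2})?$/;
  const m = pattern.exec(text.trim());
  if (!m) {
    return NaN;
  }
  const date = new Date(0);
  // setUTCFullYear takes the year literally, so 71 stays year 71
  // (Date.UTC would map 0...99 to 1900...1999).
  date.setUTCFullYear(Number(m[1]), m[2] ? Number(m[2]) - 1 : 0, m[3] ? Number(m[3]) : 1);
  date.setUTCHours(m[4] ? Number(m[4]) : 0, m[5] ? Number(m[5]) : 0, m[6] ? Number(m[6]) : 0, 0);
  let millis = date.getTime();
  if (m[7] && m[7] !== 'Z') {
    // An offset of +05:00 means local time is 5 hours ahead of UTC.
    const sign = m[7][0] === '-' ? -1 : 1;
    const [hours, minutes] = m[7].slice(1).split(':').map(Number);
    millis -= sign * (hours * 60 + minutes) * 60000;
  }
  return millis;
}

console.log(isoToUtcMillis('1970-01-23 03:20') === isoToUtcMillis('1970-01-23 03:20Z'));
```

Building the date with the UTC setters, rather than passing the string to the Date constructor, avoids both the local-time interpretation and the two-digit-year mapping.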

isoDate 1 | set isoDate 2 | isoDate 3 | isoDate 4
71-01 [5] | 197001 | 0007-07 | 2016-05-01 15:45:12.1
1970-01-23 03:20Z[6] | 19700123 0320Z | -8-08-08 | 2016-05-01 15:45:12.011
1970-01-23T03:20+05:00[7] | 19700123T0320-0500 | +9999-12 | 2016-05-01 15:45:12.11
1970[8] | 1970 | -9999 | 2000-01-01T12:30:30-23:59
1970-01-23 03:20:00,111[9] | 19700123 03:20:00.111 | +60-06T10:00:00-02 | 2000-01-01T12:30:30+23:59

usLongDate

Do we really need this parser? In my opinion, it should be removed later.

date

  • Inside a RegExp character class '[]' (ECMAScript ClassAtomNoDash), only the characters \, ], and - must be escaped.
  • A non-breaking space (\xa0) is allowed as a (single or additional) date separator.
  • For a written month name (m), the following forms are possible: dm, dmy, m, md, my, mdy.
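The first two points can be illustrated with a small separator class. The class below is an assumption chosen for the example, not the separator class actually used by the date parser, and splitDateParts is a hypothetical helper.

```javascript
// Inside [...] only \, ] and - need a backslash; "." and "," can stand
// as they are. \xa0 is the non-breaking space.
// Assumption: this particular set of separators is illustrative only.
const dateSeparator = /[.,\xa0 \-]+/;

// Hypothetical helper: split a date-like string into its parts,
// dropping empty parts caused by leading or trailing separators.
function splitDateParts(text) {
  return text.split(dateSeparator).filter((part) => part !== '');
}

console.log(splitDateParts('10.\xa01.\xa02000'));
console.log(splitDateParts('Jan. 10 2000'));
```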

A year on its own (1-4 digits) can be detected as a date. During parser detection for a column, such a cell is set aside like an empty cell; if 5 other cells containing a date are then found, or data-sort-type="date" is set in the table header, the date parser is used.
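The detection rule for year-only cells might look roughly like this. It is a sketch under assumptions: columnUsesDateParser and its isFullDate callback are hypothetical names, and what happens when fewer than 5 confirming cells exist is a guess (here the column is then not treated as date).

```javascript
// Hypothetical sketch of the column detection rule described above.
// cells: array of cell texts; isFullDate: predicate for an unambiguous date;
// headerSortType: value of data-sort-type on the header, if any.
function columnUsesDateParser(cells, isFullDate, headerSortType) {
  if (headerSortType === 'date') {
    return true; // explicitly set in the table header
  }
  const yearOnly = /^\d{1,4}$/;
  let confirmed = 0;
  for (const cell of cells) {
    const text = cell.trim();
    if (text === '' || yearOnly.test(text)) {
      continue; // set aside, like an empty cell
    }
    if (!isFullDate(text)) {
      return false; // some other kind of content
    }
    confirmed += 1;
    if (confirmed >= 5) {
      return true; // 5 other cells with a date were found
    }
  }
  return false; // assumption: too few confirming cells
}

// Illustrative predicate for the d.m.yyyy form only.
const isDottedDate = (text) => /^\d{1,2}\.\d{1,2}\.\d{4}$/.test(text);
console.log(columnUsesDateParser(['2000', '2015'], isDottedDate, 'date'));
```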

date 1 | date 2 | date 3 | Month and day | date 5
2000 | 1. 1.00 | 1. Jan. 2000 | 1. Jan. | Januar, 1 2000
2015[10][11] | 10.1.2000[12] | 10. Jan. 2000[13] | 10. Jan.[14] | Jan. 10 2000
about 2010 [15] | about 2. 1. 2000 [16] | Jan. 2000 [17] | Jan. 2. [18] | 01 22 2000
ca. 2020[Ref 1] | ~ 2. 1. 2000[Ref 2] | ca. Jan. 2000[Ref 3] | ca. Jan.10.[19] | 5.12.1990
ca.2030[20][21] | ~ 2000[22] | ca.2000[19] | Jan | December 12 '10
  • The following cells should not be detected as dates:
name
Maier, Franz! Franz Maier
Metzujan
Jan Hofer
Mezelf14

time

Extended to the format hhhh:mm:ss[.,]ssss ("932:20" was sorted as 947968500000 in Firefox 40 and as 0 in Internet Explorer 8); see TOI (Time On Ice).
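A value in this extended format can be turned into a single number without going through the browser's date parsing, which is where the inconsistent results quoted above came from. The sketch below is an assumption about one possible approach; durationToMillis is a hypothetical name, am/pm suffixes are ignored, and the fraction is rounded to whole milliseconds.

```javascript
// Hypothetical sketch: convert "hhhh:mm[:ss[.,]ssss]" to milliseconds,
// allowing text before and after the value. Returns NaN if nothing matches.
function durationToMillis(text) {
  const m = /(\d{1,4}):(\d{2})(?::(\d{2})(?:[.,](\d{1,4}))?)?/.exec(text);
  if (!m) {
    return NaN;
  }
  const hours = Number(m[1]);
  const minutes = Number(m[2]);
  const seconds = m[3] ? Number(m[3]) : 0;
  // Fraction of a second, with "." or "," as decimal separator.
  const fractionMs = m[4] ? Math.round(Number('0.' + m[4]) * 1000) : 0;
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + fractionMs;
}

console.log(durationToMillis('932:20'));      // 3356400000
console.log(durationToMillis('9:59:59,999')); // 35999999
```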

time 1 time 2 time 3
9:59:59 9:59:59,999 00:43 clock
ca. 8:00 pm[23] 9:59:61 00:26 Uhr
~ 20:00:00,001 [24] 9:59 932:20
8760:00[Ref 4] 0:00:00
10:00:00.5 am 10:00:00.01

number

  • digitTransformTable is used for both standard and scientific format.
  • Text before the number can contain any character except -, +, −, and digits (solves phab:T65055).
  • Numbers can contain spaces (such as &#x20;) and '.
  • Infinity is added as [+-−]∞.
  • Scientific notation is possible; a number must precede [·×⋅]10^? (see fa:نماد_علمی, Persian: Scientific notation).
  • Empty cells are sorted at the end, with Number.MAX_VALUE (only ∞ is larger).
  • Cells with text are sorted after the number 10000 (solves phab:T123364).

"set" in a column header means that the parser is set with data-sort-type="number".
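Several of these rules can be sketched in a few lines. This is an illustration under assumptions, not the proposed parser: numberSortValue is a hypothetical name, references are assumed to appear as bracketed markers, only the "e" form of scientific notation is handled (not [·×⋅]10), and digitTransformTable and the rule for text-only cells are left out (such cells return NaN here).

```javascript
// Hypothetical sketch of a numeric sort value for a cell's text.
function numberSortValue(text) {
  // Assumption: references appear as bracketed markers like "[30]".
  const cleaned = text.replace(/\[[^\]]*\]/g, '').trim();
  if (cleaned === '') {
    return Number.MAX_VALUE; // empty cells sort at the end
  }
  const infinity = /([+\-−]?)∞/.exec(cleaned);
  if (infinity) {
    return infinity[1] === '' || infinity[1] === '+' ? Infinity : -Infinity;
  }
  // Optional sign, digits that may contain spaces or ', optional decimal
  // part, optional exponent. Any text may come before or after.
  const m = /([+\-−]?)(\d[\d ']*(?:\.\d+)?)(?:e([+\-−]?\d+))?/i.exec(cleaned);
  if (!m) {
    return NaN; // text-only cell: not covered by this sketch
  }
  const sign = m[1] === '' || m[1] === '+' ? '' : '-';
  const digits = m[2].replace(/[ ']/g, '');
  const exponent = m[3] ? m[3].replace('−', '-') : '0';
  return Number(sign + digits + 'e' + exponent);
}

console.log(numberSortValue("3 0 0 0'0"));
console.log(numberSortValue('−∞[32]'));
```

Assembling the value as a string and converting it once avoids the rounding that multiplying by a power of ten could introduce.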

all == 1 number 2 number 3 scientific 4 set number 5 text
1. place ~ 20 € 3 0 0 0'0 5.1 · 101 9999 abc[25]
100e-2[26] ~ 20.5€ -5e-323 5.0 × 101 4 ab[27][28]
1 apple $3[29] ca. ∞[30] 3.9 × 10 -3 A ab[31] c
about 1 about 1 m −∞[32] ۳,۹ × ۱۰−۳ ##[33] e09
10 ⋅ 10-1 1.1¢ 1.79e308 # e-09
References
  1. These references only demonstrate how sorting works
  2. 2
  3. 3
  4. 4
  5. 1
  6. 2
  7. 3
  8. 4
  9. 5
  10. 1
  11. 1
  12. 1
  13. 1
  14. 1
  15. 1
  16. 1
  17. 1
  18. 1
  19. named Reference R1
  20. 1
  21. 4
  22. 1
  23. 1
  24. 2
  25. 4
  26. 10
  27. 12
  28. 13
  29. 1
  30. 1
  31. x
  32. 4
  33. 1
  1. Group Ref
  2. 1
  3. 1
  4. 1
  1. Remark