Parsoid/Language conversion/Preprocessor fixups/Edit logbook

From mediawiki.org

This page (b)logs the editing process for the page lists that were made on 20 March 2017. It is fluid: it records edits done and developing insight. It is more flexible than its parent page, which should be more stable for future reference and documentation. Posts might be signed. -DePiep (talk) 17:16, 15 April 2017 (UTC)

The May 1 list

Parsoid/Language conversion/Preprocessor fixups/20170501

The March 20 list

Outdated; newer list: 2017-05-01

On March 20, 2017, sixteen wikis were scanned for "-{" code in source text. They were listed per wiki. These are the pages (articles and non-articles) that might need to be fixed.

(Number of pages may be incorrect, +/- 10%)
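As an illustration only, a scan of this kind could be sketched as follows. The function name `find_dash_brace` is hypothetical, and this is not the script that produced the lists; it only shows what "scanned for -{ in source text" might look like for a single page.

```python
def find_dash_brace(wikitext):
    """Return (line_number, column) for every "-{" found in the text.

    Line numbers start at 1, columns at 0.
    """
    hits = []
    for lineno, line in enumerate(wikitext.splitlines(), start=1):
        col = line.find("-{")
        while col != -1:
            hits.append((lineno, col))
            col = line.find("-{", col + 1)
    return hits


if __name__ == "__main__":
    sample = "1.-{Aulacaspis} spp.\nno match here\n:-{ and |-{"
    for lineno, col in find_dash_brace(sample):
        print(f"line {lineno}, column {col}")
```

A page with an empty result would not be listed; every hit still needs a human decision under the edit rules.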


Edit process

I am:

At home at enwiki I am an editor (TE, no admin).
For the enwiki list I could use my AutoWikiBrowser licence (en:WP:AWB), and of course individual page editing.

I know:

I am maintaining several chemicals templates (with that 'IUPAC name'), and that is why I arrived here.
I have no grasp of the MW side, which is the main issue (Parsoid, LanguageConverter, parsing issues). I am really editing blind here. However, I do note that the name 'LanguageConverter' is misleading: Wikidata should do that! It is more like a script converter ;-).

I do:

Only listed pages are approached.
Learning: Questions are gathered (e.g. what to do with .js, module pages)
All edits are manual (by individual page), or by AWB (check-before-save). No bot.
AWB is not a bot: I must check each edit individually, and I do. Some checks are a glance, and some require researching the edit. This may still cause mistakes, at a typo level. Such an error, once past my own check, might be hard to find.
enwiki
First runs (500 P), got the issue, met questions
Other lang wikis
Low numbers: did per-page edit (sv, vi, war)
Higher numbers: will try to get AWB access for these lang-wikis, per local wiki.
zhwiki: the original list gives numbers, but today the page is redlinked. No action by me.
met more questions
Sister projects (like wikiquote)
mw is listed, will approach that one.
Other sister project: not listed, no action

Edit Rules

This set of rules is developing through editing experience (April/May 2017).

Edit Rules (or guidelines) as applied:

Edits done in the listed pages
  • Change -{ into -<nowiki/>{
In chemical names. Mostly IUPAC names and similar; could be 75% of all affected pages.
When it is not a balanced full pair -{...}- (the closing hyphen is missing), so do edit. Example: en:Bulgarian language has "-{ost/est}". Warning: articles about language could very well contain intended language converter code; for these, no edit.
In species description. Example: Oloo, G.W. (1975) Sugarcane. 1.-{Aulacaspis} spp. and other scales. (and unbalanced).
In module documentation pages. [[Module:.../doc]] pages are in Module namespace (so expect Lua code), but /doc pages have regular wikitext by wiki setup.
In language construct descriptions. Example: "[[{NounRoot}- ___ -{PosSuffix}", in en:Heuristic evaluation (and unbalanced).
Emoticon by character, like :-{
When used as show-the-template trick: {-<nowiki/>{Harvnb}}
  • -{ in url: change into %2D%7B (see en:Alan Turing) (see also discussion)
  • Removed when typo (for example |-{\n in wikitable pipe code)
  • Not restricted, do edit:
(see exception): When in a static archive or log page (mostly from before 2010). Keeping the page unbroken trumps static-ness. The edit should not, and may not, change the content. (So far, in enwiki and dewiki.)
Except: when a log page is intended for automated reading (? no example found).
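The main rule above (change -{ into -<nowiki/>{) can be sketched roughly as below. This is a heuristic under stated assumptions, not the AWB setup actually used: the name `escape_dash_brace` is hypothetical, a "balanced pair" is assumed to be -{...}- on one line without nested braces, and balanced pairs are only collected for manual review because they can be either intended converter code or part of a chemical name.

```python
import re

# Assumption: a balanced pair is "-{", no braces or newlines, then "}-".
BALANCED = re.compile(r"-\{[^{}\n]*\}-")


def escape_dash_brace(wikitext):
    """Escape unbalanced "-{" and collect balanced pairs for review.

    Returns (new_text, review), where review lists the balanced
    "-{...}-" pairs that were left untouched for a human to check.
    """
    out = []
    review = []
    pos = 0
    while True:
        start = wikitext.find("-{", pos)
        if start == -1:
            out.append(wikitext[pos:])
            break
        pair = BALANCED.match(wikitext, start)
        if pair:
            review.append(pair.group(0))
            # Stop before the closing hyphen so that it can still
            # start a following "-{" sequence.
            out.append(wikitext[pos:pair.end() - 1])
            pos = pair.end() - 1
        else:
            out.append(wikitext[pos:start] + "-<nowiki/>{")
            pos = start + 2
    return "".join(out), review


if __name__ == "__main__":
    text, review = escape_dash_brace("1.-{Aulacaspis} spp. and -{zh-hans:a}-")
    print(text)
    print(review)
```

Returning the review list instead of editing balanced pairs mirrors the check-before-save approach: the sketch never decides on its own whether a full pair is converter markup.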


Pending (rules to be looked at)
  • Protected page (example vi:Bản mẫu:Convert/Dual/LoffAoffDxSoffT) -- needs editrequest + aftercheck.
  • When in filename (imagename): "File:{Subject name in English} relief location map-{language}.svg". Could be both in wikitext or wl. (see example, could not test)
  • To be checked: in parsed Convert code (in multiple wikis). Consider using brackets in: #expr: -() ..) etc. (see af:Sjabloon:Omreken/Duaal/KafAafVxEafT; marked 'issue' in the list).
Not edited
  • No edit when it is an intended language converter construct. Expect this in bi-script wikis like zhwiki. (To expand: list the bi-script wikis? Note that bi-language often means bi-script, as with Serbian.)
  • No edit when in ns=Module (Lua code); but did edit when .../doc page.
Tricky situation: "--{par="width", ..." is a comment line in Lua. Do not edit.
Note R
These rules might need refinement, to select the true positives for editing. That selection may only be possible when seeing the page.
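The Module namespace rule above can be expressed as a small filter. This is a sketch only: the name `should_edit` is hypothetical, and it assumes the English namespace prefix "Module:" (other wikis may use a localized prefix). Open questions such as .js pages are not covered.

```python
def should_edit(title):
    """Return True when a page title falls under the 'do edit' rules.

    Pages in the Module namespace hold Lua code and are skipped,
    except their /doc subpages, which hold regular wikitext.
    """
    if title.startswith("Module:"):
        return title.endswith("/doc")
    return True


if __name__ == "__main__":
    for title in ("Module:Convert", "Module:Convert/doc", "Alan Turing"):
        print(title, should_edit(title))
```

Skipping the whole Lua page also sidesteps the "--{" comment case, since no line-level check is needed for pages that are never edited.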


Discussion

Better not %2D{, because the opening curly bracket is a reserved character in en:CS1 and en:CS2 [1]
Note that Help:CS1 says that the { should be escaped when it occurs in urls used in citation template parameters. So, only when a "-{" occurs in a url in a citation template parameter do you need to escape both of those characters. But if the "-{" occurs in a url outside of that, it is sufficient to just escape "-". SSastry (WMF) (talk) 18:05, 11 May 2017 (UTC)
Would it hurt or break anything if I did it everywhere for the -{ sequence in a url? Any advice to restrain outside of enwiki? SSastry (WMF) -DePiep (talk) 18:09, 11 May 2017 (UTC)
No, it is safe to escape both always. I was just clarifying the guideline. SSastry (WMF) (talk) 18:12, 11 May 2017 (UTC)
If { is unsafe in citation template parameters, you should never find a -{ to begin with, right? It should already be -%7B, which should be fine as-is. cscott (talk) 18:23, 11 May 2017 (UTC)
Cscott said it right: "should not find". I don't know how or if the enwiki CS-people have cleaned this up yet (they take errors in batches, by categorising etc.). Did find them in the Alan Turing example. Not an issue, I pick it up walking. -DePiep (talk) 19:17, 11 May 2017 (UTC)
User:DePiep, then -%7B in urls should handle both language variant and CS1/2 scenarios. SSastry (WMF) (talk) 12:30, 12 May 2017 (UTC)
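A minimal sketch of the url escaping discussed above, assuming a simple pattern for bare http/https urls in wikitext. The name `escape_urls` and the url pattern are assumptions made for illustration; real wikitext url detection is more involved. By default only the brace is escaped (-%7B), and the optional flag gives the %2D%7B form from the edit rules.

```python
import re

# Assumption: a url runs until whitespace or one of [ ] < > | "
URL = re.compile(r'https?://[^\s\[\]<>|"]+')


def escape_urls(wikitext, escape_hyphen=False):
    """Escape every "-{" that occurs inside a url.

    Text outside urls is returned unchanged.
    """
    replacement = "%2D%7B" if escape_hyphen else "-%7B"

    def fix(match):
        return match.group(0).replace("-{", replacement)

    return URL.sub(fix, wikitext)


if __name__ == "__main__":
    sample = "[http://example.org/a-{b} label] and x-{y outside a url"
    print(escape_urls(sample))
    print(escape_urls(sample, escape_hyphen=True))
```

The "-{" outside the url is deliberately left alone here, since it falls under the <nowiki/> rule rather than the url rule.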

Other advice

If you don't want to use <nowiki />, here are recommendations by User:Amire80:

  • Replace some formulas with <math> or <chem>.
  • Replace some minus characters with the ־ character, which is unique to the Hebrew alphabet (sorry, other alphabets). It should be used instead of a minus in those contexts anyway.
  • Add the {{Hyphen}} template, already available in English. It inserts a hyphen-minus as &#45;. It's a hack, but not worse than <nowiki />.

grep notes

Same class: template w:be-x-old:Шаблён:GeoTemplate has the url issue, and is listed in "n"on-articles (not "u").
Learned that this is something to keep in mind: not all url-issues are listed on the "u"-page. No change needed. -DePiep (talk) 08:19, 19 May 2017 (UTC)

Future