Extension talk:InlineEditor/Prototypes

From mediawiki.org

Sentences, lines or paragraphs

(Apologies if this is not the appropriate place for feedback; I couldn't see any suggested location.)

The prototypes look very impressive.

I would, though, like to ask about your choice of sentences as the proposed editing unit.

How will you define sentence extents when periods are often used within sentences to terminate abbreviations and for other purposes?

Would lines not be an easier unit to parse, without greatly increasing confusion for new editors? (I use "line" here to mean wikitext between CR/LF control characters.) I believe that MediaWiki parses bold and italic formatting at line level. Lines also define the semantic units within lists, headings and wikitable markup (lines beginning with ":", ";", "*", "#", "==...==" or "{|"; and wikitable lines beginning with "!" or "|").

One disadvantage would be that an existing sentence can contain hidden CR/LF codes within its wikitext. But it could be argued that this is semantically incorrect, or at least undesirable, even though it is rendered as one unit.

More radically, paragraphs (i.e. wikitext between empty lines) would be intuitive editing units for ordinary prose, because readers are used to seeing text grouped within paragraphs (except in tables, templates, lists, headings and some tags). But this could be more complicated to code, given the inline- and block-level formatting that is currently parsed at line level (as described in the previous paragraph). Of course, in some cases paragraph-level editing would nearly replicate the existing section-level functionality.

Richardguk 23:33, 10 August 2010 (UTC)
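
For illustration only, the line and paragraph units described above could be sketched as follows. This is a minimal sketch, not MediaWiki's parser: the function names `split_paragraphs` and `classify_line` and the labels they return are hypothetical, and the prefix rules are limited to the markup listed in the comment above.

```python
import re

# Prefixes that mark list-like lines, taken from the comment above.
LIST_PREFIXES = (":", ";", "*", "#")


def split_paragraphs(wikitext: str) -> list[str]:
    """Split wikitext into paragraphs, i.e. text between empty lines."""
    parts = re.split(r"(?:\r?\n[ \t]*){2,}", wikitext.strip())
    return [p for p in parts if p]


def classify_line(line: str, in_table: bool = False) -> str:
    """Return a rough label for one line of wikitext (hypothetical labels)."""
    stripped = line.rstrip()
    if stripped.startswith("{|"):
        return "table-start"
    # "!" and "|" only carry table meaning inside a wikitable.
    # Simplification: the table end marker is not distinguished from a cell.
    if in_table and stripped.startswith("!"):
        return "table-header"
    if in_table and stripped.startswith("|"):
        return "table-cell"
    if len(stripped) >= 5 and stripped.startswith("==") and stripped.endswith("=="):
        return "heading"
    if stripped.startswith(LIST_PREFIXES):
        return "list-item"
    return "prose"


if __name__ == "__main__":
    sample = "== Heading ==\nFirst line of prose.\nSecond line.\n\n* item one\n* item two\n"
    for paragraph in split_paragraphs(sample):
        for line in paragraph.splitlines():
            print(classify_line(line), "|", line)
```

Note that in this sketch a prose paragraph written as one long line yields a single "prose" unit, so line-level and paragraph-level editing coincide for it.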

Maybe the mailing list is a better place for this. But this is an excellent suggestion indeed, the best I've got so far, I think. Sentence boundary disambiguation is quite hard and still a topic of research. Lines might be much better, but do keep in mind that a lot of articles (at least as far as I could see) don't split paragraphs into lines. Usually a paragraph is just one long line.
Paragraphs are unacceptable, I think, because there is too much scary wikitext going on within one paragraph, such as references and images. Maybe a paragraph option would be nice for advanced users, but I feel that the sentence/line concept should be there for novice users.
--JanPaul123 12:30, 15 August 2010 (UTC)
Thanks for the feedback.
Multiple sentences of prose, in themselves, aren't a problem, even for basic editors.
Conversely, templates and references are often contained within sentences, so editing at this lower level does not solve the problem.
So maybe the advantages of a sentence-boundary algorithm (which will always fail for some cases) are minimal.
Perhaps a better way to hide complexity in an interface for basic editors would be to display only the output of template and ref code (in a distinctive colour), making the wikitext uneditable except where the editor opts into a more advanced interface.
It's true that most prose paragraphs comprise a single line of wikitext (using the above definitions).
Richardguk 13:58, 15 August 2010 (UTC)
References almost always come after the sentence they support. In my current implementation, references fall outside the sentence definitions.
What you say about making templates and references uneditable is exactly what I'm proposing with the different edit modes.
So I think that we're basically thinking the same thing.
By the way, a sentence-boundary algorithm and line breaking can go hand in hand. I can make a line break (or a double space, which is occasionally used too) force a sentence boundary.
--JanPaul123 15:19, 15 August 2010 (UTC)
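
As a rough illustration of the combination discussed above, the sketch below forces a boundary at every line break or run of two or more spaces, then looks for sentence-ending punctuation within each chunk, keeping any `<ref>` tags that directly follow the punctuation outside the sentence text. It is not the InlineEditor implementation: the function name `split_sentences`, the tiny abbreviation list and the regular expressions are all assumptions, and the heuristic will fail on some input (for example, full stops inside a reference placed mid-sentence).

```python
import re

# Hypothetical, deliberately tiny abbreviation list; a real one would be far longer.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "Dr.", "Mr.", "Mrs."}

# Forced boundary: a line break, or a run of two or more spaces.
FORCED_BREAK = re.compile(r"\r?\n| {2,}")

# A self-closing <ref ... /> tag or a <ref ...>...</ref> pair.
REF = r"(?:<ref\b[^>]*/>|<ref\b[^>]*>.*?</ref>)"

# Candidate boundary: punctuation, optional trailing refs, then whitespace or end.
CANDIDATE = re.compile(r"([.!?])(" + REF + r"*)(\s+|$)")


def split_sentences(wikitext: str) -> list[tuple[str, str]]:
    """Return (sentence, trailing_refs) pairs; refs stay outside the sentence."""
    result = []
    for chunk in FORCED_BREAK.split(wikitext):
        chunk = chunk.strip()
        if not chunk:
            continue
        start = 0
        for match in CANDIDATE.finditer(chunk):
            candidate = chunk[start:match.end(1)]
            last_word = candidate.split()[-1].lstrip("([\"'")
            if last_word in ABBREVIATIONS:
                continue  # a full stop ending an abbreviation is not a boundary
            result.append((candidate.strip(), match.group(2)))
            start = match.end()
        remainder = chunk[start:].strip()
        if remainder:
            # Text before a forced break counts as a sentence even without punctuation.
            result.append((remainder, ""))
    return result


if __name__ == "__main__":
    text = "Dr. Smith wrote it.<ref>Smith 2010</ref> It is used, e.g. in lists\nA new line starts here"
    for sentence, refs in split_sentences(text):
        print(repr(sentence), repr(refs))
```

Returning the references separately from the sentence text is one way to let a basic edit mode show only the prose as editable.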