Topic on VisualEditor/Feedback

Another reference error

Mx. Granger (talkcontribs)

See also this diff – looks like possibly a similar problem. Based on context, it looks like the editor tried to copy the ref named ":0", but VisualEditor added a "2" to create the ref name ":02".

Mx. Granger (talkcontribs)

Here too - it looks like the refs named ":62", ":622", ":623", and ":624" are all intended to be the same reference.

Matma Rex (talkcontribs)

Thank you for the report with examples, they are very helpful. I will have a look next week.

Matma Rex (talkcontribs)

I haven't gotten around to this, but I filed it as T312050 so that it doesn't get lost.

Mx. Granger (talkcontribs)

Not sure if this edit is the same issue, but it also introduced a bunch of undefined references by adding "2" to ref names:

FeRDNYC (talkcontribs)

@Mx. Granger: I suspect the issue there is that the references changed were previously "upside down": The named ref was invoked before it was defined. The first use of e.g. the "GadCast" ref was <ref name="GadCast"/>, and then it was defined in full the second time it was used.

So, when the V.E. saw a <ref name="GadCast"/> in the edited section, it saw it as a new, not-previously-defined reference (because there was no definition for it, up to that point), and noticed that the name conflicted with the definition of <ref name="GadCast" …> that appeared later in the article, so it applied automated corrections to the name.

Invoking references before defining them does work in MediaWiki, though it's definitely atypical page structure and I'm not sure where it falls in terms of "officially supported" syntax. But it seems like either VE needs to learn to support that ordering without freaking out, or invoking a reference before defining it needs to be flagged as a syntax error and tracked so that the (possibly many) articles where it's currently done can be corrected, perhaps by an industrious bot. (Though I suppose a bot could correct those without them being made an error, really.)
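For illustration, the first step such a bot would need is finding named refs that are invoked before they are defined. The following is only a minimal sketch: the function name `refs_used_before_defined` and the regex are assumptions made for this example, it only handles double-quoted names, and it is not based on any existing bot or on MediaWiki's parser.

```python
import re

# Matches <ref name="x" /> (reuse) and <ref name="x"> (start of a definition).
# Simplification: only double-quoted names are recognised.
REF_TAG = re.compile(
    r'<ref\s+name\s*=\s*"(?P<name>[^"]+)"[^>]*?(?P<selfclose>/)?>',
    re.IGNORECASE,
)


def refs_used_before_defined(wikitext):
    """Return the names whose first reuse appears before their first definition."""
    first_use = {}
    first_def = {}
    for match in REF_TAG.finditer(wikitext):
        # Self-closing tags are reuses; everything else opens a definition.
        target = first_use if match.group("selfclose") else first_def
        target.setdefault(match.group("name"), match.start())
    return sorted(
        name
        for name, position in first_use.items()
        if name in first_def and position < first_def[name]
    )


sample = (
    'Intro.<ref name="GadCast"/> Later.<ref name="GadCast">Full citation</ref> '
    'More.<ref name="Other">Another citation</ref> Again.<ref name="Other"/>'
)
print(refs_used_before_defined(sample))
```

Names that are reused but never defined anywhere are deliberately left out, since that is a different error.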

Mx. Granger (talkcontribs)

@FeRDNYC: Thanks for the sleuthing on this. Many en.wikipedia articles invoke references before defining them, and in some cases this is the only feasible solution I know of to make a transclusion work correctly (specifically, this structure is needed when one page transcludes a section of another page that includes a reference also used earlier in the second page). I think VE needs to be fixed to support this ordering.

Matma Rex (talkcontribs)

Using a reference before it is defined is definitely 100% officially supported, and VE supports editing those references as well. It's not impossible that there are bugs, but I just tried some simple things and they work just fine.


My best guess (I wrote it down on Phabricator: T312050#8082365 – I didn't think to copy it back to here, sorry) is that this happens because users copy-paste the wikitext for reusing a reference (<ref name="x" />) into VE, after editing the article in their sandbox, or from another article.
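To make the reported symptom concrete, here is a rough sketch of a renaming rule that would produce the names seen in this thread (":0" becoming ":02", and ":62" becoming ":622", ":623", ":624"). This is a guess inferred from those examples, not VisualEditor's actual code; `rename_on_conflict` and `paste_reuses` are hypothetical names.

```python
def rename_on_conflict(name, taken):
    """Return `name` if it is free, else append the first free counter, starting at 2."""
    if name not in taken:
        return name
    counter = 2
    while f"{name}{counter}" in taken:
        counter += 1
    return f"{name}{counter}"


def paste_reuses(pasted_names, existing_names):
    """Treat every pasted <ref name="x" /> as a brand-new ref and rename on conflict."""
    taken = set(existing_names)
    result = []
    for name in pasted_names:
        new_name = rename_on_conflict(name, taken)
        taken.add(new_name)
        result.append(new_name)
    return result


print(paste_reuses([":0"], {":0"}))
print(paste_reuses([":62", ":62", ":62"], {":62"}))
```

Under this assumed rule, each renamed reuse points at a name that has no definition, which is why the page ends up with undefined reference errors.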

Mx. Granger (talkcontribs)

@Matma Rex Thanks for the update. That explanation makes sense, and I can see the problem would be tricky to prevent. (I'll try to remember to check for updates on Phabricator next time.)

Matma Rex (talkcontribs)

I had a closer look at the latest example.

In this diff, the user was just trying to remove the {{Cast listing|…}} template wrapper, in addition to the other changes. The only way to do it is to copy-paste the wikitext out of the template. When I do this, I get the same diff with changed ref names that they got: https://phabricator.wikimedia.org/F35421710 (screencast).

I think that confirms my theory, but it also suggests that my proposed fix (just removing the broken refs) is not a good idea. I guess we should try to match the names to the existing ones, rather than filter them out. But this can lead to conflicts when refs are copy-pasted between articles… (particularly with the useless names like :1 generated by the visual editor itself). Maybe that is the lesser evil?

Mx. Granger (talkcontribs)

I'm genuinely not sure which is the lesser evil. As you pointed out, with the generic ref names generated by Visual Editor, trying to match the names could cause a very unfortunate conflict where a reference can end up attached to text that it has nothing to do with (if someone copies a <ref name=":1"/> invocation from one article to another article that already has an unrelated <ref name=":1">...</ref> definition). To be fair, this problem also happens when people copy and paste wikitext in the traditional editor.

A side question: is there any chance the Visual Editor could be changed to generate less generic ref names? I spend a lot of time fixing ref errors on en.wikipedia, and when the errors involve ref names like :1 and :2 that are so widely used across different pages, it can be much harder to untangle what has gone wrong.

Whatamidoing (WMF) (talkcontribs)

Decent automatic names is phab:T92432. This is difficult, because it must be language-agnostic. My favorite proposal so far is extracting some characters from the ref's contents.

A simpler first step could be phab:T52568.

Mx. Granger (talkcontribs)

I'm glad this is being worked on. Extracting characters from the ref's contents sounds reasonable. Nearly anything, even a randomly generated UUID or a hash code, would be better than the current setup, as there wouldn't be so many collisions when copying refs between articles.
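As a sketch of the hash idea only: a name derived from a digest of the ref's contents is language-agnostic, and two refs would only share a name if their contents match (or in the unlikely event of a hash collision). The `ref-` prefix, the 8-character length, and the function name `content_ref_name` are all assumptions for this example, not anything proposed in the Phabricator tasks.

```python
import hashlib


def content_ref_name(ref_content, length=8):
    """Derive a ref name from a hash of the ref's contents (hypothetical scheme)."""
    # Collapse whitespace so trivial formatting differences give the same name.
    normalized = " ".join(ref_content.split())
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return f"ref-{digest[:length]}"


citation = "{{cite web |url=https://example.org |title=Example}}"
print(content_ref_name(citation))
```

The trade-off is that such names are not meaningful to human editors, so this would help with collisions but not with readability of the wikitext.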

Whatamidoing (WMF) (talkcontribs)

Well, it'd probably be more accurate to say that it's being "talked about occasionally". I don't have any real hope of this work getting done this calendar year.

Mx. Granger (talkcontribs)

After thinking about it some more, I think the current behavior may be the least bad option. With this behavior, an error is generated (putting the article into wikipedia:Category:Pages with broken reference names) and no information is lost, so as long as editors understand the issue, they can resolve it.

In contrast, if we try to match the ref names, we're vulnerable to the serious, silent problem with colliding ref names as mentioned above. If we remove the refs, information is lost so that it's not obvious the text was ever sourced. But I don't know, maybe there's some other solution that would avoid all of these problems.

In the case of the cast listing template – what if Visual Editor adds some kind of invisible control characters identifying the article when copying wikitext from a template within Visual Editor? Then when pasting the text into Visual Editor, it could check the control characters to see whether the text comes from the same article, then try to match the refs if it's the same article, and do the current behavior (adding a 2 to the ref name) if it's not the same article.
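A rough sketch of that decision logic, with the invisible marker modelled as a plain `source_page` field on a hypothetical `ClipboardPayload` object. How such a marker would actually be encoded in copied text is left open, and none of these names correspond to anything in VisualEditor.

```python
from dataclasses import dataclass


@dataclass
class ClipboardPayload:
    source_page: str  # hypothetical marker identifying where the text was copied from
    ref_names: list   # names of the <ref name="..." /> reuses in the copied wikitext


def resolve_pasted_refs(payload, target_page, existing_names):
    """Map each pasted ref name to the name it should have after pasting."""
    taken = set(existing_names)
    resolved = {}
    for name in payload.ref_names:
        if name in resolved:
            continue
        if name not in taken:
            # No conflict: keep the name as pasted.
            taken.add(name)
            resolved[name] = name
        elif payload.source_page == target_page:
            # Same article: assume the reuse refers to the existing definition.
            resolved[name] = name
        else:
            # Different article: fall back to renaming, as happens today.
            counter = 2
            while f"{name}{counter}" in taken:
                counter += 1
            resolved[name] = f"{name}{counter}"
            taken.add(resolved[name])
    return resolved


same = ClipboardPayload("Article A", ["GadCast"])
other = ClipboardPayload("Article B", [":1"])
print(resolve_pasted_refs(same, "Article A", {"GadCast"}))
print(resolve_pasted_refs(other, "Article A", {":1"}))
```

Text pasted without any marker (for example from the wikitext editor or an external program) would have to be treated like the different-article case.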
