Manual talk:$wgLegalTitleChars

From mediawiki.org

This is a regex character class (i.e. a list of characters in a format suitable for a regular expression) that you want MediaWiki to allow in page titles despite being in the list of illegal characters.

Then why does the default value contain so many legal characters? Kevang 15:54, 20 February 2008 (UTC)

Apparently, $wgLegalTitleChars doesn't actually override the illegal chars. If a char is not present and accounted for in $wgLegalTitleChars, it's not permitted in a title. See lines 1265 and 1358-1361 in http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/Title.php?view=markup&pathrev=10960. Leaving this var blank would effectively cripple a site. Perhaps the description should be reworded to indicate that this var defines all legal title chars, but in no case overrides illegal chars? Brianko 22:19, 12 August 2010 (UTC)
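To illustrate the whitelist reading described above, here is a minimal Python sketch. It is not MediaWiki's code: the character class `LEGAL_TITLE_CHARS` is a simplified, hypothetical value, and `illegal_chars` is a made-up helper. It only shows how a negated character class rejects anything not listed, and why an empty list would reject everything.

```python
import re

# Hypothetical, simplified whitelist; NOT MediaWiki's actual default value.
LEGAL_TITLE_CHARS = r" A-Za-z0-9_.\-"


def illegal_chars(title, legal=LEGAL_TITLE_CHARS):
    """Return every character of `title` that falls outside the whitelist."""
    if not legal:
        # An empty whitelist permits nothing, so every character is illegal.
        return list(title)
    # Negating the class turns "legal characters" into "anything else".
    return re.findall(f"[^{legal}]", title)


print(illegal_chars("Main Page"))      # nothing outside the whitelist
print(illegal_chars("A<B>"))           # the angle brackets are reported
print(illegal_chars("abc", legal=""))  # blank whitelist: every char is reported
```

Under this model a character is legal only if it is listed, which matches the behaviour reported above.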

Consequences of allowing < >

What would happen if these were added? --JimHu 18:51, 20 February 2008 (UTC)

Space character interfering with preg "x"-modifier

The space " " should probably be replaced with \\x20 to allow x-modifier usage without side-effects. --Danwe (talk) 01:40, 19 February 2012 (UTC)
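The general concern can be reproduced with Python's `re.VERBOSE` flag, which plays a role similar to the preg "x" modifier. This is only an analogy: the patterns below are invented for the demonstration, and whether PHP's PCRE treats a space inside a character class the same way is not verified here.

```python
import re

# A literal space in verbose mode is treated as layout and dropped.
literal = re.compile(r"Main Page", re.VERBOSE)

# An escaped space (\x20) survives verbose mode.
escaped = re.compile(r"Main\x20Page", re.VERBOSE)

# In Python, a space inside a character class is kept even in verbose mode.
in_class = re.compile(r"Main[ ]Page", re.VERBOSE)

print(bool(literal.fullmatch("Main Page")))   # False: the space was ignored
print(bool(literal.fullmatch("MainPage")))    # True
print(bool(escaped.fullmatch("Main Page")))   # True
print(bool(in_class.fullmatch("Main Page")))  # True
```

Writing the space as an escape makes the pattern independent of the modifier, which is the point of the suggestion above.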

Character Substitution

  • Would anyone else out there like to see a nice way to drop a character (e.g. "!" or "$") from the URL of a page?
  • Perhaps using this configuration?
  • Or is there another configuration setting that allows the sysadmin to do that easily?
  • We have been experimenting with mod_rewrite for this (http://s.sjobeck.com/SFjwvR), but without much luck yet.
  • Thanks.
  • --Sjobeck (talk) 18:14, 11 November 2012 (UTC)

Unicode

How are Unicode characters in the title handled by MediaWiki? For example there is a Navier–Stokes equations page on Wikipedia, although the en dash is not matched by its $wgLegalTitleChars. Percent-encoding and HTML entities are also out of the question since they are explicitly disabled in the getTitleInvalidRegex function. -- Lahwaacz (talk) 11:48, 9 July 2016 (UTC)
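One possible explanation can be sketched in Python, under two assumptions that are not confirmed in this thread: that the character class contains a raw byte range `\x80-\xFF`, and that the regex is applied to the UTF-8 bytes of the title rather than to code points. The class below is a simplified, hypothetical stand-in, and `first_illegal_byte` is a made-up helper.

```python
import re

# Hypothetical byte-level class: a few ASCII characters plus the high bytes
# \x80-\xFF. This is a simplified stand-in, not MediaWiki's real default.
ILLEGAL = re.compile(rb"[^ A-Za-z0-9_.\-\x80-\xFF]")


def first_illegal_byte(title):
    """Return the first byte outside the class, or None if all bytes pass."""
    match = ILLEGAL.search(title.encode("utf-8"))
    return match.group() if match else None


# The en dash (U+2013) is three bytes in UTF-8, all of them >= 0x80.
print("\u2013".encode("utf-8"))
print(first_illegal_byte("Navier\u2013Stokes equations"))  # None
print(first_illegal_byte("A<B"))                           # the "<" byte
```

If those assumptions hold, any non-ASCII character passes because each byte of its UTF-8 encoding falls in the high range, even though the character itself is never listed.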

UTF-8 Specification

I believe that it should be specified that this character regex follows the UTF-8 standard, thus allowing characters such as emoji and Chinese characters (see: 🍆). If this regex were interpreted in the UTF-16 or other standards, then these characters wouldn't be allowed.

I would edit the page directly, but the warning on the page saying "Warning: Don't change this unless you know what you're doing!" makes me question my confidence in not screwing something up. If someone else could perform this change, I think it would significantly improve the utility of this page.
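The encoding point can be checked with a short Python sketch. It assumes, without confirmation from the manual page, that titles are matched byte by byte; the variable names are invented for the demonstration.

```python
# U+1F346 is the emoji used as the example in this topic.
emoji = "\U0001F346"

utf8_bytes = emoji.encode("utf-8")
utf16_bytes = emoji.encode("utf-16-le")

# Every UTF-8 byte of a non-ASCII character is >= 0x80.
print([hex(b) for b in utf8_bytes])
print(all(b >= 0x80 for b in utf8_bytes))

# The UTF-16 form contains bytes in the ASCII range, including 0x3C ("<").
print([hex(b) for b in utf16_bytes])
print(0x3C in utf16_bytes)
```

In UTF-8 the emoji never produces a byte that could be mistaken for an ASCII character, whereas its UTF-16 form contains the byte value of "<".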

@Lexnasser: Hi, please sign your posts. Thanks! --AKlapper (WMF) (talk) 17:10, 13 April 2020 (UTC)