
Topic on Project:Support desk/Flow

Short URLs not working after upgrade from 1.19 to 1.23

94.113.242.66 (talkcontribs)

I've just upgraded from 1.19 to 1.23, and since then short URLs only work if the title contains only ASCII characters. If the title contains Czech national characters, the page is not found because those characters are encoded incorrectly. I can still open the pages when I type the short URL directly into the address bar, so redirection in my IIS7 seems to work normally. If I disable short URLs using $wgArticlePath, it works (except that images are not shown, but that is a different problem).

IIS7 logs:

'''1) short URL disabled'''
2014-06-09 12:09:23 192.168.5.203 GET /w/index.php title=Hlavn%C3%AD_strana 80 UNICONTROLS\kalas 10.10.2.45 Mozilla/5.0+(Windows+NT+6.1)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/35.0.1916.114+Safari/537.36 304 0 0 281

'''2) short URL enabled'''
2014-06-09 10:43:57 192.168.5.203 GET /w/index.php title=HlavnĂ_strana 80 UNICONTROLS\kalas 10.10.2.45 Mozilla/5.0+(Windows+NT+6.1)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/35.0.1916.114+Safari/537.36 404 0 0 140

The logs make it clear that when short URLs are disabled the title is percent-escaped, but when short URLs are enabled it is not escaped (and is probably in the wrong encoding).

94.113.242.66 (talkcontribs)

I found something related in the HISTORY file:

* `$wgUsePathInfo = true;` is no longer needed to make $wgArticlePath work on servers
 using like nginx, lighttpd, and apache over fastcgi. MediaWiki now always extracts
 path info from REQUEST_URI if it's available.

Which corresponds to this:

_SERVER["REQUEST_URI"] = "/HlavnĂ­_strana"
_SERVER["QUERY_STRING"]	= "title=Hlavn%C3%AD_strana"

It seems that the article name is correct in the _SERVER["QUERY_STRING"] variable, while _SERVER["REQUEST_URI"] contains it incorrectly encoded.
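The two values above are consistent with one byte sequence seen in two ways. The following is only a sketch: it assumes the garbled form comes from reading the raw UTF-8 bytes in a Central European single-byte code page (Windows-1250 is a guess, the thread does not confirm which one) and that PHP's iconv extension is available.

```php
<?php
// The title written as explicit UTF-8 bytes: "í" (U+00ED) is 0xC3 0xAD.
$title = "Hlavn\xC3\xAD_strana";

// Percent-encoding the raw bytes gives the form seen in QUERY_STRING.
$escaped = rawurlencode($title);
echo $escaped, "\n"; // Hlavn%C3%AD_strana

// ASSUMPTION: the garbled form is the same bytes misread as Windows-1250.
// 0xC3 becomes "Ă" and 0xAD becomes a soft hyphen, which is often invisible.
$garbled = iconv('WINDOWS-1250', 'UTF-8', $title);
echo $garbled, "\n";
```

If this assumption holds, REQUEST_URI holds the correct bytes and is only missing the percent-escaping.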

94.113.242.66 (talkcontribs)

I haven't made any further progress. All I can see is that the original URL is processed by MediaWiki and, for some reason, redirected to the wrongly encoded URL. Which PHP variable does MediaWiki use to parse the title? Is it $_SERVER['REQUEST_URI']? Is WebRequest::getPathInfo involved? I found that it uses parse_url(), which is not UTF-8 safe. My IIS/PHP server sets the $_SERVER['REQUEST_URI'] variable to the UTF-8 encoded URL. It is not percent-escaped.

94.113.242.66 (talkcontribs)

I found a workaround:

Add the following line to LocalSettings.php:

$_SERVER['REQUEST_URI'] = urlencode($_SERVER['REQUEST_URI']);

Is there any configuration option in IIS or PHP that forces REQUEST_URI to be URL-safe encoded?
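One thing to keep in mind with the workaround above: urlencode() also escapes "/", "?" and "%", not only the national characters. A narrower variant would escape only the bytes outside printable ASCII. The following is a minimal sketch, not a tested fix for IIS; the helper name encodeNonAsciiBytes is hypothetical and not part of MediaWiki.

```php
<?php
/**
 * Hypothetical helper (not part of MediaWiki): percent-encode only the bytes
 * outside printable ASCII, leaving "/", "?", "&", "=" and existing "%XX"
 * escapes untouched.
 */
function encodeNonAsciiBytes($uri) {
    // No /u modifier: the pattern matches single bytes, so every byte of a
    // multi-byte UTF-8 sequence is escaped separately.
    return preg_replace_callback(
        '/[^\x21-\x7E]/',
        function ($m) {
            return rawurlencode($m[0]);
        },
        $uri
    );
}

// In LocalSettings.php this would take the place of the urlencode() line.
if (isset($_SERVER['REQUEST_URI'])) {
    $_SERVER['REQUEST_URI'] = encodeNonAsciiBytes($_SERVER['REQUEST_URI']);
}
```

With this sketch, "/Hlavní_strana" sent as raw UTF-8 becomes "/Hlavn%C3%AD_strana", and a URI that is already plain ASCII passes through unchanged.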

Ciencia Al Poder (talkcontribs)

I've filed bug 66990 about this, although that change was introduced in 1.20.