MediaWiki 1.11 title extraction bug
Revision as of 04:45, 23 September 2007
The problem
Upgrading to MediaWiki 1.11 has caused trouble, and this article documents my investigation into the problem. The symptom is that if $wgArticlePath is set to "/$1", any long-form URL request that passes title as a query-string parameter fails and is redirected to a non-existent article called Wiki/index.php; in other words, the long-form URL is being treated as a friendly URL.
The cause
Version 1.11 significantly changes the way the article title is extracted from the request. A new method called interpolateTitle has been added to the $wgRequest singleton object, which is defined in includes/WebRequest.php, and the method is called from includes/Setup.php.
Here is the old 1.10 title extraction code which was handled directly in $wgRequest's constructor:
<php>
if ( $wgUsePathInfo ) {
    if ( isset( $_SERVER['ORIG_PATH_INFO'] ) && $_SERVER['ORIG_PATH_INFO'] != '' ) {
        # Mangled PATH_INFO
        # http://bugs.php.net/bug.php?id=31892
        # Also reported when ini_get('cgi.fix_pathinfo')==false
        $_GET['title'] = $_REQUEST['title'] = substr( $_SERVER['ORIG_PATH_INFO'], 1 );
    } elseif ( isset( $_SERVER['PATH_INFO'] ) && ($_SERVER['PATH_INFO'] != '') && $wgUsePathInfo ) {
        $_GET['title'] = $_REQUEST['title'] = substr( $_SERVER['PATH_INFO'], 1 );
    }
}
</php>
And here is the new interpolateTitle method which gets called from includes/Setup.php. All title extraction code has been removed from the $wgRequest constructor and replaced with this new method.
<php>
/**
 * Check for title, action, and/or variant data in the URL
 * and interpolate it into the GET variables.
 * This should only be run after $wgContLang is available,
 * as we may need the list of language variants to determine
 * available variant URLs.
 */
function interpolateTitle() {
    global $wgUsePathInfo;
    if ( $wgUsePathInfo ) {
        // PATH_INFO is mangled due to http://bugs.php.net/bug.php?id=31892
        // And also by Apache 2.x, double slashes are converted to single slashes.
        // So we will use REQUEST_URI if possible.
        $matches = array();
        if ( !empty( $_SERVER['REQUEST_URI'] ) ) {
            // Slurp out the path portion to examine...
            $url = $_SERVER['REQUEST_URI'];
            if ( !preg_match( '!^https?://!', $url ) ) {
                $url = 'http://unused' . $url;
            }
            $a = parse_url( $url );
            if( $a ) {
                $path = $a['path'];

                global $wgArticlePath;
                $matches = $this->extractTitle( $path, $wgArticlePath );

                global $wgActionPaths;
                if( !$matches && $wgActionPaths) {
                    $matches = $this->extractTitle( $path, $wgActionPaths, 'action' );
                }

                global $wgVariantArticlePath, $wgContLang;
                if( !$matches && $wgVariantArticlePath ) {
                    $variantPaths = array();
                    foreach( $wgContLang->getVariants() as $variant ) {
                        $variantPaths[$variant] = str_replace( '$2', $variant, $wgVariantArticlePath );
                    }
                    $matches = $this->extractTitle( $path, $variantPaths, 'variant' );
                }
            }
        } elseif ( isset( $_SERVER['ORIG_PATH_INFO'] ) && $_SERVER['ORIG_PATH_INFO'] != '' ) {
            // Mangled PATH_INFO
            // http://bugs.php.net/bug.php?id=31892
            // Also reported when ini_get('cgi.fix_pathinfo')==false
            $matches['title'] = substr( $_SERVER['ORIG_PATH_INFO'], 1 );
        } elseif ( isset( $_SERVER['PATH_INFO'] ) && ($_SERVER['PATH_INFO'] != '') ) {
            // Regular old PATH_INFO yay
            $matches['title'] = substr( $_SERVER['PATH_INFO'], 1 );
        }
        foreach( $matches as $key => $val) {
            $_GET[$key] = $_REQUEST[$key] = $val;
        }
    }
}
</php>
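To see how this could produce the symptom, here is a minimal sketch of the matching step. The body of extractTitle is not shown in this article, so sketchExtractTitle below is a hypothetical stand-in that simply strips the fixed part of $wgArticlePath from the front of the path. The request path /wiki/index.php and the title Main_Page are also assumptions, chosen to match the Wiki/index.php redirect described above.
<php>
// Hypothetical stand-in for extractTitle(); the real method may differ.
function sketchExtractTitle( $path, $articlePath ) {
    // Everything before '$1' is the fixed prefix of a friendly URL.
    $base = str_replace( '$1', '', $articlePath );
    $baseLen = strlen( $base );
    if ( substr( $path, 0, $baseLen ) === $base ) {
        $raw = substr( $path, $baseLen );
        if ( $raw !== '' ) {
            return array( 'title' => rawurldecode( $raw ) );
        }
    }
    return array();
}

// Assumed long-form request: /wiki/index.php?title=Main_Page
$get = array( 'title' => 'Main_Page' );
$a = parse_url( 'http://unused' . '/wiki/index.php?title=Main_Page' );

// With $wgArticlePath = "/$1" the prefix is just "/", so every path matches.
$matches = sketchExtractTitle( $a['path'], '/$1' );

// Same overwrite loop as at the end of interpolateTitle.
foreach ( $matches as $key => $val ) {
    $get[$key] = $val;
}

echo $get['title'], "\n"; // prints "wiki/index.php", not "Main_Page"
</php>
If the real method behaves like this sketch, the title taken from the path replaces the one given in the query string, which would explain why the request ends up at Wiki/index.php once the first letter is capitalised.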
Solution
One thing the new 1.11 code shows is that the problem can only occur when the $wgUsePathInfo global is set to true, so I've set it to false and put $wgArticlePath back to "/$1", which has got our friendly URLs working again. The one drawback is that without $wgUsePathInfo, our rewrite rule must translate every friendly request into the full long-form query string, which means that un-encoded ampersands are interpreted as query-string separators and cannot be used in article titles.
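As a rough illustration, the two settings and the ampersand limitation can be sketched as follows. The actual rewrite rule is not shown in this article, so the sketch assumes it pastes the requested title unencoded after title= in the query string, and the title AT&T is a made-up example.
<php>
// LocalSettings.php values described above.
$wgUsePathInfo = false;
$wgArticlePath = "/$1";

// Assumed result of the rewrite rule for a friendly request for "AT&T".
$rewritten = 'title=' . 'AT&T';
parse_str( $rewritten, $params );
echo $params['title'], "\n"; // prints "AT": the "&" started a new parameter

// For comparison, a percent-encoded ampersand survives intact.
$encoded = 'title=' . rawurlencode( 'AT&T' );
parse_str( $encoded, $params2 );
echo $params2['title'], "\n"; // prints "AT&T"
</php>
Under these assumptions, the title would only survive if the ampersand reached PHP still percent-encoded as %26.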



