
Routes with special characters are not parsed correctly in Zend Framework 2


URIs with German special characters don't work (error 404). I've already had this problem (here), and it was resolved by using the unicode modifier and a custom view helper that uses it.
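For context, here is a minimal sketch of what the unicode modifier changes. The pattern is only modelled on the character class used in the routes below, and the test string is an assumption chosen for illustration. Without `u`, PCRE looks at the subject byte by byte, so a multi-byte UTF-8 letter is not seen as a single `\p{L}` character. With `u`, both pattern and subject are treated as UTF-8.

```php
<?php
// "sportäöü" written as explicit UTF-8 bytes, so the file encoding does not matter.
$segment = "sport\xC3\xA4\xC3\xB6\xC3\xBC";

// Hypothetical pattern modelled on the character class from the route regex.
$pattern = '[\p{L}\p{Zs}]*';

// Without "u": the subject is inspected one byte at a time.
$withoutUnicode = preg_match('(^' . $pattern . '$)', $segment);

// With "u": pattern and subject are treated as UTF-8, so "ä" counts as one letter.
$withUnicode = preg_match('(^' . $pattern . '$)u', $segment);

var_dump($withoutUnicode, $withUnicode);
```

The second call is the one a unicode-aware route relies on. The result of the first depends on how the individual bytes happen to be classified, which is why it is not a reliable basis for routing.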

Now I have the same issue with a Segment child route, but this time the approach with the unicode modifier and a custom view helper isn't working.

All requests like sld.tld/sport/sportäöüÄÖÜß/cityäöüÄÖÜß or sld.tld/sport/sportäöüÄÖÜß/cityäöüÄÖÜß/page/123 end with a 404 error.

/module/Catalog/config/module.config.php

<?php
return array(
    ...
    'router' => array(
        'routes' => array(
            'catalog' => array(
                ...
            ),
            'city' => array(
                ...
            ),
            // works correctly, if I remove the child route
            'sport' => array(
                'type'  => 'MyNamespace\Mvc\Router\Http\UnicodeRegex',
                'options' => array(
                    'regex' => '/catalog/(?<city>[\p{L}\p{Zs}]*)/(?<sport>[\p{L}\p{Zs}]*)',
                    'defaults' => array(
                        'controller' => 'Catalog\Controller\Catalog',
                        'action'     => 'list-courses',
                    ),
                    'spec'  => '/catalog/%city%/%sport%',
                ),
                'may_terminate' => true,
                'child_routes' => array(
                    'courses' => array(
                        'type'  => 'segment',
                        'options' => array(
                            'route' => '[/page/:page]',
                            'defaults' => array(
                                'controller' => 'Catalog\Controller\Catalog',
                                'action'     => 'list-courses',
                            ),
                        ),
                        'may_terminate' => true,
                    ),
                )
            ),
        ),
    ),
    ...
);

I've also tried it with a UnicodeRegex child route:

        'sport' => array(
            'type'  => 'MyNamespace\Mvc\Router\Http\UnicodeRegex',
            'options' => array(
                'regex' => '/catalog/(?<city>[\p{L}\p{Zs}]*)/(?<sport>[\p{L}\p{Zs}]*)',
                'defaults' => array(
                    'controller' => 'Catalog\Controller\Catalog',
                    'action'     => 'list-courses',
                ),
                'spec'  => '/catalog/%city%/%sport%',
            ),
            'may_terminate' => true,
            'child_routes' => array(
                'courses' => array(
                    'type'  => 'MyNamespace\Mvc\Router\Http\UnicodeRegex',
                    'options' => array(
                        'regex' => '/page/(?<page>[\p{N}]*)',
                        'defaults' => array(
                            'controller' => 'Catalog\Controller\Catalog',
                            'action'     => 'list-courses',
                        ),
                        'spec'  => '/page/%page%',
                    ),
                    'may_terminate' => true,
                ),
            )
        ),

UnicodeRegex

see here

UnicodeSegment

Extends Zend\Mvc\Router\Http\Segment and appends the u (unicode) modifier to the pattern of ALL preg_match(...) calls.

How can I get this working?


Solution

  • I just had a look at this. You need to change the UnicodeRegex match method so that it returns the correct length for the part of the URL it matched. Here's an attempt to fix that, which seems to work (at least for me) with your setup:

    public function match(Request $request, $pathOffset = null)
    {
        if (!method_exists($request, 'getUri')) {
            return null;
        }
    
        $uri  = $request->getUri();
        $path = rawurldecode($uri->getPath());
    
        if ($pathOffset !== null) {
            // pass 0 (not null) for the flags argument; null is deprecated for int parameters as of PHP 8.1
            $result = preg_match('(\G' . $this->regex . ')u', $path, $matches, 0, $pathOffset);
        } else {
            $result = preg_match('(^' . $this->regex . '$)u', $path, $matches);
        }
    
        if (!$result) {
            return null;
        }
    
        foreach ($matches as $key => $value) {
            if (is_numeric($key) || is_int($key) || $value === '') {
                unset($matches[$key]);
            } else {
                $matches[$key] = rawurldecode($value);
            }
        }
    
        // At this point the match was made against the decoded path, but all other
        // route helpers work with the rawurlencoded path, so the lengths differ.
        // To match their expectations, rebuild the matched part from the spec using the matched params.
        $url = $this->spec;
        $mergedParams = array_merge($this->defaults, $matches);
        foreach ($mergedParams as $key => $value) {
            $spec = '%' . $key . '%';
            if (strpos($url, $spec) !== false) {
                $url = str_replace($spec, rawurlencode($value), $url);
            }
        }
        // make sure the url we built from spec exists in the original uri path
        if (false === strpos($uri->getPath(), $url)) {
            return null;
        }
        // now we can get the matchedLength
        $matchedLength = strlen($url);
    
        return new RouteMatch($mergedParams, $matchedLength);
    }
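
To make the length mismatch concrete, here is a small standalone sketch. The helper `buildPathFromSpec`, the example values "Köln" and "Fußball", and the child pattern are all assumptions for illustration, not part of the router. As far as I can tell, the parent's matched length is used as the path offset for the child route, and the child matches against the still-encoded path. A length measured on the decoded path therefore points into the middle of the parent segment.

```php
<?php
// Hypothetical helper mirroring the spec-filling loop from the match() method above.
function buildPathFromSpec(string $spec, array $params): string
{
    foreach ($params as $key => $value) {
        $spec = str_replace('%' . $key . '%', rawurlencode((string) $value), $spec);
    }
    return $spec;
}

// Example values as explicit UTF-8 bytes: "Köln" and "Fußball".
$params = ['city' => "K\xC3\xB6ln", 'sport' => "Fu\xC3\x9Fball"];

$encodedParent = buildPathFromSpec('/catalog/%city%/%sport%', $params);
$requestPath   = $encodedParent . '/page/123'; // assumed encoded request path

$decodedLength = strlen(rawurldecode($encodedParent)); // length the unicode regex matched on
$encodedLength = strlen($encodedParent);               // length the child route needs

// A child pattern anchored with \G only matches when started at the encoded offset.
$child     = '(\G/page/(?<page>\d+))';
$atDecoded = preg_match($child, $requestPath, $m1, 0, $decodedLength);
$atEncoded = preg_match($child, $requestPath, $m2, 0, $encodedLength);

printf("decoded=%d encoded=%d\n", $decodedLength, $encodedLength);
printf("child matches at decoded offset: %d, at encoded offset: %d\n", $atDecoded, $atEncoded);
```

One caveat with this approach: it assumes the client percent-encodes the path the same way rawurlencode does (uppercase hex digits, %20 for spaces). If the incoming path is encoded differently, the strpos check in match() will not find the rebuilt URL and the route returns null.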