regex, apache, .htaccess, mod-rewrite, url-rewriting

.htaccess spaces in name


I am working on a project that uses .htaccess for redirecting www.example.com/Name to a profile page. I have it working fine on single word names. The issue is that if the name is "San Francisco" I get a 404 no matter what I try. Below is the line in my .htaccess.

RewriteRule ^([A-Za-z0-9]+)$ viewProfile.php?Name=$1 [L]

I am wondering if there is something that I can do so if I give the URL www.example.com/San_Francisco or something like that it would work. I read around some other questions that had to do with a similar topic but never got to a working solution.

(seems like www.example.com/San%20Francisco would be best because when I link that of course is what fills the URL space)


Solution

  • RewriteRule ^([A-Za-z0-9]+)$ viewProfile.php?Name=$1 [L]
    

    Your RewriteRule pattern does not include a space, so it will never match a space in the requested URL, e.g. www.example.com/San%20Francisco (%20 being a URL-encoded space).

    Note that although the space is URL encoded (%-encoded) in the request (in order to make a valid request), the RewriteRule pattern matches against the %-decoded URL-path, i.e. a literal space. A literal space must be backslash-escaped in the regex (because spaces are argument delimiters in Apache config files). For example:

    RewriteRule ^([A-Za-z0-9\ ]+)$ viewProfile.php?Name=$1 [L]
    

    Alternatively, you can use the short-hand character class \s for any white-space character. Some would consider this easier to read (since you can't actually "see" spaces):

    RewriteRule ^([A-Za-z0-9\s]+)$ viewProfile.php?Name=$1 [L]
    

    Or, you can just use a non-escaped space and surround the entire pattern in double quotes:

    RewriteRule "^([A-Za-z0-9 ]+)$" viewProfile.php?Name=$1 [L]
    

    Note that the above patterns allow spaces at the start and end of the URL-path (profile name). These are obviously best avoided when the profile name is created in the first place.
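
    If you also want the rewrite rule itself to reject leading and trailing spaces, one option is to require every space to sit between two runs of alphanumeric characters. The following is only a sketch under that assumption (it also rejects consecutive spaces) and keeps the same viewProfile.php target and flags as the rules above:

    ```apache
    # Sketch: one or more alphanumeric "words", each separated by exactly one space.
    # (?: ... ) is a non-capturing group, so $1 still holds the whole profile name.
    # The pattern is double-quoted because it contains an unescaped space.
    RewriteRule "^([A-Za-z0-9]+(?: [A-Za-z0-9]+)*)$" viewProfile.php?Name=$1 [L]
    ```

    With this pattern, /San%20Francisco is rewritten, whereas /%20San%20Francisco and /San%20Francisco%20 do not match and fall through to whatever handles unmatched URLs.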

    Needless to say, spaces are problematic in URLs and best avoided from the start. In the case of a "profile name", you would ideally create a separate "URL version" of the profile name that is used only in the URL, e.g. all lowercase with spaces converted to hyphens: /san-francisco
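
    As a rough illustration of routing such a "URL version", the rule below matches only lowercase, hyphen-separated names. The Slug parameter is hypothetical: it assumes viewProfile.php is changed to look up the profile by its stored URL version rather than by the display name.

    ```apache
    # Assumption: viewProfile.php accepts a (hypothetical) Slug parameter
    # and finds the profile by its stored URL version, e.g. "san-francisco".
    # Matches lowercase alphanumeric words joined by single hyphens only.
    RewriteRule ^([a-z0-9]+(?:-[a-z0-9]+)*)$ viewProfile.php?Slug=$1 [L]
    ```

    Since the URL version contains no spaces or other special characters, nothing in the backreference needs escaping.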


    To include hyphens (-) in the pattern, place them at the start or end of the character class, or backslash-escape them, since a hyphen between two characters in a character class otherwise denotes a range. For example:

    RewriteRule ^([A-Za-z0-9\s-]+)$ viewProfile.php?Name=$1 [L]
    

    To also allow underscores (_) in the profile name (URL-path), simply add the _ anywhere in the character class:

    RewriteRule ^([A-Za-z0-9_\s-]+)$ viewProfile.php?Name=$1 [L]
    

    Which is the same as:

    RewriteRule ^([\w\s-]+)$ viewProfile.php?Name=$1 [L]
    

    This uses the short-hand character class \w, which represents any word character, i.e. [a-zA-Z0-9_].


    UPDATE:

    If you are getting a 403 Forbidden response with the above rule when spaces are present in the requested URL-path then you will need to add the B flag to explicitly escape the backreference before it is applied to the substitution string.

    For example:

    RewriteRule ^([\w\s-]+)$ viewProfile.php?Name=$1 [B,L]
    

    This is the result of a recent "security" update to Apache that now rejects unencoded special characters in the query string (in the past they were implicitly encoded, so this was not an issue). See my answer to the following question for more information: AH10411 error: Managing spaces and %20 in apache mod_rewrite
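
    To make the effect of the B flag concrete, here is an annotated trace of a single request through the rule. It is a sketch based on the documented behaviour of B, which by default encodes a space in the backreference as + (the BNP flag, available from Apache 2.4.26, is documented to produce %20 instead); the exact result may differ on your version.

    ```apache
    # Request URL:         /San%20Francisco
    # Pattern is matched:  San Francisco   (the %-decoded URL-path, minus the leading slash in .htaccess)
    # $1 captured:         San Francisco
    #
    # Without B, the raw backreference (with its literal space) goes into the
    # query string, which recent Apache versions reject (403, AH10411).
    #
    # With B, $1 is escaped before substitution:
    #   viewProfile.php?Name=San+Francisco
    RewriteRule ^([\w\s-]+)$ viewProfile.php?Name=$1 [B,L]
    ```

    In a query string the + is normally decoded back to a space by PHP, so viewProfile.php should still receive "San Francisco" as the Name value.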