On my local machine I can search for "Härtefällen", which results in the following URL:
http://myapp.dev/de/incoming?q=H%E4rtef%E4llen
I can submit as many times as I want; it always looks correct.
Info:
Mac OSX 10.9.5
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
thinking-sphinx (3.1.1)
rails (4.0.4)
/usr/local/Cellar/sphinx/2.2.4
locale command:
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
However, in my production environment, when I enter the search term and click "Apply", I get the following result:
Curiously, when I keep pressing "Apply", the term gets longer and weirder, yet the search engine is somehow still able to see the term "Härtefällen" behind this weird HÃÂâ¬rtefÃÂâ¬llen, because the corresponding search result is displayed:
Info:
Debian 7.0
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]
rails (4.0.4)
thinking-sphinx (3.1.1)
Package: sphinxsearch Version: 2.0.4-1.1
locale command:
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
The only thing I do in my controller is unescape the search param H%E4rtef%E4llen:
# TODO: Somehow `René` turns into `Ren\xE4`
params[:q] = params[:q].encode('UTF-8', 'ISO-8859-15') rescue nil
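One thing that may explain the growing term: the line above transcodes unconditionally, so a value that is already UTF-8 gets treated as ISO-8859-15 and converted a second time. Here is a minimal sketch of a guard against that, assuming the incoming value is either valid UTF-8 or ISO-8859-15 bytes. The helper name `normalize_query` is made up for illustration and is not part of the original app.

```ruby
# Hypothetical helper; the name normalize_query is not from the original app.
def normalize_query(raw)
  return nil if raw.nil?

  # Work on a copy tagged as UTF-8 so the caller's string is not mutated.
  str = raw.dup.force_encoding(Encoding::UTF_8)

  # Already valid UTF-8: leave it alone instead of transcoding it again.
  return str if str.valid_encoding?

  # Otherwise assume the bytes are ISO-8859-15 (as in the question) and transcode.
  str.force_encoding(Encoding::ISO_8859_15).encode(Encoding::UTF_8)
end

latin1_bytes = "H\xE4rtef\xE4llen".b    # the bytes H%E4rtef%E4llen unescapes to
once  = normalize_query(latin1_bytes)   # proper UTF-8 "Härtefällen"
twice = normalize_query(once)           # a second pass changes nothing

# For comparison, the unconditional transcode applied to an already-UTF-8 string
# reads each UTF-8 byte as a separate ISO-8859-15 character:
garbled = once.encode('UTF-8', 'ISO-8859-15')
```

The guard is only a heuristic: a short ISO-8859-15 byte sequence can happen to be valid UTF-8 too, so fixing the encoding of the URL itself is the more robust route.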
Now how do I get the sane behaviour on production? Please let me know if I can provide any more relevant information.
I figured out what I was doing wrong:
In step 1 the characters are properly encoded, but in step 2, where I form a new URL, I need to escape the URL using URI.encode:
URI.encode(myURL)
So that, e.g., ö turns into %C3%B6.
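Note that URI.encode still works on the Ruby 2.1.2 used here, but it was marked obsolete and removed in Ruby 3.0. Here is a rough sketch of the same escaping with the form-encoding helpers from the standard `uri` library, assuming only the query value needs escaping. The variable names are illustrative, and the host and path are taken from the example URL in the question.

```ruby
require 'uri'

term = "H\u00E4rtef\u00E4llen"   # "Härtefällen" as a UTF-8 string

# Escape a single value: each non-ASCII byte becomes %XX.
escaped_term = URI.encode_www_form_component(term)

# Or build the whole query string from key/value pairs.
query = URI.encode_www_form('q' => term)

# Host and path as in the question's example URL.
url = "http://myapp.dev/de/incoming?#{query}"
```

Unlike URI.encode, these helpers follow form encoding, so a space becomes `+` rather than `%20`.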