The question was re-written.
I'm working on a simple web framework, and encountered a strange behavior from either Rack or the Thin server I'm using.
I simplified the config.ru file as much as I could and arrived at the following code, which reproduces the problem:
app = Proc.new do |env|
  content = "<p>عربي</p>"
  headers = {'Content-Type' => 'html/text; charset=utf-8', 'Content-Length' => content.length.to_s}
  [200, headers, [content]]
end

run app
The code above is a plain Rack application whose body is an HTML paragraph containing a four-letter Arabic word. After starting the server with thin start, I expected the source of the web page to be:
<p>عربي</p>
Instead, it turned out to be:
<p>عربي
That is, the closing tag is missing. The server works correctly if I use an English word instead of the Arabic one, so I concluded that the problem is related to the encoding, or to the multibyte characters of the Arabic text.
I'm using Ruby 1.9.2, and the file is encoded as UTF-8. Ruby also prints the string correctly if I just run puts "<p>عربي</p>" in the console, without Rack or Thin.
So the problem is simply that a number of characters after the Arabic text disappear when using Rack and Thin, and the number of missing characters equals the number of Arabic characters in the text.
Any thoughts?
Does using 'Content-Length' => content.bytesize.to_s improve things?
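To illustrate why bytesize is likely the fix: Content-Length is measured in bytes, while String#length in Ruby 1.9 counts characters. Each of the four Arabic letters takes two bytes in UTF-8, so the header in the question is four bytes short and a client that honours it stops reading before the closing tag. The following is only a minimal sketch that runs without Rack or Thin. The truncated variable is a hypothetical stand-in for what a client would keep, and the media type is written as text/html here on the assumption that html/text in the question is a transposition.

```ruby
# encoding: utf-8

content = "<p>عربي</p>"

# String#length counts characters; String#bytesize counts encoded bytes.
char_count = content.length     # 11 characters
byte_count = content.bytesize   # 15 bytes: each Arabic letter is 2 bytes in UTF-8

# Simulate a client that stops reading after Content-Length bytes,
# using the character count as in the question (hypothetical illustration).
truncated = content.bytes.first(char_count).pack('C*').force_encoding('UTF-8')

# Same app as in the question, with Content-Length measured in bytes.
app = Proc.new do |env|
  headers = {'Content-Type' => 'text/html; charset=utf-8',
             'Content-Length' => content.bytesize.to_s}
  [200, headers, [content]]
end

# Call the app directly with an empty env, since no server is involved here.
status, headers, body = app.call({})

puts char_count
puts byte_count
puts truncated
puts headers['Content-Length']
```

In config.ru the same Proc would be passed to run app as before. The difference between the two counts is exactly the number of Arabic characters, which matches the observation in the question.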