Tags: ruby, encoding, utf-8, erubis

erubis: `scan': invalid byte sequence in US-ASCII (ArgumentError)


I ran into this error while refactoring a small web service. Interestingly, the same code works fine on the production server.

.../erubis-2.7.0/lib/erubis/converter.rb:132:in `scan': invalid byte sequence in US-ASCII (ArgumentError)
        from .../erubis-2.7.0/lib/erubis/converter.rb:132:in `convert_input'
        from .../erubis-2.7.0/lib/erubis/converter.rb:36:in `convert'
        from .../erubis-2.7.0/lib/erubis/engine.rb:30:in `initialize'

I run Apache with Ruby 2.3.3p222 (2016-11-21) [x86_64-linux-gnu] on a Debian 9 box.

Here is my code:

input = File.read('/somedir/chpwd.html')
eruby = Erubis::Eruby.new(input)

I have added #encoding: utf-8 to the top of my Ruby script and searched the web up and down, but found no solution.


Solution

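  • Turns out the problem is in the input, not in the script. File.read without an explicit encoding tags the returned string with Ruby's default external encoding, which was US-ASCII in this environment. erubis then scans the template as US-ASCII, hits the first non-ASCII byte, and raises the ArgumentError. The #encoding: utf-8 magic comment does not help here, because it only sets the encoding of string literals in the source file, not of data read at runtime. The default external encoding is derived from the process locale, which is probably why the production server behaves differently.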

    Unfortunately the erubis documentation does not cover encoding, so I found the solution through this answer: Does Ruby provide a way to do File.read() with specified encoding?

    Tell Ruby which encoding the data is in when reading it, like so:

    File.read('/somedir/chpwd.html', :encoding => 'utf-8')
    

    erubis will then handle the template properly.
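
To see the effect without erubis installed, here is a minimal sketch that reproduces the encoding mismatch on a temporary file. The file content, the variable names, and the scan pattern are made up for illustration (the pattern is not the one erubis uses), and passing :encoding => 'US-ASCII' only simulates a server whose default external encoding is US-ASCII.

```ruby
require 'tempfile'

# Hypothetical stand-in for chpwd.html: UTF-8 bytes with one non-ASCII character.
template = Tempfile.new(['chpwd', '.html'])
template.close
File.binwrite(template.path, "<p>Passwort \u00E4ndern: <%= user %></p>\n")

# Simulate the failing server by tagging the bytes as US-ASCII while reading.
ascii_input = File.read(template.path, :encoding => 'US-ASCII')

# Read the same bytes again, this time declaring them as UTF-8.
utf8_input = File.read(template.path, :encoding => 'utf-8')

# Alternative when the string has already been read: relabel the bytes as UTF-8.
relabeled = ascii_input.dup.force_encoding('utf-8')

puts "US-ASCII read valid? #{ascii_input.valid_encoding?}"
puts "UTF-8 read valid?    #{utf8_input.valid_encoding?}"
puts "relabeled valid?     #{relabeled.valid_encoding?}"

# Simplified pattern for illustration only, not erubis's own regular expression.
tags = utf8_input.scan(/<%=\s*(.*?)\s*%>/).flatten
puts "embedded expressions: #{tags.inspect}"

template.unlink
```

The US-ASCII read reports an invalid encoding, while the UTF-8 read and the relabeled copy are valid. force_encoding only changes the label on the existing bytes, so it is a reasonable fallback when the call to File.read cannot be changed.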