Ruby on Rails 4: Occasional "invalid byte sequence in UTF-8 (ArgumentError)" when running rails console or rails server


This is my first time using Stack Overflow for a personal question. I have searched for an answer without success, so please be patient with me if I have overlooked anything, and thank you in advance for your help.

I'm currently building an application with Ruby on Rails 4.1.1 (using RVM). Whenever I run a rake or rails command (such as rails server or rails console) from the command line, it works as expected only about half the time. The rest of the time I get the following error message:

/Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler/runtime.rb:222:in `split': invalid byte sequence in UTF-8 (ArgumentError)
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler/runtime.rb:224:in `setup_environment'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler/runtime.rb:17:in `setup'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler.rb:120:in `setup'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems-bundler/noexec.rb:94:in `setup'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems-bundler/noexec.rb:124:in `check'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems-bundler/noexec.rb:131:in `<top (required)>'
from /Users/drobro/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:135:in `require'
from /Users/drobro/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:135:in `rescue in require'
from /Users/drobro/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:144:in `require'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems_executable_plugin.rb:4:in `block in <top (required)>'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:50:in `call'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:50:in `block in run'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:49:in `each'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:49:in `run'
from /Users/drobro/.rvm/gems/ruby-2.1.2/bin/ruby_executable_hooks:10:in `<main>'

I looked at the code the trace points to in runtime.rb, and it looks like this:

def setup_environment
  begin
    ENV["BUNDLE_BIN_PATH"] = Bundler.rubygems.bin_path("bundler", "bundle", VERSION)
  rescue Gem::GemNotFoundException
    ENV["BUNDLE_BIN_PATH"] = File.expand_path("../../../bin/bundle", __FILE__)
  end

  # Set PATH
  paths = (ENV["PATH"] || "").split(File::PATH_SEPARATOR)
  paths.unshift "#{Bundler.bundle_path}/bin"
  ENV["PATH"] = paths.uniq.join(File::PATH_SEPARATOR)

  # Set BUNDLE_GEMFILE
  ENV["BUNDLE_GEMFILE"] = default_gemfile.to_s

  # Set RUBYOPT
  rubyopt = [ENV["RUBYOPT"]].compact
  if rubyopt.empty? || rubyopt.first !~ /-rbundler\/setup/
    rubyopt.unshift %|-rbundler/setup|
    ENV["RUBYOPT"] = rubyopt.join(' ')
  end

  # Set RUBYLIB
  rubylib = (ENV["RUBYLIB"] || "").split(File::PATH_SEPARATOR)
  rubylib.unshift File.expand_path('../..', __FILE__)
  ENV["RUBYLIB"] = rubylib.uniq.join(File::PATH_SEPARATOR)
end

The error is raised at line 222, which is the line right beneath the # Set PATH comment, i.e. paths = (ENV["PATH"] || "").split(File::PATH_SEPARATOR). From what I understand, this is telling me that the argument to the split method, File::PATH_SEPARATOR, is invalid in UTF-8 encoding. To check what was going on, I added some puts statements right under # Set PATH:

puts "File::PATH_SEPARATOR is this: #{File::PATH_SEPARATOR}"
puts "This is the encoding: #{File::PATH_SEPARATOR.encoding}"
File::PATH_SEPARATOR.each_byte do |c|
  puts "This is the ASCII value: #{c}"
end

When a rails command does NOT work, the output to the terminal is:

File::PATH_SEPARATOR is this: :  
This is the encoding: ASCII-8BIT
This is the ASCII value: 58
/Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler/runtime.rb:227:in `split': invalid byte sequence in UTF-8 (ArgumentError)
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler/runtime.rb:224:in `setup_environment'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler/runtime.rb:15:in `setup'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib/bundler.rb:120:in `setup'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems-bundler/noexec.rb:94:in `setup'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems-bundler/noexec.rb:124:in `check'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems-bundler/noexec.rb:131:in `<top (required)>'
from /Users/drobro/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:135:in `require'
from /Users/drobro/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:135:in `rescue in require'
from /Users/drobro/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:144:in `require'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/rubygems-bundler-1.4.4/lib/rubygems_executable_plugin.rb:4:in `block in <top (required)>'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:50:in `call'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:50:in `block in run'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:49:in `each'
from /Users/drobro/.rvm/gems/ruby-2.1.2@global/gems/executable-hooks-1.3.2/lib/executable-hooks/hooks.rb:49:in `run'
from /Users/drobro/.rvm/gems/ruby-2.1.2/bin/ruby_executable_hooks:10:in `<main>'

When a rails command DOES work, the output to the terminal is (this example is for the rails server command):

File::PATH_SEPARATOR is this: :
This is the encoding: ASCII-8BIT
This is the ASCII value: 58
File::PATH_SEPARATOR is this: :
This is the encoding: ASCII-8BIT
This is the ASCII value: 58
=> Booting WEBrick
=> Rails 4.1.0 application starting in development on http://0.0.0.0:3000
=> Run `rails server -h` for more startup options
=> Notice: server is listening on all interfaces (0.0.0.0). Consider using 127.0.0.1 (--binding option)
=> Ctrl-C to shutdown server
[2014-07-08 17:36:40] INFO  WEBrick 1.3.1
[2014-07-08 17:36:40] INFO  ruby 2.1.2 (2014-05-08) [x86_64-darwin13.0]
[2014-07-08 17:36:40] INFO  WEBrick::HTTPServer#start: pid=6447 port=3000

This is what worries me: the information returned is identical in both cases. What's more, the encoding is ASCII-8BIT, which I would expect to be more restrictive than UTF-8, and in any case the supposedly invalid character is just a colon, which should never cause problems in either of those encodings. So I have two questions:

1) Why am I getting this invalid UTF-8 error?

2) Why does it only happen half the time despite the input being identical?

Thank you for helping, I'm at a loss here.
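
One detail may help narrow this down: an "invalid byte sequence" error raised from split generally concerns the string being split (here the value of ENV["PATH"]) rather than the separator, so output that inspects only File::PATH_SEPARATOR can look identical in both cases. The sketch below checks the receiver with String#valid_encoding?. The two PATH values and the helper name describe_split are made up for illustration, and whether split actually raises on the broken value may depend on the Ruby version, so no output is shown for that case.

```ruby
# Hypothetical PATH values for illustration; not taken from the asker's machine.
CLEAN_PATH  = "/usr/local/bin:/usr/bin"
# "\xFF" can never appear in valid UTF-8, so this string is tagged UTF-8 but broken.
BROKEN_PATH = "/usr/local/bin:/opt/\xFFtools/bin".dup.force_encoding("UTF-8")

# ":" is the value of File::PATH_SEPARATOR on Unix-like systems.
SEPARATOR = ":"

# Hypothetical helper: tries the same kind of split Bundler performs and
# reports either the number of entries or the error message.
def describe_split(value, separator = SEPARATOR)
  parts = value.split(separator)
  "ok: #{parts.length} entries"
rescue ArgumentError => e
  "failed: #{e.message}"
end

# The separator is a plain colon in both runs; the receiver is what differs.
puts "clean valid?  #{CLEAN_PATH.valid_encoding?}"
puts "broken valid? #{BROKEN_PATH.valid_encoding?}"
puts describe_split(CLEAN_PATH)
puts describe_split(BROKEN_PATH)
```

Printing ENV["PATH"].valid_encoding? and ENV["PATH"].bytes at the same spot as the puts statements above would show whether the PATH value itself differs between a failing and a working run.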


Solution

  • I managed to fix the problem by changing the line:

    paths = (ENV["PATH"] || "").split(File::PATH_SEPARATOR)
    

    to:

    paths = (ENV["PATH"] || "").encode('UTF-8', :invalid => :replace).split(File::PATH_SEPARATOR)
    

    My understanding is that :invalid => :replace substitutes a replacement character (U+FFFD when the target is UTF-8) for each invalid byte sequence rather than an equivalent valid one, so the affected PATH entry is altered but split no longer raises. However, this does not explain why the problem occurred only about half the time, which is what was really puzzling me.

    So, if anyone reads this and has any clue as to what was happening, please feel free to comment and let me know. It would be very much appreciated.
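
A minimal sketch of the same idea without editing Bundler's source, assuming Ruby 2.1 or later (where String#scrub is available). The helper names safe_path_entries and invalid_env_keys are hypothetical and not part of Bundler or Rails. The first makes the replacement explicit, and the second lists environment variables whose values are not valid in their tagged encoding, which may help locate the offending entry. It does not by itself explain the intermittent behaviour.

```ruby
# Hypothetical helper: splits a PATH-like string after replacing each
# invalid byte sequence with "?" so that split cannot fail on bad bytes.
def safe_path_entries(raw_path, separator = ":")
  raw_path.to_s.scrub("?").split(separator)
end

# Hypothetical helper: returns the names of variables whose values contain
# bytes that are invalid in the encoding the string is tagged with.
def invalid_env_keys(env = ENV)
  env.to_h.select { |_name, value| !value.valid_encoding? }.keys
end

# Example with a made-up PATH value containing a stray 0xFF byte.
entries = safe_path_entries("/bin:/opt/\xFFtools")
puts entries.inspect

# Run against the real environment to see which variables, if any, are affected.
puts "Variables with invalid bytes: #{invalid_env_keys.inspect}"
```

Comparing the result of invalid_env_keys between a failing and a working shell session is one way to check whether the environment differs between runs.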