Search code examples
jsonrubyjekyllcontrol-characters

jekyll build error: 'control characters not allowed' in JSON file


I've written a plugin which writes JSON output to a file in the _data directory:

while current_page <= total_pages do

  url = 'https://web.consonance.app/api/v2/products.json'
  query = {
    'page' => "#{current_page}",
    'q[publishing_status_eq]' => '04' 
  }
  headers = {
    Authorization: "Token token=**************************"
  }
  request = HTTParty.get(url, query: query, headers: headers)

  hash = JSON.parse(request.body)

  hash['products'].each do |item|
    product_array.push(item)
  end

  current_page += 1

end

# open products.json in data dir and write array output converted from hash back to JSON

File.open("./_data/products.json", "w") { |file| 
  file.puts JSON.pretty_generate(product_array)
}

which puts the desired output as a JSON array in the _data directory with the following format:

[
  {
    "id": 100,
    "work_id": 50,
    "full_title": "Title #1"
  },
  {
    "id": 101,
    "work_id": 51,
    "full_title": "Title #2"
  }
]

When I try to build my site, I get the error:

jekyll 3.8.5 | Error:  (/Users/jamiebowman/Documents/web dev/jekyll/press/_data/products.json): control characters are not allowed at line 1 column 1

When I remove the square brackets at the beginning and the end of the JSON file, then the site builds, but I cannot properly access the data without it being an array.

What are control characters in this context and why are they stopping the site from building?

Traceback errors

Traceback (most recent call last):
        30: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/ruby_executable_hooks:24:in `<main>'
        29: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/ruby_executable_hooks:24:in `eval'
        28: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/jekyll:23:in `<main>'
        27: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/bin/jekyll:23:in `load'
        26: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/exe/jekyll:15:in `<top (required)>'
        25: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary.rb:19:in `program'
        24: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/program.rb:42:in `go'
        23: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `execute'
        22: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `each'
        21: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `block in execute'
        20: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:75:in `block (2 levels) in init_with_program'
        19: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `start'
        18: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `each'
        17: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/serve.rb:93:in `block in start'
        16: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-livereload-0.2.2/lib/jekyll-livereload/build.rb:30:in `process'
        15: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/build.rb:36:in `process'
        14: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/commands/build.rb:65:in `build'
        13: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/command.rb:28:in `process_site'
        12: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/site.rb:69:in `process'
        11: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/site.rb:164:in `read'
        10: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/reader.rb:18:in `read'
         9: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:20:in `read'
         8: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:38:in `read_data_to'
         7: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:38:in `each'
         6: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:46:in `block in read_data_to'
         5: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/jekyll-3.8.5/lib/jekyll/readers/data_reader.rb:68:in `read_data_file'
         4: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:157:in `load_file'
         3: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:157:in `open'
         2: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:157:in `block in load_file'
         1: from /Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:143:in `load'
/Users/jamiebowman/.rvm/gems/ruby-2.6.3/gems/safe_yaml-1.0.5/lib/safe_yaml/load.rb:143:in `parse': (/Users/jamiebowman/Documents/web dev/jekyll/press/_data/products.json): control characters are not allowed at line 1 column 1 (Psych::SyntaxError)

Solution

  • I'm on the team that maintains the API you're calling, and had the same error. It is to do with non-ASCII characters being included in the response. You can sanitize them like this:

    problematic_string.encode(Encoding.find('ASCII'), encoding_options)
    

    where encoding_options are

      def encoding_options
        {
          :invalid           => :replace,  # Replace invalid byte sequences
          :undef             => :replace,  # Replace anything not defined in ASCII
          :replace           => '',        # Use a blank for those replacements
          :universal_newline => true       # Always break lines with \n
        }
      end
    

    source: How to get rid of non-ascii characters in ruby

    The problematic_string will likely be a long text such as a review, production blurb or other descriptive text.