Tags: ruby, json, open-uri

Parse a web page and extract some json arrays


So I have some basic code below, which fetches the page at http://www.highcharts.com/demo/. The page is HTML with the chart data embedded in a script, and I want to be able to extract part of it, more specifically this:

series: [{
                    name: 'Tokyo',
                    data: [7.0, 6.9, 9.5, 14.5, 18.2, 21.5, 25.2, 26.5, 23.3, 18.3, 13.9, 9.6]
                }, {
                    name: 'New York',
                    data: [-0.2, 0.8, 5.7, 11.3, 17.0, 22.0, 24.8, 24.1, 20.1, 14.1, 8.6, 2.5]
                }, {
                    name: 'Berlin',
                    data: [-0.9, 0.6, 3.5, 8.4, 13.5, 17.0, 18.6, 17.9, 14.3, 9.0, 3.9, 1.0]
                }, {
                    name: 'London',
                    data: [3.9, 4.2, 5.7, 8.5, 11.9, 15.2, 17.0, 16.6, 14.2, 10.3, 6.6, 4.8]
                }]
            });

I want this in a hash so that I can access the different data points. Currently the script just spits out everything. Code below:

require "json"
require "open-uri"


$LOAD_PATH << File.dirname(__FILE__)

result = JSON.parse(open("http://www.highcharts.com/demo/").read)

Solution

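  • There are many ways to accomplish this task. A simple one is the pipeline HTML -> JS -> JSON -> Ruby array: capture the series literal from the page's script with a regex, rewrite it as valid JSON, and parse it.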

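    require 'open-uri'
    require 'json'

    # Kernel#open no longer accepts URLs as of Ruby 3.0; use URI.open instead.
    html = URI.open("http://www.highcharts.com/demo/").read
    # Capture the JavaScript array literal that follows "series:".
    js = html.match(/series: (\[\{.*?\}\])/m)[1]
    # Quote the bare keys and replace single quotes so the literal is valid JSON.
    json = js.gsub(/(\w+):/, '"\1":').gsub(/'/, '"')
    series = JSON.parse(json)
    # => [{"name"=>"Tokyo", "data"=>[7.0, 6.9, 9.5, 14.5, 18.2, ... 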
    

    For a more orthodox approach (using a JavaScript parser), check this question.
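
The HTML -> JS -> JSON -> Ruby steps can also be tried without a network request. The following is a minimal sketch, not a definitive implementation: `SAMPLE_HTML` is a made-up, shortened stand-in for the downloaded page, and `extract_series` is a hypothetical helper name. It then turns the parsed array into a hash keyed by series name, which is roughly what the question asks for.

```ruby
require 'json'

# Assumption: a shortened, hypothetical stand-in for the downloaded page.
SAMPLE_HTML = <<~HTML
  <script>
  chart = new Highcharts.Chart({
      series: [{
          name: 'Tokyo',
          data: [7.0, 6.9, 9.5]
      }, {
          name: 'New York',
          data: [-0.2, 0.8, 5.7]
      }]
  });
  </script>
HTML

# Hypothetical helper: find the series literal in the page and parse it.
# Returns an empty array when the page contains no such literal.
def extract_series(html)
  match = html.match(/series:\s*(\[\{.*?\}\])/m)
  return [] unless match

  # Quote bare keys (name: -> "name":) and swap single quotes for double
  # quotes so that the JavaScript literal becomes valid JSON.
  json = match[1].gsub(/(\w+):/, '"\1":').gsub("'", '"')
  JSON.parse(json)
end

series = extract_series(SAMPLE_HTML)

# Build a hash keyed by series name for direct access to the data points.
by_name = series.to_h { |s| [s['name'], s['data']] }

puts by_name['Tokyo'].inspect   # prints [7.0, 6.9, 9.5]
puts by_name['New York'].first  # prints -0.2
```

Note that the regex rewrite relies on the exact shape of the literal: a series name containing a colon or an apostrophe would break the conversion, which is where a real JavaScript parser is the safer choice.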