Search code examples
javascriptjsonapilamp

How to call REC Registery API and store returned JSONs into some kind of database


I'd like to break this into smaller, tighter questions but I don't know what I don't know enough to do that yet. So hopefully a can get specific answers to help do that.

The scope of the solution requires receiving & parsing a lot of records, 2013 had ~17 million certificate(s) transactions while I'm only interested in very small subsets of the order 40,000 records.

In pseudo code:

  • iterate dates(thisDate)
    • send message to API for thisDate
    • receive JSONS as todaysRecords
    • examine todaysRecords to look for whatever criteria match inside the structure
    • append a subset of todaysRecords to recordsOut
  • save recordsOut to a SQL/CSV file.

There's a large database of Renewable Energy Certificates for the use under the Australian Government RET Scheme called the REC Registery and as well as the web interface linked to here, there is an API provided that has a simple call logic as follows

http://rec-registry.gov.au/rec-registry/app/api/public-register/certificate-actions?date=<user provided date> where:

  • The date part of the URL should be provided by the user
  • Date format should be YYYY-MM-DD (no angle brackets & 1 date limit)

A JSON is returned (with potentially 100,000s of records on each day).

The API documentation (13pp PDF) is here, but it mainly goes into explaining the elements of the returned structure which is less relevant to my question. Includes two sample JSON responses.

While I know some Javascript (mostly not in a web context) I'm not sure how send this message within a script and figure I'd need to do it server side to be able to process (filter) the returned information and then save the records I'm interested in. I'll have no issue parsing the JSON (if i can use JS) and copying the objects I wish to save I'm not sure where to even start doing this. Do I need a LAMP setup to do this (or MAMP since I'm on OS X) or is there a more light-weight JS way I can execute this. I've never known how to save file from within web-browser JS, I thought it was banned for security reasons but I guess theres ways and means.

If i can rewrite this question to be more clear and effective in soliciting an answer I'm happy for edits to question also.

I guess maybe I'm after some boilerplate code for calling a simple API like this and the stack or application context in which I need to do it. I realise there's potential several ways to execute this but looking for most straightforward for someone with JS knowledge and not much PHP/Python experience (but willing to learn what it takes).

Easy right?


Solution

  • Ok, to point you in the right direction.

    Requirements

    If the language of choice is Javascript, you'll need to install Node.js. No server whatsoever needed.

    Same is valid for PHP or Python or whatever. No apache needed, just the lang int.

    Running a script with node

    Create a file.js somewhere. To run it, you'll just need to type (in the console) node file.js (in the directory the file lives in.

    Getting the onfo from the REC Webservice

    Here's an example of a GET request:

    var https = require('https');
    var fs = require('fs');
    var options = {
        host: 'rec-registry.gov.au',
        port: 443,
        path: '/rec-registry/app/api/public-register/certificate-actions?date=2015-06-03'
    };
    var jsonstr = '';
    
    var request = https.get(options, function(response) {
      process.stdout.write("downloading data...");
      response.on('data', function (chunk) {
        process.stdout.write(".");
        jsonstr += chunk;
      });
    
      response.on('end', function () {
        process.stdout.write("DONE!");
        console.log(' ');
        console.log('Writing to file...');
        fs.writeFile("data.json", jsonstr, function(err) {
          if(err) {
            return console.error('Error saving file'); 
          }
        console.log('The file was saved!');
        });
      });
    })
    
    request.on('error', function(e) {
      console.log('Error downloading file: ' + e.message);
    });
    

    Transforming a json string into an object/array

    use JSON.parse

    Parsing the data

    examine todaysRecords to look for whatever criteria match inside the structure

    Can't help you there, but should be relatively straightforward to look for the correct object properties.

    NOTE: Basically, what you get from the request is a string. You then parse that string with

    var foo = JSON.parse(jsonstr)
    

    In this case foo is an object. The results "certificates" are actually inside the property result, which is an array

    var results = foo.result;
    

    In this example the array contains about 1700 records and the structure of a certificate is something like this:

    "actionType": "STC created",
    "completedTime": "2015-06-02T21:51:26.955Z",
    "certificateRanges": [{
      "certificateType": "STC",
      "registeredPersonNumber": 10894,
      "accreditationCode": "PVD2259359",
      "generationYear": 2015,
      "generationState": "QLD",
      "startSerialNumber": 1,
      "endSerialNumber": 72,
      "fuelSource": "S.G.U. - solar (deemed)",
      "ownerAccount": "Solargain PV Pty Ltd",
      "ownerAccountId": 25782,
      "status": "Pending audit"
    }]
    

    So, to access, for instance, the "ownerAccount" of the first "certificateRanges" of the first "certificate" you would do:

    var results = JSON.parse(jsonstr).result;
    
    var ownerAccount = results[0].certificateRanges[0].ownerAccount;
    

    Creating a csv

    The best way is to create an abstract structure (that meets your needs) and convert it to a csv.

    There's a good npm library called json2csv that can help you there

    Example:

    var fs = require('fs');
    var json2csv = require('json2csv'); 
    var fields = ['car', 'price', 'color']; // csv titles
    var myCars = [
      {
        "car": "Audi",
        "price": 40000,
        "color": "blue"
      }, {
        "car": "BMW",
        "price": 35000,
        "color": "black"
      }, {
        "car": "Porsche",
        "price": 60000,
        "color": "green"
      }
    ];
    
    json2csv({ data: myCars, fields: fields }, function(err, csv) {
      if (err) console.log(err);
      fs.writeFile('file.csv', csv, function(err) {
        if (err) throw err;
        console.log('file saved');
      });
    });
    

    If you wish to append instead of writing to a new file you can use

    fs.appendFile('file.csv', csv, function (err) { });