I'd like to break this into smaller, tighter questions but I don't know what I don't know enough to do that yet. So hopefully a can get specific answers to help do that.
The scope of the solution requires receiving & parsing a lot of records, 2013 had ~17 million certificate(s) transactions while I'm only interested in very small subsets of the order 40,000 records.
In pseudo code:
There's a large database of Renewable Energy Certificates for the use under the Australian Government RET Scheme called the REC Registery and as well as the web interface linked to here, there is an API provided that has a simple call logic as follows
http://rec-registry.gov.au/rec-registry/app/api/public-register/certificate-actions?date=<user provided date>
where:
A JSON is returned (with potentially 100,000s of records on each day).
The API documentation (13pp PDF) is here, but it mainly goes into explaining the elements of the returned structure which is less relevant to my question. Includes two sample JSON responses.
While I know some Javascript (mostly not in a web context) I'm not sure how send this message within a script and figure I'd need to do it server side to be able to process (filter) the returned information and then save the records I'm interested in. I'll have no issue parsing the JSON (if i can use JS) and copying the objects I wish to save I'm not sure where to even start doing this. Do I need a LAMP setup to do this (or MAMP since I'm on OS X) or is there a more light-weight JS way I can execute this. I've never known how to save file from within web-browser JS, I thought it was banned for security reasons but I guess theres ways and means.
If i can rewrite this question to be more clear and effective in soliciting an answer I'm happy for edits to question also.
I guess maybe I'm after some boilerplate code for calling a simple API like this and the stack or application context in which I need to do it. I realise there's potential several ways to execute this but looking for most straightforward for someone with JS knowledge and not much PHP/Python experience (but willing to learn what it takes).
Easy right?
Ok, to point you in the right direction.
If the language of choice is Javascript, you'll need to install Node.js. No server whatsoever needed.
Same is valid for PHP or Python or whatever. No apache needed, just the lang int.
Create a file.js
somewhere. To run it, you'll just need to type (in the console) node file.js
(in the directory the file lives in.
Here's an example of a GET request:
var https = require('https');
var fs = require('fs');
var options = {
host: 'rec-registry.gov.au',
port: 443,
path: '/rec-registry/app/api/public-register/certificate-actions?date=2015-06-03'
};
var jsonstr = '';
var request = https.get(options, function(response) {
process.stdout.write("downloading data...");
response.on('data', function (chunk) {
process.stdout.write(".");
jsonstr += chunk;
});
response.on('end', function () {
process.stdout.write("DONE!");
console.log(' ');
console.log('Writing to file...');
fs.writeFile("data.json", jsonstr, function(err) {
if(err) {
return console.error('Error saving file');
}
console.log('The file was saved!');
});
});
})
request.on('error', function(e) {
console.log('Error downloading file: ' + e.message);
});
use JSON.parse
examine todaysRecords to look for whatever criteria match inside the structure
Can't help you there, but should be relatively straightforward to look for the correct object properties.
NOTE: Basically, what you get from the request is a string. You then parse that string with
var foo = JSON.parse(jsonstr)
In this case foo
is an object. The results "certificates" are actually inside the property result
, which is an array
var results = foo.result;
In this example the array contains about 1700 records and the structure of a certificate is something like this:
"actionType": "STC created",
"completedTime": "2015-06-02T21:51:26.955Z",
"certificateRanges": [{
"certificateType": "STC",
"registeredPersonNumber": 10894,
"accreditationCode": "PVD2259359",
"generationYear": 2015,
"generationState": "QLD",
"startSerialNumber": 1,
"endSerialNumber": 72,
"fuelSource": "S.G.U. - solar (deemed)",
"ownerAccount": "Solargain PV Pty Ltd",
"ownerAccountId": 25782,
"status": "Pending audit"
}]
So, to access, for instance, the "ownerAccount" of the first "certificateRanges" of the first "certificate" you would do:
var results = JSON.parse(jsonstr).result;
var ownerAccount = results[0].certificateRanges[0].ownerAccount;
The best way is to create an abstract structure (that meets your needs) and convert it to a csv.
There's a good npm library called json2csv that can help you there
Example:
var fs = require('fs');
var json2csv = require('json2csv');
var fields = ['car', 'price', 'color']; // csv titles
var myCars = [
{
"car": "Audi",
"price": 40000,
"color": "blue"
}, {
"car": "BMW",
"price": 35000,
"color": "black"
}, {
"car": "Porsche",
"price": 60000,
"color": "green"
}
];
json2csv({ data: myCars, fields: fields }, function(err, csv) {
if (err) console.log(err);
fs.writeFile('file.csv', csv, function(err) {
if (err) throw err;
console.log('file saved');
});
});
If you wish to append instead of writing to a new file you can use
fs.appendFile('file.csv', csv, function (err) { });