When I do a simple http.get against a URL on a Squarespace (SS) site, I get a 403 response. I know the site is up and that my server can reach it. Here's a simple example against an SS site (not mine, but it produces the same issue):
Showing that the server can reach the site:
curl http://www.letsmoveschools.org
This returns all the HTML from the site...
Node app
var http = require('http');

var url = 'http://www.letsmoveschools.org/';

var req = http.get(url, function(res) {
  res.on('data', function(chunk) {
    // handle chunk data
  });
  res.on('end', function() {
    // parse xml
    console.log(res.statusCode);
  });
  // or you can pipe the data to a parser
  //res.pipe(dest);
});

req.on('error', function(err) {
  // debug error
  console.log('error');
});
When I run the app with node app.js, it outputs the 403 status code.
I have tried this code with other sites and it works fine, just not against squarespace sites. Any idea of either configuration on SS or something else I need to do in Node?
The problem is that the remote server expects (requires) a User-Agent header, and Node does not send one automatically. (curl does send a default User-Agent, which is why the curl request succeeds.) Add that header and you should get back a 200 response:
// ...
url = 'http://www.letsmoveschools.org/';

var opts = require('url').parse(url);
opts.headers = {
  'User-Agent': 'javascript'
};

var req = http.get(opts, function(res) {
// ...