Tags: python, http, proxy, node.js, http-proxy

Creating an HTTP proxy that can modify the HTTP response before sending it to the client


I'm using wget to grab something from the web, but I don't want it to follow a portion of the page. I thought I could set up a proxy that removes the parts of the page I don't want processed before returning it to wget, but I'm not sure how to accomplish that.

Is there a proxy that lets me easily modify the HTTP response in Python or Node.js?


Solution

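  • There are several ways you could achieve this goal. This should get you started (using Node.js). In the following example I am fetching google.com and replacing all instances of "google" with "foobar" before sending the page to the client. Note that the server always fetches the same URL, whatever the client asks for.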

    // package.json file...
    {
      "name": "proxy-example",
      "description": "a simple example of modifying response using a proxy",
      "version": "0.0.1",
      "dependencies": {
        "request": "1.9.5"
      }
    }
    
    // server.js file...
    var http = require("http")
    var request = require("request")
    var port = process.env.PORT || 8001
    
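    http.createServer(function(req, rsp){
      // the upstream URL is hard-coded; the incoming req.url is ignored
      var options = { uri: "http://google.com" }
    
      request(options, function(err, response, body){
        // report upstream failures instead of crashing on an undefined body
        if (err) {
          rsp.writeHead(502)
          return rsp.end("upstream request failed: " + err.message)
        }
        rsp.writeHead(200)
        // rewrite the fetched page before handing it to the client
        rsp.end(body.replace(/google/g, "foobar"))
      })
    
    }).listen(port)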
    
    console.log("listening on port " + port)
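
To get closer to what the question asks (wget pointed at a proxy that strips part of each page), the example above can be extended so that it fetches whatever URL the client requests. The following is a minimal sketch under several assumptions: it uses only Node's built-in `http` module rather than `request`, it handles plain-HTTP GET requests only (no HTTPS), and it treats every body as UTF-8 text, so binary downloads would be corrupted. The names `stripBetween` and `createFilteringProxy` are hypothetical helpers introduced for illustration, not part of any library.

```javascript
var http = require("http")

// Hypothetical helper: remove the first span that starts at startMarker
// and ends at endMarker (markers included). Returns html unchanged if
// either marker is missing.
function stripBetween(html, startMarker, endMarker) {
  var start = html.indexOf(startMarker)
  if (start === -1) return html
  var end = html.indexOf(endMarker, start + startMarker.length)
  if (end === -1) return html
  return html.slice(0, start) + html.slice(end + endMarker.length)
}

// Hypothetical helper: build (but do not start) a proxy server that
// passes every upstream body through `filter` before replying.
function createFilteringProxy(filter) {
  return http.createServer(function (req, rsp) {
    // A client configured to use an HTTP proxy sends the absolute URL
    // in the request line, so req.url should look like "http://host/path".
    if (!/^http:\/\//.test(req.url)) {
      rsp.writeHead(400)
      return rsp.end("expected an absolute http:// URL")
    }

    var upstream = http.get(req.url, function (upstreamRsp) {
      var chunks = []
      upstreamRsp.on("data", function (chunk) { chunks.push(chunk) })
      upstreamRsp.on("end", function () {
        // Buffer the whole body, then filter it as text
        var body = filter(Buffer.concat(chunks).toString("utf8"))
        rsp.writeHead(upstreamRsp.statusCode, {
          "Content-Type": upstreamRsp.headers["content-type"] || "text/html"
        })
        rsp.end(body)
      })
    })

    // Report upstream failures to the client instead of crashing
    upstream.on("error", function (err) {
      rsp.writeHead(502)
      rsp.end("upstream request failed: " + err.message)
    })
  })
}
```

To try it, start the server with something like `createFilteringProxy(function (html) { return stripBetween(html, "<nav>", "</nav>") }).listen(8001)` and run wget with the `http_proxy` environment variable set to `http://localhost:8001`. The `<nav>` markers are only placeholders; pick whatever delimits the part of the page that wget should not see.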