
Multi-client long-poll handling with Node.js


I am trying to create a Node server that notifies long-polling clients when a file is updated on the server. However, I cannot work out how to have the server recognize a change in the file and tell every polling client that an update has occurred.

My confusion is rooted in the handling of timestamps, specifically which timestamps I should be tracking. The clients just want to know whether a file has changed. I don't think we should have to care about when a request comes in, i.e. we don't have to store the timestamp of the request in order to detect a change to the file. But we would need to track it in order to determine when to expire the request.

So, I'm thinking the server receives a request and immediately stores the timestamp of the target file. Then, every 10 seconds the server checks the stored timestamp against the timestamp of the file at the current time. If there's a difference, the file has been updated, and the server sends down a response to the client that indicates the file has changed. If the server does not see a change in the file after, say, 60 seconds, it sends a response down to the client to indicate that the client should initiate a new request.

Does the strategy above make sense? How does one handle the timestamp stuff efficiently? And, how does one handle multiple clients at the same time? And, how does one prevent the server from being overrun by multiple requests from the same client?


Solution

  • You need to be careful with what happens while the client is initiating the new request as the file may change during this time window.

    One way to take care of this would be for the client to first query the current file status:

    GET /file/timestamp
         -> Server returns the timestamp
    
    GET /file/update?after=[timestamp]
         -> Server checks whether the file has changed after the timestamp.
            If it has, the server sends the response immediately.
            Otherwise insert the client into a queue.
            You don't need to save the timestamp in the queue.
            If the server notices a change it notifies the clients.
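
    The check performed for the second request could be sketched as a small pure function. The name shouldRespondNow is hypothetical, and treating a missing or malformed timestamp like a first request is an assumption made for illustration, not part of the protocol above.

    ```javascript
    // Hypothetical helper: decide whether a /file/update request can be answered
    // right away or has to wait for the next change.
    // afterParam is the raw "after" query value (milliseconds since the epoch, as a string);
    // mtimeMs is the file's current modification time in milliseconds.
    function shouldRespondNow(afterParam, mtimeMs) {
        var after = Number(afterParam);
        // Assumption: a missing or malformed timestamp is treated like a first request.
        if (afterParam === undefined || afterParam === '' || isNaN(after)) {
            return true;
        }
        // Respond immediately only if the file changed after the client's timestamp.
        return mtimeMs > after;
    }
    ```

    When this returns false, the client is parked until the watcher reports a change or the request times out.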
    

    Now, because there are multiple clients, the server shouldn't do the polling in the client request handler. Instead, have a separate object that handles the polling.

    Depending on whether you have one file or several that need to be watched, you might end up with a simple or a complex implementation for this. In short, though, you'll probably want to wrap fs.watchFile in an EventEmitter so that changes to the file emit change events.

    A naive implementation would be:

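    var EventEmitter = require( 'events' ).EventEmitter;
    var fs = require( 'fs' );
    
    var watcher = new EventEmitter();
    // Every waiting client adds a listener; lift the default limit of 10 to avoid warnings.
    watcher.setMaxListeners( 0 );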
    
    // Get the initial status (fs.stat follows symlinks, as fs.watchFile's polling does)
    fs.stat( 'test.file', function( err, stat ) {
        if( err ) return console.error( err );
        watcher.stat = stat;
    });
    
    // Start watching the file and emit an event whenever the file changes.
    fs.watchFile( 'test.file', function( curr, prev ) {
        console.log( 'File changed' );
        watcher.stat = curr;
        watcher.emit( 'change', curr );
    });
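
    If more than one file needs watching, the same idea could be extended to a registry that creates one emitter per path and shares it between all clients. This is only a minimal sketch: getWatcher and the watchers map are hypothetical names, and persistent: false is passed only so that the sketch does not keep the process alive by itself.

    ```javascript
    var EventEmitter = require('events').EventEmitter;
    var fs = require('fs');

    // Hypothetical registry: one shared emitter per watched path.
    var watchers = new Map();

    function getWatcher(path) {
        if (watchers.has(path)) {
            return watchers.get(path);
        }
        var watcher = new EventEmitter();
        // Many clients may wait on the same file at once.
        watcher.setMaxListeners(0);
        watcher.stat = null;

        // Record the initial status unless a change event got there first.
        fs.stat(path, function (err, stat) {
            if (!err && !watcher.stat) watcher.stat = stat;
        });

        // Re-emit file changes as 'change' events on the shared emitter.
        fs.watchFile(path, { persistent: false }, function (curr, prev) {
            watcher.stat = curr;
            watcher.emit('change', curr);
        });

        watchers.set(path, watcher);
        return watcher;
    }
    ```

    A request handler would then look up the emitter for the requested file with getWatcher and attach its listener there, so the file itself is only polled once no matter how many clients are waiting.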
    

    With these in place your request handler will look like something along the lines of:

    var http = require( 'http' );
    var url = require( 'url' );
    
    var server = http.createServer( function( req, res ) {
    
        res.writeHead( 200, { 'Content-Type': 'text/html' });
    
        var timeout;
        var onChange = function( stat ) {
            // Prevent timeout from triggering
            clearTimeout( timeout );
    
            console.log( 'File changed at ' + stat.mtime );
            res.end(
                'Changed at ' + stat.mtime + '. ' +
                '<a href="?after=' + stat.mtime.getTime() + '">Poll again</a>' );
        };
    
        var after = url.parse( req.url ).query || '';
        after = after.split('=');
        console.dir( after );
        if( after.length < 2 || after[1] < watcher.stat.mtime.getTime() ) {
            console.log( 'Initial request or file changed before request' );
            return onChange( watcher.stat );
        }
    
        console.log( 'Polling request.' );
    
        watcher.once( 'change', onChange );
        timeout = setTimeout( function() {
            console.log( 'File not changed. Giving up.' );
            watcher.removeListener( 'change', onChange );
            res.end(
                'Unchanged! <a href="?after=' + after[1] + '">Poll again</a>' );
        }, 10000 );
    });
    
    // Port 8080 is an arbitrary choice for local testing.
    server.listen( 8080 );
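
    The listener and timeout bookkeeping in the handler could also be factored into a helper, which makes it easier to clean up when a client disconnects before either one fires. The sketch below is one possible shape under stated assumptions: waitForChange is a hypothetical name, and emitter stands for any EventEmitter that emits change events, such as the watcher above.

    ```javascript
    // Hypothetical helper: wait for the next 'change' event or a timeout,
    // whichever comes first. Calls done(stat) on change and done(null) on
    // timeout. Returns a function that cancels the wait.
    function waitForChange(emitter, timeoutMs, done) {
        var timer;

        function onChange(stat) {
            // Prevent the timeout from firing after the change was delivered.
            clearTimeout(timer);
            done(stat);
        }

        function cancel() {
            // Safe to call more than once, and after the wait has finished.
            clearTimeout(timer);
            emitter.removeListener('change', onChange);
        }

        emitter.once('change', onChange);
        timer = setTimeout(function () {
            emitter.removeListener('change', onChange);
            done(null);
        }, timeoutMs);

        return cancel;
    }
    ```

    In the handler, the returned function could be registered with res.on( 'close', cancel ), so that a client that goes away early does not leave a listener and a timer behind.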
    

    And finally, on "prevent the server from being overrun by multiple requests from the same client?": you can't, at least not if you want a guarantee and still allow anonymous requests. You could try cookie-based exclusion, but if your service allows anonymous requests, users can simply stop sending cookies, at which point it becomes really difficult to identify requests from the same browser.
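
    A best-effort limit is still possible even without a guarantee. The sketch below caps the number of simultaneously pending polls per key. Using the remote address as the key is an assumption, and clients behind a shared proxy or NAT would share one allowance. createLimiter, acquire and release are hypothetical names.

    ```javascript
    // Hypothetical best-effort limiter: at most maxPending waiting polls per key.
    function createLimiter(maxPending) {
        var pending = new Map();
        return {
            // Returns true and counts the request if the key is under its limit.
            acquire: function (key) {
                var count = pending.get(key) || 0;
                if (count >= maxPending) return false;
                pending.set(key, count + 1);
                return true;
            },
            // Call once for every successful acquire when the response is finished.
            release: function (key) {
                var count = pending.get(key) || 0;
                if (count <= 1) pending.delete(key);
                else pending.set(key, count - 1);
            }
        };
    }
    ```

    A handler might call limiter.acquire( req.socket.remoteAddress ) before parking the request, answer with a 429 status when it returns false, and call release once the response has closed.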