
Best Method of Channel Pooling in Google App Engine


It seems the only way to make the GAE Channel API financially viable is to implement some kind of pooling mechanism that reuses channels that have not yet expired. One of the senior App Engine product managers even told me this when I emailed them about the exorbitant price.

I've been brainstorming places to implement a channel pool, but each one I come up with has some pretty serious drawbacks.

Static memory of a Servlet -- Good, but it will drop quite a few open channels when a new VM instance starts and/or a client gets passed from one VM to another.

Memcache -- At least the memory is globally accessible from all VMs, but the chance of dropping a perfectly viable channel is greater, since entries can be evicted due to inactivity and memory pressure.

Backend Instance -- Probably the best option in terms of reliability, but the expense of running the backend will eat up all the savings from implementing the pool in the first place!

Is there a better place/way of implementing a channel pool across VMs that I'm missing, or am I unnecessarily hung up on the drawbacks of my options here? I really hope there is, or it looks like my app will have to revert to polling (which is looking marginally cheaper in my preliminary metrics).


Solution

  • Here's what I'd do (I'm actually considering writing this library after seeing your question; I need it too):

    Create a taskpool module with the following API.

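    # Hand out a pooled channel: reuse a free one or create a new one.
    client_id, token = taskpool.get()

    # Set up a heartbeat in the client JS, maybe every minute.
    # Also call this every time the client indicates presence.
    taskpool.ping(client_id)

    # Return the channel to the pool when the client is done with it.
    taskpool.release(client_id)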
    

    Implementation:

    • Store the client_id and token in an entity, along with a status indicating whether it's in use, the last ping time, and the creation time. Let the client_id be the key. Also consider using NDB, which gives you memcache caching for free.

    get() checks whether there are unused tokens and returns one if it finds one. Otherwise it creates a new one, stores it, and returns it.

    ping() updates the last ping time for that token. Instead of polling, let the client send a ping every [heartbeat] seconds.

    release() marks the token as unused.

    Run a task / cron job every [heartbeat] seconds to find the tokens that haven't received a ping in a while, and mark them as unused.

    When a client reports a closed token, perform a get() to hand it another one.
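To make the steps above concrete, here is a minimal in-memory sketch of such a pool. It is not the library mentioned in this answer, and everything in it is an assumption made for illustration: the `ChannelPool` and `PooledChannel` names, the `create_token` callable (a stand-in for whatever actually creates a channel token), the two-hour `token_lifetime`, and the "two missed heartbeats" rule in `reap()`. A plain dict stands in for the datastore entities.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class PooledChannel:
    """One pooled channel; mirrors the entity fields described above."""
    client_id: str
    token: str
    created_at: float
    last_ping: float
    in_use: bool = True


class ChannelPool:
    """Hypothetical in-memory pool; a real app would persist this state."""

    def __init__(self, create_token: Callable[[str], str],
                 heartbeat: float = 60.0,
                 token_lifetime: float = 7200.0,  # assumed lifetime, in seconds
                 clock: Callable[[], float] = time.time) -> None:
        self._create_token = create_token
        self._heartbeat = heartbeat
        self._lifetime = token_lifetime
        self._clock = clock
        self._channels: Dict[str, PooledChannel] = {}

    def get(self) -> Tuple[str, str]:
        """Reuse an unused, unexpired channel, or create and store a new one."""
        now = self._clock()
        for ch in self._channels.values():
            expired = now - ch.created_at >= self._lifetime
            if not ch.in_use and not expired:
                ch.in_use = True
                ch.last_ping = now
                return ch.client_id, ch.token
        client_id = uuid.uuid4().hex
        token = self._create_token(client_id)
        self._channels[client_id] = PooledChannel(client_id, token, now, now)
        return client_id, token

    def ping(self, client_id: str) -> None:
        """Record a heartbeat for a channel that is currently in use."""
        ch = self._channels.get(client_id)
        if ch is not None and ch.in_use:
            ch.last_ping = self._clock()

    def release(self, client_id: str) -> None:
        """Mark the channel as unused so that get() can hand it out again."""
        ch = self._channels.get(client_id)
        if ch is not None:
            ch.in_use = False

    def reap(self, missed_heartbeats: int = 2) -> int:
        """Cron step: drop expired channels and free silent ones.

        Returns the number of in-use channels that were marked unused.
        """
        now = self._clock()
        freed = 0
        for client_id in list(self._channels):
            ch = self._channels[client_id]
            if now - ch.created_at >= self._lifetime:
                del self._channels[client_id]
            elif ch.in_use and now - ch.last_ping > missed_heartbeats * self._heartbeat:
                ch.in_use = False
                freed += 1
        return freed
```

In an actual App Engine app, `create_token` would presumably wrap the Channel API's token creation call, and the dict would be replaced by datastore (or NDB) entities keyed by client_id so that every instance sees the same pool. The injected `clock` is only there so the timing logic can be exercised without waiting.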

    Keep in mind, though, that a loss in security is a by-product of any kind of token pooling. If a malicious client has held on to a token and stopped sending heartbeats, it might later be able to listen in on the messages being passed to the new client once the token is re-purposed. This isn't a problem if you're on a fully public site, but keep it in mind anyway.

    I will update this answer if and when I write this up as a library.