amazon-web-services web-scraping irc serverless

Is an AWS serverless app able to keep listening to IRC channels?


I am working on a project that uses AWS Serverless Express to get messages from IRC channels. I am using Node.js (v10) and the node-irc package.

It works like this: I POST the channel name to the backend, which joins the IRC channel and processes the incoming messages. Here is some of my code:

router.post('/message', function (req, res) {
  console.log(req.body);
  var channel_name = req.body.channel_name;
  var channel_id = req.body.channel_id;
  var irc = require('irc');
  var logger = function logger() {
    console.log(arguments);
  };
  var instance;
  var channelTag = '#' + channel_name;

  // Log otherwise-unhandled errors
  process.on('unhandledRejection', logger);
  process.on('uncaughtException', logger);

  try {
    instance = new irc.Client('irc.server.name', 'username', {
      userName: 'username',
      realName: 'realname',
      password: 'pwd',
      channels: [channelTag],
      port: 6667,
      autoRejoin: true,
      autoConnect: false,
      secure: false,
      selfSigned: false,
      certExpired: false,
      stripColors: true,
      encoding: 'UTF-8',
      debug: true
    });
    console.log('instance');
    instance.connect();
    // Log every message received on the joined channel
    instance.addListener('message', function (from, to, message) {
      console.log('log: ', message);
    });
  } catch (ex) {
    logger(ex);
  }

  // The response is sent right away; the listener is expected to keep running
  console.log('mark', channel_name);
  res.send(channel_name);
});

When I run it on my own laptop (npm start), it works well and logs the IRC messages to the console.

But when I use SAM (AWS Serverless Application Model) and run sam local start-api to test it locally, the code seems to run for only about a second and cannot keep listening to the channel.

Fetching lambci/lambda:nodejs8.10 Docker container image......
2019-07-28 10:58:53 Mounting /Users/apple/Public/basic-starter-2 as /var/task:ro,delegated inside runtime container
START RequestId: dba68ff9-9cc8-194d-8aaf-8cf60770eb8c Version: $LATEST
2019-07-28T17:58:58.249Z        dba68ff9-9cc8-194d-8aaf-8cf60770eb8c    { channel_id: '181866881', channel_name: 'sae_jin' }
2019-07-28T17:58:58.388Z        dba68ff9-9cc8-194d-8aaf-8cf60770eb8c    instance
2019-07-28T17:58:58.393Z        dba68ff9-9cc8-194d-8aaf-8cf60770eb8c    28 Jul 17:58:58 - SEND: #sae_jin lol
2019-07-28T17:58:58.394Z        dba68ff9-9cc8-194d-8aaf-8cf60770eb8c    mark  sae_jin
END RequestId: dba68ff9-9cc8-194d-8aaf-8cf60770eb8c
REPORT RequestId: dba68ff9-9cc8-194d-8aaf-8cf60770eb8c  Duration: 3632.46 ms    Billed Duration: 3700 ms        Memory Size: 1024 MB     Max Memory Used: 58 MB
2019-07-28 10:58:58 127.0.0.1 - - [28/Jul/2019 10:58:58] "POST /scrape HTTP/1.1" 200 -

Is this because an AWS serverless app simply cannot do this? If so, are there any other options? Do I need to use EC2 instead?


Solution

  • You should not use serverless for tasks that need to run continuously. It is not designed for that kind of use case, and there are several practical reasons why it is a bad idea:

    • The runtime of a single Lambda invocation is capped at 15 minutes
    • The lifetime of a single Lambda container (as they are reused across invocations) is somewhere in the 6-10 hour range
    • Lambda's pricing model is based on the number of invocations and the total runtime. The invocation cost is extremely small, but the runtime cost is very high compared to alternatives, so a "continuously re-invoked" Lambda function is very cost-ineffective

    The practical reason why your code only runs for a short while is that code execution is suspended as soon as the function returns the response. The IRC connection and its message listener are frozen along with everything else, so you cannot have a Lambda function running in the "background".

    An EC2 instance is the baseline solution, but a more modern option that runs containers without the need to provision your own servers is AWS Fargate.
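
    As a rough illustration of the long-running alternative, here is a minimal sketch of the same listener as a standalone Node.js process that could run on an EC2 instance or in a Fargate task. It reuses the node-irc settings and placeholder credentials from the question. The file name worker.js, the helper names buildClientOptions and startListener, and the IRC_CHANNEL environment variable are assumptions made for this example, not part of any AWS or node-irc convention.

```javascript
// worker.js: a long-running IRC listener for EC2 or a Fargate task (sketch).

// Build node-irc client options, mirroring the settings from the question.
function buildClientOptions(channelName) {
  const channelTag = channelName.startsWith('#') ? channelName : '#' + channelName;
  return {
    userName: 'username', // placeholder, as in the question
    realName: 'realname', // placeholder
    password: 'pwd',      // placeholder
    channels: [channelTag],
    port: 6667,
    autoRejoin: true,
    stripColors: true,
  };
}

// Connect and keep listening until the process is told to stop.
function startListener(channelName) {
  // node-irc is loaded only when the worker actually starts
  const irc = require('irc');
  const client = new irc.Client(
    'irc.server.name', // placeholder server, as in the question
    'username',
    buildClientOptions(channelName)
  );

  client.addListener('message', (from, to, message) => {
    console.log('log:', from, to, message);
  });

  // An 'error' event without a listener would crash the process.
  client.addListener('error', (err) => console.error('irc error:', err));

  // SIGTERM is the usual stop signal for a container or service; leave cleanly.
  process.on('SIGTERM', () => {
    client.disconnect('shutting down', () => process.exit(0));
  });

  return client;
}

// IRC_CHANNEL is a hypothetical environment variable chosen for this sketch.
if (process.env.IRC_CHANNEL) {
  startListener(process.env.IRC_CHANNEL);
}
```

    Started with something like IRC_CHANNEL=sae_jin node worker.js, the process stays alive because the open IRC socket keeps the Node.js event loop running, which is exactly what a Lambda invocation cannot offer once it has returned its response.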