Tags: scala, mapreduce, gridgain

GridGain / Scala - Generate Jobs within existing Job


As a proof of concept, I'm building this extremely simple Twitter Friends crawler. Here's what it will do:

  1. Execute CrawlJob for Twitter account "twitter-user-1"
  2. Find all friends of "twitter-user-1"
  3. Execute CrawlJob for all friends of "twitter-user-1"

Here's what my code looks like so far:

def main(args: Array[String]): Unit = {
  scalar {
    // Run the initial crawl task and block until it completes
    grid.execute(classOf[CrawlTask], "twitter-user-1").get
  }
}

import java.util.{ArrayList, Collection}

class CrawlTask extends GridTaskNoReduceSplitAdapter[String] {

  // Each task is split into exactly one job for the given Twitter id
  def split(gridSize: Int, arg: String): Collection[GridJob] = {
    val jobs: Collection[GridJob] = new ArrayList[GridJob]()
    val initialCrawlJob = new CrawlJob()
    initialCrawlJob.twitterId = arg
    jobs.add(initialCrawlJob)
    jobs
  }

}

class CrawlJob extends GridJob {

  var twitterId: String = ""

  def cancel(): Unit = {
    println("cancel - " + twitterId)
  }

  def execute(): Object = {
    println("fetch friends for - " + twitterId)
    // Fetch and execute CrawlJobs for all friends
    null
  }

}

I have Java services prepared for all Twitter interaction. I need some examples to figure out how to create new jobs within an existing job and associate them with the original task.

Thanks | Srirangan


Solution

  • How did I get around this?

    Conceptually unite GridTasks and GridJobs: treat each task and its job as a single unit, so that a MySpecialGridTask always creates exactly one MySpecialGridJob.

    With that convention, it is easy to execute new GridTasks from within either the task or the job.

    In the example above:

    class CrawlJob extends GridJob {

      var twitterId: String = ""

      def cancel(): Unit = {
        println("cancel - " + twitterId)
      }

      def execute(): Object = {
        println("fetch friends for - " + twitterId)
        // Fetch friends, then execute one CrawlTask (and so one CrawlJob) per friend.
        // The friend ids are hard-coded here for illustration.
        grid.execute(classOf[CrawlTask], "twitter-user-2").get
        grid.execute(classOf[CrawlTask], "twitter-user-3").get
        null
      }

    }
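
One thing to watch with this pattern is that friend graphs contain cycles, so a job that spawns a task per friend can end up crawling the same account repeatedly. Below is a minimal plain-Scala sketch of the same recursion with a visited check. It does not use GridGain at all: `FriendService`, its sample data, and `Crawler` are hypothetical stand-ins, and the recursive `crawl` call marks the spot where `grid.execute(classOf[CrawlTask], friend).get` would go.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the Java Twitter services mentioned in the question.
object FriendService {
  private val graph = Map(
    "twitter-user-1" -> List("twitter-user-2", "twitter-user-3"),
    "twitter-user-2" -> List("twitter-user-1"),
    "twitter-user-3" -> List("twitter-user-2")
  )

  def friendsOf(id: String): List[String] = graph.getOrElse(id, Nil)
}

// Plain-Scala model of the "one task, one job" recursion (no grid involved).
object Crawler {
  // Accounts already crawled, in the order they were first seen.
  val visited = mutable.LinkedHashSet[String]()

  def crawl(twitterId: String): Unit = {
    // add returns false when the id is already present, which breaks cycles.
    if (visited.add(twitterId)) {
      println("fetch friends for - " + twitterId)
      // On a grid, this is where a new CrawlTask per friend would be executed.
      FriendService.friendsOf(twitterId).foreach(crawl)
    }
  }
}

object CrawlDemo {
  def main(args: Array[String]): Unit = {
    Crawler.crawl("twitter-user-1")
    println(Crawler.visited.mkString(", "))
  }
}
```

In this sketch the visited set lives in a single JVM. On a real grid the jobs may run on different nodes, so the set would need to live somewhere all nodes can reach; how to do that is left open here.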