As a proof of concept, I'm building this extremely simple Twitter Friends crawler. Here's what it will do:
Here's what my code looks like so far:
def main(args: Array[String]) {
  scalar {
    grid.execute(classOf[CrawlTask], "twitter-user-1").get
  }
}

class CrawlTask extends GridTaskNoReduceSplitAdapter[String] {
  def split(gridSize: Int, arg: String): Collection[GridJob] = {
    val jobs: Collection[GridJob] = new ArrayList[GridJob]()
    val initialCrawlJob = new CrawlJob()
    initialCrawlJob.twitterId = arg
    jobs.add(initialCrawlJob)
    jobs
  }
}

class CrawlJob extends GridJob {
  var twitterId: String = new String()

  def cancel() = {
    println("cancel - " + twitterId)
  }

  def execute(): Object = {
    println("fetch friends for - " + twitterId)
    // Fetch and execute CrawlJobs for all friends
    return null
  }
}
I have Java services prepared for all Twitter interaction. What I need are some examples showing how to create new jobs from within an existing job and associate them with the original Task.
Thanks | Srirangan
How did I get around this?
Conceptually unite GridTasks and GridJobs: each MySpecialGridTask has exactly one MySpecialGridJob.
With that one-to-one pairing, it is easy to execute new GridTasks from inside either the Task or the Job.
In the example above:
class CrawlJob extends GridJob {
  var twitterId: String = new String()

  def cancel() = {
    println("cancel - " + twitterId)
  }

  def execute(): Object = {
    println("fetch friends for - " + twitterId)
    // Fetch and execute CrawlJobs for all friends
    // Execute Job Here
    grid.execute(classOf[CrawlTask], "twitter-user-2").get
    grid.execute(classOf[CrawlTask], "twitter-user-3").get
    return null
  }
}
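One thing to watch with this approach: friend graphs contain cycles, so a job that spawns a task per friend can recurse forever unless something tracks who has already been crawled. Below is a minimal sketch of that control flow in plain Scala, with no GridGain dependency. FriendService, LocalCrawler and maxDepth are hypothetical names made up for illustration; FriendService stands in for the Java Twitter services, and each recursive crawl call stands in for one blocking grid.execute(classOf[CrawlTask], id).get.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the Java Twitter services mentioned in the question.
trait FriendService {
  def friendsOf(twitterId: String): Seq[String]
}

// Hypothetical local model of the "one task wraps one job" pattern.
// Each call to crawl() plays the role of one blocking
// grid.execute(classOf[CrawlTask], id).get call.
class LocalCrawler(service: FriendService, maxDepth: Int) {
  // Insertion-ordered set of users that have already been crawled.
  private val visited = mutable.LinkedHashSet[String]()

  def crawl(twitterId: String, depth: Int = 0): Unit = {
    // add() returns false if the id was already present, which breaks cycles;
    // the depth check bounds how far the crawl fans out.
    if (depth <= maxDepth && visited.add(twitterId)) {
      service.friendsOf(twitterId).foreach(friend => crawl(friend, depth + 1))
    }
  }

  def crawled: Seq[String] = visited.toSeq
}

object CrawlSketch {
  def main(args: Array[String]): Unit = {
    // Made-up friend graph containing a cycle (user-2 follows user-1 back).
    val graph = Map(
      "twitter-user-1" -> Seq("twitter-user-2", "twitter-user-3"),
      "twitter-user-2" -> Seq("twitter-user-1"),
      "twitter-user-3" -> Seq("twitter-user-2")
    )
    val service = new FriendService {
      def friendsOf(twitterId: String): Seq[String] =
        graph.getOrElse(twitterId, Seq.empty)
    }
    val crawler = new LocalCrawler(service, maxDepth = 2)
    crawler.crawl("twitter-user-1")
    println(crawler.crawled.mkString(", "))
  }
}
```

On a real grid the jobs run on different nodes, so an in-process set like visited would not be enough by itself; the sketch only shows the control flow, not how that state would be shared.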