I've seen Hadoop-on-Demand and the Hadoop integration on SGE. My understanding is that both require admin privileges, which I don't have on the big cluster at work. The admins have their hands full and won't be able to set us up for months.
I recognize the limits a transient virtual cluster puts on the utility of HDFS. I also understand that using a Lustre file system goes against the grain, but has anyone written either SGE or Torque (PBS) scripts to submit a job to a cluster that starts up a Hadoop instance?
See MyHadoop: http://www.sdsc.edu/~allans/MyHadoop.pdf
Bad link. Article available here: http://archive.futuregrid.org/sites/default/files/myHadoop.pdf
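For anyone who wants to see the general shape of the idea before reading the paper: inside a Torque job, the scheduler exposes the allocated hosts through the file named by `$PBS_NODEFILE`, usually with one line per allocated core, so hostnames repeat. A per-user Hadoop instance can be configured from that list without admin rights. Below is a minimal sketch, not taken from myHadoop; the function names and the `hadoop-conf` default are hypothetical, and it assumes a Hadoop 1.x style configuration directory where the `slaves` file lists worker hosts.

```python
import os
from pathlib import Path


def unique_hosts(nodefile_text):
    """Return the allocated hostnames in first-seen order, without repeats."""
    hosts = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def write_slaves_file(hosts, conf_dir):
    """Write a `slaves` file into conf_dir and return the chosen master host.

    Assumption: the first allocated node acts as master and every node,
    including the first, also runs worker daemons.
    """
    if not hosts:
        raise ValueError("no hosts allocated")
    conf = Path(conf_dir)
    conf.mkdir(parents=True, exist_ok=True)
    (conf / "slaves").write_text("\n".join(hosts) + "\n")
    return hosts[0]


if __name__ == "__main__":
    # PBS_NODEFILE is set by Torque inside a running job; outside a job
    # this script does nothing.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        hosts = unique_hosts(Path(nodefile).read_text())
        # HADOOP_CONF_DIR falls back to a hypothetical local directory.
        conf_dir = os.environ.get("HADOOP_CONF_DIR", "hadoop-conf")
        master = write_slaves_file(hosts, conf_dir)
        # The master hostname would still need to go into the site config
        # (in Hadoop 1.x, fs.default.name and mapred.job.tracker).
        print("master:", master)
```

The job script would run this first, then fill in the site configuration with the master hostname and start the daemons from the user's own Hadoop install. On SGE the host list comes from `$PE_HOSTFILE` instead, which has a different line format, so the parsing step would need adjusting.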