I want to hold File A in the memory of reducer1 and File B in the memory of reducer2. Is this possible using the Distributed Cache in Hadoop? If not, is there any other way to achieve this?
Thanks
Yes, if the files are reasonably small, you can put them in the distributed cache. See this link: http://developer.yahoo.com/hadoop/tutorial/module5.html#auxdata. It might be useful to you.
In the portion of code below, it is up to you which file each reducer works on.
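// Look up the files that the distributed cache has copied to this node.
Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
if (null != cacheFiles && cacheFiles.length > 0) {
    for (Path cachePath : cacheFiles) {
        // Load only the file this task needs, matched by file name.
        if (cachePath.getName().equals(stopwordCacheName)) {
            loadStopWords(cachePath);
            break;
        }
    }
}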
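To pick a different file in each reducer, one option is to map the reducer's partition number to a file name and load only the matching cache file. The following is a minimal sketch in plain Java, not a tested Hadoop job: the class name CacheFileSelector, the file names fileA.txt and fileB.txt, and the partition-to-file mapping are all hypothetical. In a real job the partition number would have to come from the task itself (for example, I believe the old API exposes it as the "mapred.task.partition" property, but treat that as an assumption to verify).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class CacheFileSelector {
    // Hypothetical mapping: index = reducer partition, value = cache file name.
    static final List<String> FILES_BY_PARTITION =
            Arrays.asList("fileA.txt", "fileB.txt");

    // Returns the cache file name this reducer should load, if it is present
    // among the locally cached file names.
    static Optional<String> selectFor(int partition, List<String> localCacheFileNames) {
        if (partition < 0 || partition >= FILES_BY_PARTITION.size()) {
            return Optional.empty(); // no file assigned to this partition
        }
        String wanted = FILES_BY_PARTITION.get(partition);
        return localCacheFileNames.stream().filter(wanted::equals).findFirst();
    }

    public static void main(String[] args) {
        // Stand-in for the names obtained from cachePath.getName() in the loop above.
        List<String> cached = Arrays.asList("fileB.txt", "fileA.txt");
        System.out.println(selectFor(0, cached).orElse("none")); // fileA.txt
        System.out.println(selectFor(1, cached).orElse("none")); // fileB.txt
    }
}
```

Keep in mind that the partitioner decides which keys reach which reducer, so the keys that need File A must actually be routed to the reducer that loads it.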
See if it helps.