hadoop mapreduce recordreader

How does the mapper's run() method process the last record?


public void run(Context context) throws IOException, InterruptedException {
    setup(context);

    while (context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
}

With the snippet above, when the mapper's run() method is called, each iteration fetches the next key/value pair from the RecordReader through nextKeyValue() and then processes the current key/value pair. So when we reach the last record of a particular InputSplit, wouldn't nextKeyValue() return false, so that we miss the last record of every InputSplit?


Solution

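  • nextKeyValue() either advances to the next key/value pair and returns true, or finds that the end of the split has been reached and returns false. So the last time nextKeyValue() returns true, getCurrentKey() and getCurrentValue() return the final key/value pair of the split, and map() is called on it. Only the following call, made after the last record has already been processed, returns false and ends the loop.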
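
To see the "advance, then report" behaviour in isolation, here is a minimal sketch that does not use Hadoop at all. ToyRecordReader and RunLoopDemo are hypothetical names made up for this illustration; the reader only imitates the contract described above, assuming it starts positioned before the first record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a record reader over one input split.
// This is NOT Hadoop's RecordReader, only an imitation of its contract.
class ToyRecordReader {
    private final List<String> records;
    private int index = -1; // assumption: positioned before the first record

    ToyRecordReader(List<String> records) {
        this.records = records;
    }

    // Advance first, then report whether a record is now current.
    boolean nextKeyValue() {
        if (index + 1 < records.size()) {
            index++;
            return true;
        }
        return false; // nothing left; the current position is not moved
    }

    int getCurrentKey() {
        return index;
    }

    String getCurrentValue() {
        return records.get(index);
    }
}

class RunLoopDemo {
    // Same shape as the run() loop in the question: every "true" from
    // nextKeyValue() is followed by exactly one use of the current pair.
    static List<String> run(ToyRecordReader reader) {
        List<String> seen = new ArrayList<>();
        while (reader.nextKeyValue()) {
            seen.add(reader.getCurrentKey() + "=" + reader.getCurrentValue());
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [0=a, 1=b, 2=c]
        System.out.println(run(new ToyRecordReader(Arrays.asList("a", "b", "c"))));
    }
}
```

With three records, nextKeyValue() returns true three times and false on the fourth call, so the last record ("c") is handled before the loop exits.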