
Hadoop jobs not killed


My jobs hit an exception before the map-reduce steps, but the jobs are not killed. How can I configure Hadoop so that jobs are killed after an exception?

Invoking Main class now

Heart beat Heart beat

Invocation of Main class completed

Oozie Launcher ends

stderr logs

org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Io exception: Unknown host specified )
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82)
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:577)
    at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:792)
    at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:815)
    at com.seven.crcs.export.dao.ReportDAOImpl.recreateReportEntity(ReportDAOImpl.java:151)
    at com.seven.crcs.export.dao.ReportDAOImpl.saveActiveUserCount(ReportDAOImpl.java:93)
    at com.seven.crcs.export.ReportJdbcExporter.saveActiveUserCount(ReportJdbcExporter.java:55)
    at com.seven.dataprocessor.oc.jobs.reports.export.day.ExportDailyUserReducer.exportUserCounts(ExportDailyUserReducer.java:32)
    at com.seven.dataprocessor.oc.jobs.reports.export.ExportActiveUser
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Io exception: Unknown host specified )

And the job client log:

2013-02-28 06:06:46,487 INFO org.apache.hadoop.mapred.JobClient: Task Id : attempt_201302270945_0181_r_000000_0, Status : FAILED
2013-02-28 06:07:00,600 INFO org.apache.hadoop.mapred.JobClient: Task Id : attempt_201302270945_0181_r_000000_1, Status : FAILED
2013-02-28 06:07:16,650 INFO org.apache.hadoop.mapred.JobClient: Task Id : attempt_201302270945_0181_r_000000_2, Status : FAILED
2013-02-28 06:07:31,731 INFO org.apache.hadoop.mapred.JobClient: Job complete: job_201302270945_0181

But the jobs still finish with status SUCCEEDED.


Solution

  • Your job was actually terminated, but only after 3 failed attempts of the reduce task (the _r_ in each task ID marks a reduce attempt), as the task IDs show:

    • attempt_201302270945_0181_r_000000_0
    • attempt_201302270945_0181_r_000000_1
    • attempt_201302270945_0181_r_000000_2

    You can limit the maximum number of attempts for each reduce task either by setting the parameter mapred.reduce.max.attempts to 1 or by calling JobConf#setMaxReduceAttempts(int). The map-side equivalents are mapred.map.max.attempts and JobConf#setMaxMapAttempts(int).

    This will cause the reduce task to fail on the first exception and thus terminate your job a little faster.
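
    As a minimal sketch, the attempt limit could be expressed as a job configuration property like the one below. It assumes the Hadoop 1.x property names and that the failing task is the reducer (the _r_ in the attempt IDs above); the map-side property is included only for symmetry. Where the fragment goes (a configuration file the driver loads, or a -D option if the driver uses ToolRunner) depends on how your job is launched, so treat the placement as an assumption.

    ```xml
    <!-- Assumed Hadoop 1.x property names; allow a single attempt per task -->
    <property>
      <name>mapred.reduce.max.attempts</name>
      <value>1</value>
    </property>
    <property>
      <name>mapred.map.max.attempts</name>
      <value>1</value>
    </property>
    ```

    With a value of 1, a task that throws is not retried, so a transient error such as a brief database outage will also fail the job immediately.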