Hive is not handling integer values properly when loading text data into table

I was loading some text data into Apache Hive containing int columns. It was storing null values at unexpected places. So, I ran some tests:

create table testdata (c1 INT, c2 FLOAT) row format delimited fields terminated by ',' stored as textfile;
load data local inpath "testdata.csv" overwrite into table testdata;
select * from testdata;

testdata.csv contains this data:

1,1.0
 1, 1.0
1 ,1.0
 1 , 1.0

As you can see, dataset contains some extra whitespace around numbers. But this is causing hive to store null values in integer columns, while float is being parsed correctly.

Select query output:

1       1.0
NULL    1.0
NULL    1.0
NULL    1.0

Why this is happening so, and how to correctly handle these cases?

Solution

You can not do it in one step. First load the data as string in stg table and then load into final table from stg table by removing space.

Create and load table like below.

create table testdata (c1 string, c2 string) row format delimited fields terminated by ',' stored as textfile;
create table stgtestdata as select * from testdata;
load data local inpath "testdata.csv" overwrite into table stgtestdata;

Use insert to load into final table by trimming space and convert properly like below

Insert overwrite testdata
select 
Cast(trim(c1) as int) as c1,
Cast(trim(c2) as float) as c2
from stgtestdata;