What is the difference between this:
CREATE TABLE dbo.table1(
id INT,
firstname STRING,
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS /usr/hive/sample_data.csv
;
and
CREATE TABLE dbo.table1(
id INT,
firstname STRING
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
;
then
LOAD DATA INPATH '/usr/hive/sample_data.csv' INTO dbo.table1;
First create sentence is completely wrong one: STORED AS should be TEXTFILE, ORC, Parquet, etc, it is not location and of course when you create table you should not provide file name. Tables in hive are created on locations(folders), not files and the property for tables location is LOCATION, not STORED AS. See recent example: https://stackoverflow.com/a/68095278/2700344
Second create sentence creates table without location specified (default loacation will be used for managed tables, like this /user/hive/warehouse/dbo/table1 ), see here more details https://stackoverflow.com/a/67073849/2700344
Execute DESC FORMATTED dbo.table1
and check LOCATION.
Yes, you need load data to be executed because your file is located not in the table location. If you place file into some dedicated location for that table, you can CREATE EXTERNAL TABLE and specify LOCATION. But your file is currently in such folder which should not be used as a table location: /usr/hive. This /usr/hive/table1 looks much better. Alternatvely you can create table like in second CREATE sentence and just copy file into it's location using hadoop fs cp
command. LOAD DATA does the same.