I am trying to create a table in the Glue catalog, with an S3 path as its location, from Spark running on EMR with Hive support. I have tried the following commands, but I get this error:
pyspark.sql.utils.AnalysisException: u'java.lang.IllegalArgumentException: Can not create a Path from an empty string;'
sparksession.sql("CREATE TABLE IF NOT EXISTS abc LOCATION 's3://my-bucket/test/' as (SELECT * from my_table)")
sparksession.sql("CREATE TABLE abc STORED AS PARQUET LOCATION 's3://my-bucket/test/' AS select * from my_table")
sparksession.sql("CREATE TABLE abc as SELECT * from my_table USING PARQUET LOCATION 's3://my-bucket/test/'")
Can someone please suggest the parameters that I am missing?
The issue happens when the database was created without a location specified:
CREATE DATABASE db_name;
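To check whether an existing database is affected, one option is to read its location from DESCRIBE DATABASE. This is a minimal sketch, assuming `spark` is an active SparkSession backed by the Glue catalog. The helper name `database_location` is made up for illustration, and rows are read by position because the output column names differ between Spark versions:

```python
def database_location(spark, db_name):
    """Return the location recorded for db_name, or None if it is missing or empty."""
    rows = spark.sql("DESCRIBE DATABASE {}".format(db_name)).collect()
    for row in rows:
        # Each row is an (item, value) pair, e.g. ("Location", "s3://...").
        if row[0] == "Location":
            return row[1] or None
    return None
```

If this returns None for the database you are writing to, that would be consistent with the "empty string" path error above.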
To fix the issue, specify a location when creating the database:
CREATE DATABASE db_name LOCATION 's3://my-bucket/db_path';
Then, create a table:
USE db_name;
CREATE TABLE IF NOT EXISTS abc LOCATION 's3://my-bucket/db_path/abc' as (SELECT * from my_table);
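The same steps can be run from PySpark, as in the question. This is a rough sketch rather than a definitive implementation: the helper `create_table_in_located_db` and its parameters are hypothetical names, and it assumes `spark` is a SparkSession created with Hive support and that the source table is visible after switching databases (it may need to be qualified with its own database name):

```python
def create_table_in_located_db(spark, db_name, db_location, table, source_table):
    """Create a database with an explicit location, then create a table from a query."""
    base = db_location.rstrip("/")
    statements = [
        # The explicit LOCATION on the database is the actual fix.
        "CREATE DATABASE {} LOCATION '{}'".format(db_name, base),
        "USE {}".format(db_name),
        # Table data is placed under the database path, as in the SQL above.
        "CREATE TABLE IF NOT EXISTS {} LOCATION '{}/{}' AS SELECT * FROM {}".format(
            table, base, table, source_table
        ),
    ]
    for statement in statements:
        spark.sql(statement)
    return statements
```

Calling `create_table_in_located_db(sparksession, "db_name", "s3://my-bucket/db_path", "abc", "my_table")` issues the three statements shown above. Note that CREATE DATABASE fails if the database already exists, so an existing database without a location would have to be dropped and recreated, or a new database name used.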