Using Scala here:
val df = spark.read.format("jdbc").
option("url", "<host url>").
option("dbtable", "UPPERCASE_SCHEMA.table_name").
option("user", "postgres").
option("password", "<password>").
option("numPartitions", 50).
option("fetchsize", 20).
load()
The database that the code above reads from has many schemas, and all of them are named in uppercase letters (e.g. UPPERCASE_SCHEMA).
No matter how I try to indicate that the schema name is in all caps, Spark converts it to lowercase, and the query then fails against the actual DB.
I've tried putting the name in a variable, explicitly marking it as uppercase, and so on, in multiple languages, but with no luck.
Would anyone know a workaround?
When I went into the actual DB (Postgres) and temporarily changed the schema to all lowercase, it worked absolutely fine.
Try setting spark.sql.caseSensitive to true (it is false by default):
spark.conf.set("spark.sql.caseSensitive", true)
You can see the definition of this setting in the source code: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L833
In addition, the JDBCWriteSuite shows how it affects the JDBC connector:
https://github.com/apache/spark/blob/ee95ec35b4f711fada4b62bc27281252850bb475/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCWriteSuite.scala
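For reference, here is a minimal sketch that combines the setting with the read from the question. The object name and app name are hypothetical, and the connection values are the placeholders from the question. The commented-out alternative is an assumption of mine and is not covered by the links above: PostgreSQL folds unquoted identifiers to lowercase, so if the setting alone does not help, quoting the schema name inside dbtable may be worth trying.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, only here to make the sketch self-contained.
object UppercaseSchemaRead {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("uppercase-schema-read") // hypothetical app name
      .getOrCreate()

    // Apply the setting before building the DataFrame.
    spark.conf.set("spark.sql.caseSensitive", true)

    val df = spark.read.format("jdbc")
      .option("url", "<host url>")
      .option("dbtable", "UPPERCASE_SCHEMA.table_name")
      // Assumption, untested here: quote the schema so PostgreSQL keeps its case.
      // .option("dbtable", "\"UPPERCASE_SCHEMA\".table_name")
      .option("user", "postgres")
      .option("password", "<password>")
      .option("numPartitions", 50)
      .option("fetchsize", 20)
      .load()

    // Forces Spark to resolve the table, so a wrong schema name fails here.
    df.printSchema()

    spark.stop()
  }
}
```

If the quoted variant is what ends up working, that would suggest the lowercase folding happens on the Postgres side rather than in Spark.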