apache-spark, pyspark, apache-spark-sql, spark-jdbc

How to use PySpark to write to JDBC without column names


My question is really simple.

I'm using PySpark to export a Hive table to SQL Server.

I found that the column names were exported as data rows in the SQL Server table.

I just want to do it without column names.


I don't want these header rows in the table...

Here is my PySpark code:

df.write.jdbc(
    "jdbc:sqlserver://10.8.12.10;instanceName=sql1",
    "table_name",
    "overwrite",
    {"user": "user_name", "password": "111111", "database": "Finance"},
)

Is there an option to skip column names?


Solution

  • I think the JDBC connector isn't actually what adds those header rows. The header values are already present in your DataFrame; this is a known problem when reading data from a Hive table whose underlying files contain a header row.

    If you're using SQL to load the data from Hive, you can try filtering out the header row with a condition of the form col != 'col', i.e. excluding rows whose value equals the column's own header text:

    # adapt the condition by verifying what df.show() displays
    df = spark.sql("select * from my_table where sold_to_party != 'Sold-To Party'")
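
    An equivalent filter with the DataFrame API (a minimal sketch; my_table and sold_to_party are just the example names from above, so adapt them to your schema):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # drop any row whose value is literally the column's own header text
    df = spark.table("my_table").filter(F.col("sold_to_party") != "Sold-To Party")

    If the Hive table is backed by text/CSV files that include a header row, you can also fix the root cause on the table itself with TBLPROPERTIES ("skip.header.line.count"="1"), so the header is never read as data in the first place.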