Tags: apache-spark, pyspark, azure-databricks

How to check if schema of two dataframes are same in pyspark?


I have a dataframe df1 with columns a, b, c, d whose datatypes are int, int, int, int respectively.

I have a dataframe df2 with columns a, b, e, c, d whose datatypes are int, int, string, int, int respectively.

I need to check whether these two dataframes have the same schema or not; in this case they do not. What is the easiest way to do this in PySpark?
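For reference, a minimal sketch of the two dataframes described above (the column names and types come from the question; the literal row values are made up for illustration):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # df1: columns a, b, c, d, all int (schema given as a DDL string to force int types)
    df1 = spark.createDataFrame([(1, 2, 3, 4)], "a int, b int, c int, d int")

    # df2: columns a, b, e, c, d, where e is a string
    df2 = spark.createDataFrame([(1, 2, "x", 3, 4)], "a int, b int, e string, c int, d int")

    print(df1.dtypes)  # [('a', 'int'), ('b', 'int'), ('c', 'int'), ('d', 'int')]
    print(df2.dtypes)  # [('a', 'int'), ('b', 'int'), ('e', 'string'), ('c', 'int'), ('d', 'int')]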


Solution

  • Convert each dataframe's dtypes list to a set and compare them, for example via the set difference (a fuller usage sketch follows below):

    len(set(df1.dtypes).difference(set(df2.dtypes))) == 0
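As a usage sketch, assuming the df1 and df2 built above: a symmetric set equality avoids the edge case where one schema is a subset of the other (a one-sided difference would then be empty even though the schemas differ). Note that comparing sets of dtypes ignores column order and nullability; the helper name same_schema below is just an illustration, not a PySpark API.

    def same_schema(left, right):
        # True when both dataframes expose the same set of (column, dtype) pairs.
        # Use left.schema == right.schema for a stricter check that also
        # considers column order and nullability.
        return set(left.dtypes) == set(right.dtypes)

    print(same_schema(df1, df2))  # False: df2 has an extra string column 'e'
    print(same_schema(df1, df1))  # True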