I need to add two columns to a DataFrame in PySpark. I am using the following select statement:
df.select("*",[sha2(c,256).alias("hashed_"+c) for c in f_pseudo_cols])
Here I am selecting all existing columns and adding two more columns computed with the sha2() function.
f_pseudo_cols contains two columns, named "Swis code" and "Roll year", both of which are present in df (the DataFrame).
I am getting the following error:
'Invalid argument, not a string or column: Row(field_name='Swis Code') of type . For column literals, use 'lit', 'array', 'struct' or 'create_map' function.'
I tried converting it to a string with the str() function, but that did not work either.
This should work.
import pyspark.sql.functions as F
from pyspark import SparkContext, SQLContext
# Start a local Spark context and wrap it in a SQLContext
sc = SparkContext('local')
sqlContext = SQLContext(sc)
data1 = [
(10087, "BH", "L", "D"),
(10066, "BS", "B", "null"),
(10094, "BL", "L", "E"),
(10080, "BF", "B", "null")
]
df1Columns = ["ID","CODE","TYP","KIND"]
df1 = sqlContext.createDataFrame(data=data1, schema = df1Columns)
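# Note: F.lit(c) turns the column *name* into a literal, so every row gets the same hash.
# To hash the values stored in each column, use F.col(c).cast("string") in place of F.lit(c).
hexed_df1 = df1.select([F.sha2(F.lit(c),256).alias("hashed_"+c) for c in df1Columns])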
print("hexed_df1 dataframe")
hexed_df1.show(truncate=False)
Output:
hexed_df1 dataframe
+----------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------------------------+
|hashed_ID |hashed_CODE |hashed_TYP |hashed_KIND |
+----------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------------------------+
|3843971dcfdee5083e6289e1bbdbb003e538b5a8a668fc43ae4f19d415ac18a2|07a9d7b4a9a23915a61bc89bb0357bf47b348cf4174eb965bb1df8fbfa18b0b5|3909713b1e306608da35d4e3b7d0a72cbe7bee7f99c041f134a233740a4e8ccd|ae97bccd529278e7c12624025e56b3034e5afca568f579f6ef5e04f900fef2bb|
|3843971dcfdee5083e6289e1bbdbb003e538b5a8a668fc43ae4f19d415ac18a2|07a9d7b4a9a23915a61bc89bb0357bf47b348cf4174eb965bb1df8fbfa18b0b5|3909713b1e306608da35d4e3b7d0a72cbe7bee7f99c041f134a233740a4e8ccd|ae97bccd529278e7c12624025e56b3034e5afca568f579f6ef5e04f900fef2bb|
|3843971dcfdee5083e6289e1bbdbb003e538b5a8a668fc43ae4f19d415ac18a2|07a9d7b4a9a23915a61bc89bb0357bf47b348cf4174eb965bb1df8fbfa18b0b5|3909713b1e306608da35d4e3b7d0a72cbe7bee7f99c041f134a233740a4e8ccd|ae97bccd529278e7c12624025e56b3034e5afca568f579f6ef5e04f900fef2bb|
|3843971dcfdee5083e6289e1bbdbb003e538b5a8a668fc43ae4f19d415ac18a2|07a9d7b4a9a23915a61bc89bb0357bf47b348cf4174eb965bb1df8fbfa18b0b5|3909713b1e306608da35d4e3b7d0a72cbe7bee7f99c041f134a233740a4e8ccd|ae97bccd529278e7c12624025e56b3034e5afca568f579f6ef5e04f900fef2bb|
+----------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------------------------+