I am trying to add a new String column to a DataFrame with a default value of null (a non-null value will be applied later).
Here is my code:
.withColumn("column-name", lit(null: String))
This creates a column with the Void type, which I do not want.
What is the easiest way to create a column of type String with a null default value?
Note: the structure of the set of jobs is set in stone, and I am leaving this company very soon, so I am not interested in arguing that the code should be restructured. I just want to give them the code they have asked for with the least fuss.
Note also that we aren't using a code-defined schema anywhere; it is pure schema inference.
You can use lit with null, then cast it to your desired type.
Example
df.withColumn("test", lit(null).cast(StringType))
Output
+---+----+
|id |test|
+---+----+
|1 |null|
|2 |null|
|3 |null|
+---+----+
Schema
root
|-- id: integer (nullable = false)
|-- test: string (nullable = true)
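For completeness, here is a minimal self-contained sketch of the same approach, with the imports it needs. The object name NullStringColumnSketch, the helper withNullStringColumn, the local SparkSession settings and the three-row sample data are all made up for illustration and are not part of your job. As far as I can tell, the reason lit(null: String) does not help is that the type ascription only exists at compile time; lit accepts Any, so at runtime Spark just sees an untyped null, and the explicit cast is what gives the column its String type.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.StringType

// Hypothetical wrapper object, used only to make the snippet runnable on its own.
object NullStringColumnSketch {

  // Adds a column called `name` whose values are all null,
  // typed as String by the explicit cast.
  def withNullStringColumn(df: DataFrame, name: String): DataFrame =
    df.withColumn(name, lit(null).cast(StringType))

  def main(args: Array[String]): Unit = {
    // Local session for the demo only; a real job would reuse its existing session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-string-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data standing in for the real DataFrame.
    val df = Seq(1, 2, 3).toDF("id")
    val result = withNullStringColumn(df, "test")

    result.printSchema()
    result.show(false)

    spark.stop()
  }
}
```

In an existing job you would only need the two imports and the withColumn line; cast("string") should work in place of cast(StringType) if you would rather avoid the extra import.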
Good luck!