Tags: python, pyspark, databricks

Can I assign a literal value for a list of columns in pyspark?


Let's say I have `list1 = ["a", "b", "c"]` and an arbitrary dataframe `df`. I want to add the columns `a`, `b`, and `c` to `df`, each with a constant value of 1. Is there any way of doing this for the whole list of columns, instead of typing `.withColumn('a', lit(1))` for each one?


Solution

  • For pyspark >= 3.3.0 you can use withColumns, which accepts a dict mapping column names to Column expressions:

    from pyspark.sql import functions as F

    df = df.withColumns(dict.fromkeys(list1, F.lit(1)))
    

    Alternatively, for pyspark < 3.3.0, use a list comprehension inside select to add all the columns at once:

    df = df.select('*', *[F.lit(1).alias(c) for c in list1])
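The dict.fromkeys trick is plain Python: it maps every key in the list to the same value, which is exactly the {column_name: expression} mapping that withColumns expects. A minimal sketch, using a plain int as a stand-in for F.lit(1) so it runs without a Spark session:

```python
list1 = ["a", "b", "c"]

# dict.fromkeys(keys, value) builds a dict where every key maps
# to the single value passed as the second argument.
mapping = dict.fromkeys(list1, 1)
print(mapping)  # {'a': 1, 'b': 1, 'c': 1}
```

Note that every key shares the same value object; that is harmless here because a lit(1) Column expression is not mutated when it is attached to a dataframe.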