I have an interesting situation whereby I need to pivot (if that's the actual term) some data but effectively on only half of the dataset eg
Current example dataset
Name--------Address----------DOB----Ind1----Ind2----Ind3
Joe Bloggs---2 Fleet Street-----270784--7--------25.6-----14
Pete Gribby--32 Egg Road------050577-13-------21.6-----11
Required output
Name--------Address---------DOB-----Value-----ColName
Joe Bloggs---2 Fleet Street----270784---7---------Ind1
Joe Bloggs---2 Fleet Street----270784---25.6------Ind2
Joe Bloggs---2 Fleet Street----270784---14--------Ind3
Pete Gribby--32 Egg Road-----050577---13--------Ind1
Pete Gribby--32 Egg Road-----050577---21.6------Ind2
Pete Gribby--32 Egg Road-----050577---11--------Ind3
I'm not usually a Python person, in fact have very little experience with it, but due to the technology I'm currently using I can't use .Net Spark.
I attempted to mock something up in C# to get a feel of the process required, mainly taking the "Ind" columns, pivoting them then creating a new dataset with the repeating data, but that just got ugly.
I've also had a look at the explode function though similar questions on here, but can't quite see how to get things moving in this example
So my question is, does anyone have any points/advice/code snippets in Python to help with this conundrum?
Regards
You use below unpivot expression for your required results.
df.select("Name", "Address", "DOB",expr("stack(3,'Ind1',cast(Ind1 as double),'Ind2', cast(Ind2 as double),'Ind3',cast(Ind3 as double)) as (ColName,Value)"))