I have a sparse dataframe and would like to get all non empty values as a single column. See the image that I made up to illustrate the problem. I somehow managed to solve it using the python code below. However, I feel there migh be some better | simpler | efficient way to solve it
import pandas as pd
list1 = ["x1","x2","?","?","?","?"]
list2 = ["?","?","y1","y2","?","?"]
list3 = ["?","?","?","?","z1","z2"]
df_sparse = pd.DataFrame({"A":list1,"B":list2,"C":list3})
values_vect = []
for col in df_sparse.columns:
values = [ i for i in list(df_sparse[col]) if i !="?"]
values_vect.extend(values)
df_sparse["D"] = pd.DataFrame(values_vect,columns=["D"])
display(df_sparse)
df_sparse["D"] = df_sparse.replace("?", np.nan).ffill(axis="columns").iloc[:, -1]
to get
>>> df_sparse
A B C D
0 x1 ? ? x1
1 x2 ? ? x2
2 ? y1 ? y1
3 ? y2 ? y2
4 ? ? z1 z1
5 ? ? z2 z2