Search code examples
f#unpivotdeedle

unpivot a deedle dataframe


The stack function of frame can turn a data frame like this

     cola colb      colc      
1 -> 10   <missing> aaa       
3 -> 20   5.5       bb        
5 -> 30   <missing> <missing> 
6 -> 40   <missing> ccc 

INTO

     Row Column Value 
0 -> 1   cola   10    
1 -> 1   colc   aaa   
2 -> 3   cola   20    
3 -> 3   colb   5.5   
4 -> 3   colc   bb    
5 -> 5   cola   30    
6 -> 6   cola   40    
7 -> 6   colc   ccc  

However normally it is needed to use one of the column value and the other column's heading as the join key for the new column while doing a unpivot. how can I achieve a result like:

0 -> 10   colb   <missing>    
1 -> 10   colc   aaa   
2 -> 20   colb   5.5    
3 -> 20   colc   bb   
4 -> 30   colb   <missing>    
5 -> 30   colc   <missing>        
6 -> 40   colb   <missing>            
7 -> 40   colc   ccc  

the original cola's value and the column headings for colb and colc have become a combined key to point to colb value and colc value.

How can I achieve this with Deedle?


Solution

  • I don't think we have any built-in function to do this automatically in Deedle, but you can do that by iterating over the rows of the frame and then iterating over the columns:

    Assuming f is the sample input frame from your question, the following should do the trick:

    [ for r in f.Rows.Values do
        for c in r.Keys do
          if c <> "cola" then 
            yield r.Get("cola"), c, r.TryGet(c) ]
    |> Frame.ofRecords