Search code examples
rsql-serverdatabasessisssas

Data Wrangling for Analysis Services Cube


I have a data set that is as follows (simplified):

Fund    Field1-2012    Field1-2013    Field2-2012   Field2-2013
 FD1       x               x               x            x
 FD2       x               x               x            x

As you can see, dates exist with the fields making this very unfriendly for most analysis. What want is the following

Fund   Year   Field1   Field2
 FD1    2012   x          x
 FD1    2013   x          x
 FD2    2012   x          x
 FD2    2013   x          x

I have been using SQL server integration tools to accomplish this but to no avail. Is there a tool I should be using or is there something in excel that can help me out? Not possible to brute force as the dataset is quite large

Best


Solution

  • You have an R tag, so here's an R solution:

    df = read.table(text = "
    Fund    Field1-2012    Field1-2013    Field2-2012   Field2-2013
    FD1       5               7               9            10
    FD2       6               8               9            10
    ", header=T)
    
    library(tidyverse)
    
    df %>%
      gather(key, value, -Fund) %>%
      separate(key, c("type","year"), convert = T) %>%
      spread(type, value)
    
    #   Fund year Field1 Field2
    # 1  FD1 2012      5      9
    # 2  FD1 2013      7     10
    # 3  FD2 2012      6      9
    # 4  FD2 2013      8     10