Search code examples
rtransformtidyrreshape2

Reshape a dataframe continaing nominal data where value becomes column prefix


Greeting colleagues, I am struggling with a basic problem, but cannot seem to over come my issue. I have a data set that is structured as:

Reference | Person | DOB       |Status      | Address
------------------------------------------------------
0001/xy   | 001    |19/01/1960 | Respondent |123 Fake Street
0001/xy   | 002    |01/06/1978 | Defendant  |555 High Ave.
0002/xy   | 001    |31/04/1988 | Respondent |432 Random Dr.
0002/xy   | 002    |14/07/1991 | Defendant  |666 Missing Close
0002/xy   | 003    |10/10/2010 | Defendant  |987 McFakerton Blvd.
0003/xy   | 001    |08/02/1995 | Respondent |911 Crime Street

Reference is the unique key for an "incident" What I need is a single row, wide data set, in which the column to be pivoted is Status and the new columns contain the prefix from the values of the column. So it should look like;

Reference | Person | Respondent1_DOB | Respondent1_Address |Defendent1_DOB | Defendent1_Address |Defendent2_DOB | Defendent2_Address 
-------------------------------------------------------------------------------------------------------------------------------------
0001/xy   | 001    |19/01/1960       |123 Fake Street      |01/06/1978     |555 High Ave        |               |
0002/xy   | 001    |31/04/1988       |432 Random Dr.       |14/07/1991     |666 Missing Close   |10/10/2010     |987 McFakerton Blvd 
0003/xy   | 001    |08/02/1995       |911 Crime Street     |               |                    |               |

As you can see it is a simple task, but all reshape functions I have tried, including tidyr and Reshape2continue to try to summarise the data which removes the nominal characteristics, in addition to not duplicating the column headers for the provided values, or fail completely.

My currently tried, and failed code included:

Trans<-transform(reshape(Original, direction='wide', 
                                varying=list(3,5))[-1], id=Status)

Trans<- dcast(Original, Reference ~ Status, 
               value.var = c("DOB","Address"))

Trans<- Original %>% pivot_wide(Reference ~ Status,names_preifx=Status )

Is there a simple way to do this in R? I could do it in Python, but I am trying to stick to the one language and IDE as that is what the organisation has at the moment.


Solution

  • you can use base r reshape function:

    reshape(df[-4],timevar = "Person",idvar = "Reference",direction = "wide")
    
      Reference    DOB.001      Address.001    DOB.002       Address.002    DOB.003          Address.003
    1   0001/xy 19/01/1960  123 Fake Street 01/06/1978     555 High Ave.       <NA>                 <NA>
    3   0002/xy 31/04/1988   432 Random Dr. 14/07/1991 666 Missing Close 10/10/2010 987 McFakerton Blvd.
    6   0003/xy 08/02/1995 911 Crime Street       <NA>              <NA>       <NA>                 <NA>