r, multivariate-testing, data-transform

Quick way to transform data for NMDS in R?


I have a data frame with three variables: FishID, Taxa, and EstimatedNumber. I'm looking for an easy way to reshape it into the wide format needed for an NMDS. I want FishID to remain the first column, followed by one column for each distinct value of Taxa, with the values of EstimatedNumber filling the cells of the matrix.

Here's a subset of my data.

structure(list(FishID = structure(c(50L, 50L, 51L, 52L, 52L, 
55L, 55L, 55L, 55L, 55L, 56L, 56L, 67L, 67L, 67L, 70L, 70L, 65L, 
65L, 71L), .Label = c("SSM002", "SSM004", "SSM005A", "SSM005B", 
"SSM006", "SSM007", "SSM009", "SSM012", "SSM013", "SSM014", "SSM016", 
"SSM017", "SSM018", "SSM019", "SSM020", "SSM021", "SSM022", "SSM023", 
"SSM024A", "SSM024B", "SSM025", "SSM026", "SSM027", "SSM030", 
"SSM031", "SSM032", "SSM033", "SSM034", "SSM035", "SSM036", "SSM037", 
"SSM038", "SSM039", "SSM040", "SSM041", "SSM043", "SSM044", "SSM045", 
"SSM046", "SSM047", "SSM048", "SSM052", "SSM053", "SSM054", "SSM055", 
"SSM056", "SSM057", "SSM058", "SSM059", "SSS001", "SSS002", "SSS003", 
"SSS004", "SSS005", "SSS006", "SSS007", "SSS008", "SSS009", "SSS010", 
"SSS011", "SSS012", "SSS013", "SSS014", "SSS015", "SSS016", "SSS017A", 
"SSS017B", "SSS018", "SSS019", "SSS020", "SSS022"), class = "factor"), 
    Taxa = c("Onisimus", "Gammarus", "Unidentified", "Fish", 
    "Amphipods", "Gammarus", "Onisimus", "Gammarus", "Jellyfish", 
    "Unidentified", "Onisimus", "Unidentified", "Onisimus", "Unidentified", 
    "Gammarus", "Onisimus", "Fish", "Onisimus", "Jellyfish", 
    "Fish"), EstimatedNumber = c(1305L, 103L, NA, 1L, NA, 3L, 
    4L, 4L, 1L, NA, 32L, NA, 45L, NA, 1L, 1122L, 12L, 3L, 8L, 
    8L)), row.names = c(NA, 20L), class = "data.frame")

Here's an example of what I'm looking for.

   FishID Onisimus Gammarus
1  SSS001     1305      103
2  SSS002        0        0
3  SSS003        0        0
4  SSS006        4        3
5  SSS007       32        0
6 SSS017B       45        1

Solution

  • Using the reshape2 package:

    df_reshaped <- reshape2::dcast(df, FishID ~ Taxa, value.var = "EstimatedNumber", fun.aggregate = sum)
    

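    Note that your data contain two SSS006 × Gammarus rows, which sum adds together (3 + 4 = 7), and NA values. Because sum returns NA when any of its inputs is NA, the affected cells come out as NA; adding na.rm = TRUE to the dcast call (it is passed on to sum) turns them into 0 instead.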
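
If you would rather avoid an extra package, the same reshaping can be sketched in base R with xtabs. This is only an illustration under stated assumptions: the small data frame below is a stand-in for the question's data (same column names, fewer rows), and missing counts are assumed to mean 0, which may or may not be right for your samples.

```r
# Stand-in for the question's data: same columns, fewer rows (illustrative only)
df <- data.frame(
  FishID = factor(c("SSS001", "SSS001", "SSS002", "SSS006", "SSS006", "SSS006")),
  Taxa = c("Onisimus", "Gammarus", "Unidentified", "Gammarus", "Onisimus", "Gammarus"),
  EstimatedNumber = c(1305L, 103L, NA, 3L, 4L, 4L)
)

# Assumption: a missing count is treated as 0
df$EstimatedNumber[is.na(df$EstimatedNumber)] <- 0L

# Sum the counts for every FishID x Taxa combination.
# Combinations that never occur become 0; duplicate rows are added together.
# drop.unused.levels = TRUE keeps only the FishID levels present in the data.
tab <- xtabs(EstimatedNumber ~ FishID + Taxa, data = df, drop.unused.levels = TRUE)

# Convert the contingency table to a data frame with FishID as the first column
wide <- as.data.frame.matrix(tab)
wide <- cbind(FishID = rownames(wide), wide)
rownames(wide) <- NULL

print(wide)
```

Here the two SSS006 × Gammarus rows are summed, as with fun.aggregate = sum. For the NMDS itself you would typically pass only the numeric columns (for example wide[, -1]) to a routine such as vegan::metaMDS, and samples whose counts are all zero, like SSS002 in this stand-in, may need to be dropped first.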