Tags: r, dplyr, tidyverse, tidyr, spread

Spread dataframe


I have the following dataframe/tibble sample:

structure(list(name = c("Contents.Key", "Contents.LastModified", 
"Contents.ETag", "Contents.Size", "Contents.Owner", "Contents.StorageClass", 
"Contents.Bucket", "Contents.Key", "Contents.LastModified", "Contents.ETag"
), value = c("2019/01/01/07/556662_cba3a4fc-cb8f-4150-859f-5f21a38373d0_0e94e664-4d5e-4646-b2b9-1937398cfaed_2019-01-01-07-54-46-064", 
"2019-01-01T07:54:47.000Z", "\"378d04496cb27d93e1c37e1511a79ec7\"", 
"24187", "e7c0d260939d15d18866126da3376642e2d4497f18ed762b608ed2307778bdf1", 
"STANDARD", "vfevvv-edrfvevevev-streamed-data", "2019/01/01/07/556662_cba3a4fc-cb8f-4150-859f-5f21a38373d0_33a8ba28-245c-490b-99b2-254507431d47_2019-01-01-07-54-56-755", 
"2019-01-01T07:54:57.000Z", "\"df8cc7082e0cc991aa24542e2576277b\""
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

I want to spread the name column using the tidyr::spread() function, but I don't get the desired result:

df %>% tidyr::spread(key = name, value = value)

I get an error:

Error: Duplicate identifiers for rows:...

I also tried the melt function, with the same result.

I connected to S3 using the aws.s3::get_bucket() function and am trying to convert the result to a dataframe. I am aware there is an aws.s3::get_bucket_df() function which should do this, but it doesn't work (you may look at my related question).

After I got the bucket list, I unlisted it and ran the enframe command. Please advise.
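
For context, here is a minimal sketch of how a long name/value tibble with repeated names can come out of the unlist-then-enframe steps described above. The nested list `bucket` is a hypothetical stand-in for the object returned by aws.s3::get_bucket(), with made-up values and only two fields per object. The real object carries more fields.

```r
library(tibble)

# Hypothetical stand-in for the get_bucket() result: one "Contents" entry per object.
bucket <- list(
  Contents = list(Key = "a.csv", Size = "10"),
  Contents = list(Key = "b.csv", Size = "20")
)

# unlist() flattens the nesting and joins the names with ".",
# so every object contributes the same set of names again.
long <- enframe(unlist(bucket))
long$name
```

Because the names repeat once per object, nothing in the tibble says which rows belong to the same object, which is likely why spread() complains about duplicate identifiers.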


Solution

  • You can introduce a new column first (this introduces NAs, which you will have to deal with).

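    library(dplyr)
    library(tidyr)

    df %>%
      mutate(RN = row_number()) %>%  # unique id per row, so no duplicate identifiers
      group_by(RN) %>%
      spread(name, value)            # each row fills one column; the rest are NA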
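
One possible way to deal with the NAs is to number the records instead of the rows. This is only a sketch, and it assumes that each S3 object's fields start with a `Contents.Key` row, as they do in the sample. The column name `record` is introduced here for illustration, and the small `df` below is made-up data in the same long shape as the question's.

```r
library(dplyr)
library(tidyr)

# Made-up data in the same long shape as the question's sample.
df <- tibble::tibble(
  name  = c("Contents.Key", "Contents.Size", "Contents.Key", "Contents.Size"),
  value = c("a.csv", "10", "b.csv", "20")
)

wide <- df %>%
  # Assumption: a new record begins at every "Contents.Key" row.
  mutate(record = cumsum(name == "Contents.Key")) %>%
  # (record, name) pairs are now unique, so spread() has no duplicates to reject.
  spread(name, value)
```

With this grouping, each object ends up on a single row. A record that is cut off, like the last one in the question's sample, would still have NA in the fields it is missing. In current tidyr, `pivot_wider(names_from = name, values_from = value)` can be used in place of `spread()`.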