Search code examples
rr-haven

Can't convert dataframe to Stata using foreign or haven


I have a dataframe that I am trying to save as stata .dta file. I tried both haven and foreign packages but I get the fllowing errors:

write_dta(df,"C:/Users/../df.dta")

I get an error:

Error in write_dta_(data, normalizePath(path, mustWork = FALSE), version = stata_file_format(version),  : Variables of type list not supported yet

same with the foreign package:

write.dta(df,"C:/Users/../df.dta")

or

write.dta(data = df,file = "C:/Users/../df.dta")

error:

Error in write.dta(df, "C:/users/../df.dta") : 
  unknown data type

but when I check if the df is indeed a dataframe I get TRUE:

is.data.frame(df)
[1] TRUE

I thought of using writexl library and saving it as xlsx and then importing it to Stata:

write_xlsx(df,"C:/Users/../df.xlsx")

but in the resulting excel file, all the count columns (please see the attached image) are empty. so only columns 1 to 4 are filled out in the xlsx file. so I wonder if there's an issue with my count columns.

Since I cannot replicate my data in a simple way, I have attached an image :

in case you need to see my data, here's the image

Upon suggestion from @user20650 I ran the following code:

dput(tc[1:5, 3:6])

The following is the result:

structure(list(filing_type = c("10-K", "10-K", "10-K", "10-K", 
"10-K/A"), year = c("2014", "2013", "2012", "2011", "2010"), 
    Alabama_count = list(mktg_10k14.htm = 0L, mktg_10k.htm = 0L, 
        mktg_10k.htm = 0L, mktg_10k.htm = 0L, mktg_10ka.htm = 0L), 
    Alaska_count = list(mktg_10k14.htm = 0L, mktg_10k.htm = 0L, 
        mktg_10k.htm = 0L, mktg_10k.htm = 0L, mktg_10ka.htm = 0L)), row.names = c(NA, 
5L), class = "data.frame")

As @Andrew suggested, my count columns are all lists and not data frames.


Solution

  • Ok, based on the comments here is a solution (to unlist the columns). I also included the checks to ID the list-columns:

    # Reprodible example set-up
    mtcars <- datasets::mtcars
    
    mtcars$mpg <- as.list(mtcars$mpg) # manually adding a list column
    haven::write_dta(mtcars, "path.dta") # same error
    Error in write_dta_(data, normalizePath(path, mustWork = FALSE), version = stata_file_format(version),  : 
      Variables of type list not supported yet
    
    # ID the list column
    sapply(mtcars, is.list) # T/F vector
      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb 
     TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
    
    # or
    # sapply(mtcars, typeof) # returns column types
    

    How to fix it (based on the comments)

    idx <- sapply(mtcars, is.list) # create an index of your list columns
    mtcars[idx] <- lapply(mtcars[idx], unlist) # unlisting the list columns (selected by idx)
        
        
    any(sapply(mtcars, is.list)) # are any columns type list
    [1] FALSE