Search code examples
rdataframe

Maintaining number formatting when exporting to excel


I have this dataframe

df <- data.frame(status=c('single', 'married', 'married', "fff"),
                 education=c(3456, 898989899, 23939004, NA),
                 income=c('34,5%', '88.456%', '92', NA))
 

I did some modifications in the formating of the above numbers using

my_numbers <- function(x) structure(x, class = c('my_numbers', 'numeric'))  
format.my_numbers <- function(x,...)NextMethod(sci = FALSE, big.mark='.')
print.my_numbers <- function(x,...) print(format(x), quote = FALSE)    
'[.my_numbers' <- function(x,...,drop = FALSE)  my_numbers(NextMethod('['))
 
x <- my_numbers(df)

df <- lapply(df, my_numbers)
df2<- bind_rows(df)



 df2
    A tibble: 4 × 3
      status     education   income    
      <my_nmbrs> <my_nmbrs>  <my_nmbrs>
    1 single           3.456 34,5%     
    2 married    898.989.899 88.456%   
    3 married     23.939.004 92   
     4 fff                 NA NA   

I want to maintain this formating when extracting df2 to excel. I tried

library(xlsx) 

----
wb <- createWorkbook() 
CellStyle(wb) 
sheet <- createSheet(wb, sheetName = "Sheet1") 
  
addDataFrame(df2, sheet)
saveWorkbook(wb = wb, file = "New_df.xlsx")

but with no success. Any ideas?

An alternative approach would be the following

library(xlsx)
df1<-df %>%   mutate(across(where(is.numeric), ~formatC(., format="d", big.mark=".")))
df1[df1=='NA']<-NA

 
wb <- createWorkbook() 
CellStyle(wb) 
sheet <- createSheet(wb, sheetName = "Sheet1") 
addDataFrame(df1, sheet)
saveWorkbook(wb = wb, file = "New_df.xlsx")

which seems to be working and it is very fast given the millions of cells that I have. Does this make sense?

I have 300000 rows and 900 columns


Solution

  • You have been asking the same question several times. Here is a solution that should work for your xlsx. Note that only the numeric columns are impacted

    library(openxlsx)
    df <- data.frame(status=c('single', 'married', 'married', "fff"),
                     education=c(3456, 898989899, 23939004, NA),
                     income=c('34,5%', '88.456%', '92', NA),
                     another_col = sample(200000, 4))
    df1 <- df %>%
      mutate(income = parse_number(income) /100^grepl("%", income))
    
    wb <- createWorkbook()
    addWorksheet(wb, "Sheet1")
    writeData(wb, "Sheet1", df1)
    
    addStyle(wb, "Sheet1", style = createStyle(numFmt = "#,##0.00"), 
             rows = seq(nrow(df)) + 1, 
             cols = which(sapply(df, is.numeric)), gridExpand = TRUE)
    
    
    
    addStyle(wb, "Sheet1", style = createStyle(numFmt = "0.00%"), 
             rows = seq(nrow(df)) + 1, 
             cols = grep("%", df), gridExpand = TRUE)
    
    saveWorkbook(wb, "numeric_formatted_numbers.xlsx", overwrite = TRUE)
    

    enter image description here

    Note that columns B and D are numeric while column C is in percentage format which is still numeric