I have this dataframe
df <- data.frame(status=c('single', 'married', 'married', "fff"),
education=c(3456, 898989899, 23939004, NA),
income=c('34,5%', '88.456%', '92', NA))
I did some modifications in the formating of the above numbers using
my_numbers <- function(x) structure(x, class = c('my_numbers', 'numeric'))
format.my_numbers <- function(x,...)NextMethod(sci = FALSE, big.mark='.')
print.my_numbers <- function(x,...) print(format(x), quote = FALSE)
'[.my_numbers' <- function(x,...,drop = FALSE) my_numbers(NextMethod('['))
x <- my_numbers(df)
df <- lapply(df, my_numbers)
df2<- bind_rows(df)
df2
A tibble: 4 × 3
status education income
<my_nmbrs> <my_nmbrs> <my_nmbrs>
1 single 3.456 34,5%
2 married 898.989.899 88.456%
3 married 23.939.004 92
4 fff NA NA
I want to maintain this formating when extracting df2 to excel. I tried
library(xlsx)
----
wb <- createWorkbook()
CellStyle(wb)
sheet <- createSheet(wb, sheetName = "Sheet1")
addDataFrame(df2, sheet)
saveWorkbook(wb = wb, file = "New_df.xlsx")
but with no success. Any ideas?
An alternative approach would be the following
library(xlsx)
df1<-df %>% mutate(across(where(is.numeric), ~formatC(., format="d", big.mark=".")))
df1[df1=='NA']<-NA
wb <- createWorkbook()
CellStyle(wb)
sheet <- createSheet(wb, sheetName = "Sheet1")
addDataFrame(df1, sheet)
saveWorkbook(wb = wb, file = "New_df.xlsx")
which seems to be working and it is very fast given the millions of cells that I have. Does this make sense?
I have 300000 rows and 900 columns
You have been asking the same question several times. Here is a solution that should work for your xlsx. Note that only the numeric columns are impacted
library(openxlsx)
df <- data.frame(status=c('single', 'married', 'married', "fff"),
education=c(3456, 898989899, 23939004, NA),
income=c('34,5%', '88.456%', '92', NA),
another_col = sample(200000, 4))
df1 <- df %>%
mutate(income = parse_number(income) /100^grepl("%", income))
wb <- createWorkbook()
addWorksheet(wb, "Sheet1")
writeData(wb, "Sheet1", df1)
addStyle(wb, "Sheet1", style = createStyle(numFmt = "#,##0.00"),
rows = seq(nrow(df)) + 1,
cols = which(sapply(df, is.numeric)), gridExpand = TRUE)
addStyle(wb, "Sheet1", style = createStyle(numFmt = "0.00%"),
rows = seq(nrow(df)) + 1,
cols = grep("%", df), gridExpand = TRUE)
saveWorkbook(wb, "numeric_formatted_numbers.xlsx", overwrite = TRUE)
Note that columns B
and D
are numeric while column C
is in percentage format which is still numeric