Search code examples
r

Manipulate Properties of Files in Working Directory


I am trying to import a (multi sheet) "xlsb" file into R. Normally, I would have done this using the readxlsb library, e.g. How to open an .xlsb file in R? , but I can not install this library on my school/university computer.

I realized I can manually change the format from xlsb to xlsx, and then use the readxl::read_excel function to import this (How to import multiple xlsx sheets in R)

This got me thinking - using the system() command, is it possible to change the format of the xlsb file directly in R ... and then use the readxl library in R to import?

For example, I know in R its possible to delete a file from the working directory:

a = data.frame(a = rnorm(100,100,100), b = rnorm(100,100,100))
write.csv(a, "Test.csv")

wd <- getwd()
file_path <- file.path(wd, "Test.csv")

file.remove(file_path)

Is it also possible to change the format of a file situated in the working directory?

EDIT: Alternate Way? (inspired from Quinten's answer)

install.packages("openxlsx")
install.packages("readxl")
library(openxlsx)
library(readxl)


a <- data.frame(a = rnorm(100, 100, 100), b = rnorm(100, 100, 100))

write.csv(a, "Test.csv")

data <- read.csv("Test.csv")
write.xlsx(data, "Test.xlsx")

convert_to_xlsb <- function(xlsx_file, xlsb_file) {
  wb <- loadWorkbook(xlsx_file)
  saveWorkbook(wb, xlsb_file, overwrite = TRUE)
}

convert_to_xlsb("Test.xlsx", "Test.xlsb")

convert_xlsb_to_xlsx <- function(xlsb_file, xlsx_file) {
  wb <- loadWorkbook(xlsb_file)
  saveWorkbook(wb, xlsx_file, overwrite = TRUE)
}

convert_xlsb_to_xlsx("Test.xlsb", "Test_converted.xlsx")

Solution

  • Yes, you can create a VBScript script, which does the converting. Then you call this script from R

    ConvertXLSBtoXLSX.vbs

    Option Explicit
    
    Dim objExcel, objFSO, objFolder, objFile, objWorkbook
    Dim inputFolder, outputFile
    Dim fileExtension
    
    ' Get the folder path from command line arguments
    inputFolder = WScript.Arguments(0)
    
    ' Create FileSystemObject
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    
    ' Create Excel Application object
    Set objExcel = CreateObject("Excel.Application")
    objExcel.Visible = False ' Do not show Excel window
    objExcel.DisplayAlerts = False ' Suppress prompts to automatically overwrite files
    
    ' Get the folder object
    Set objFolder = objFSO.GetFolder(inputFolder)
    
    ' Loop through each file in the folder
    For Each objFile In objFolder.Files
        fileExtension = LCase(objFSO.GetExtensionName(objFile.Name))
        ' Check if the file is a .xlsb file
        If fileExtension = "xlsb" Then
            ' Open the .xlsb file
            Set objWorkbook = objExcel.Workbooks.Open(objFile.Path)
            
            ' Define output file path with .xlsx extension
            outputFile = objFSO.BuildPath(inputFolder, objFSO.GetBaseName(objFile.Name) & ".xlsx")
            
            ' Save the file as .xlsx
            objWorkbook.SaveAs outputFile, 51 ' 51 is the file format for .xlsx
            
            ' Close the workbook
            objWorkbook.Close False
        End If
    Next
    
    ' Quit Excel Application
    objExcel.Quit
    
    ' Clean up
    Set objWorkbook = Nothing
    Set objExcel = Nothing
    Set objFSO = Nothing
    

    convert.R

    script <- "path/to/script/ConvertXLSBtoXLSX.vbs"
    path <- "path/to/xlsb/folder"
    system2("cscript", args = c(shQuote(script), shQuote("//NoLogo"), shQuote(path)))
    

    Of course you want to add some more checks (e.g. is the input path existing) and decide what to do, if there is already a xlsx file of the same name (here I overwrite w/o asking), but this should get you started.