
fs::dir_ls() returns unreadable output for file paths containing Chinese characters


I'm using fs::dir_ls() to get the paths of the Excel files under a directory main_path. Because the file paths contain Chinese characters, the output is unreadable. I suspect an encoding issue:

main_path <- '../../raw_data/2022-01-10/'
file_paths <- fs::dir_ls(main_path, regexp = "\\.xlsx$")
file_paths

Out:

../../raw_data/2022-01-10/閽㈤搧_鐒︾偔_浠峰樊_鐒︾偔J2201DCE鐒︾偔J2205DCE_涓诲姏_2022-01-10.xlsx
../../raw_data/2022-01-10/閽㈤搧_鐒︾偔_浠峰樊_鐒︾偔J2201DCE鐒︾偔J2209DCE_2022-01-10.xlsx

By contrast, list.files(path = main_path, pattern = '\\.xlsx$') returns the file names correctly:

[1] "甘其毛道库提价含税焦煤A23V2606SG85JM焦煤JM2201DCE_2022-01-10.xlsx"
[2] "甘其毛道库提价含税焦煤A23V2606SG85JM焦煤JM2205DCE_2022-01-10.xlsx"

The version of the fs package I use:

Warning message:
package ‘fs’ was built under R version 4.1.2 

Does anyone know how to deal with this issue, or whether there is an equivalent way to get Excel file paths under a directory in R? Thanks.
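
To check whether the session's encoding is involved, the locale can be inspected with base R. This is a minimal sketch: the variable names are only illustrative, and the printed values depend on the machine and the R version.

```r
# Character-type locale of the current R session.
ctype <- Sys.getlocale("LC_CTYPE")

# l10n_info() reports, among other things, whether the session runs in UTF-8.
is_utf8 <- isTRUE(l10n_info()[["UTF-8"]])

cat("LC_CTYPE:", ctype, "\n")
cat("UTF-8 session:", is_utf8, "\n")
```

If the second line prints FALSE, file names are being handled in a non-UTF-8 native encoding, which would fit the encoding suspicion.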

Update:

I haven't found the cause of this error. It may be related to the locale of RStudio, since the code works on my Mac but not on my Windows 10 machine. Setting Sys.setlocale("LC_ALL","zh_CN.utf-8") did not help either, but I found an alternative solution:

file_names <- list.files(path = main_path, pattern = '\\.xlsx$')
file_paths <- file.path(main_path, file_names)

Out:

[1] "../../raw_data/2022-01-10/甘其毛道库提价含税焦煤A23V2606SG85JM焦煤JM2201DCE_2022-01-10.xlsx"
[2] "../../raw_data/2022-01-10/甘其毛道库提价含税焦煤A23V2606SG85JM焦煤JM2205DCE_2022-01-10.xlsx"
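
A possible shortcut for the workaround above: list.files() can return full paths directly with full.names = TRUE, so the separate file.path() step is not needed. The sketch below runs against a throwaway directory; demo_dir and the file names in it are made up for illustration. It also uses an escaped, anchored pattern, because an unescaped dot in a regular expression matches any character.

```r
# Hypothetical demo directory with two Excel-like files and one decoy.
demo_dir <- file.path(tempdir(), "xlsx_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir,
  c("a_2022-01-10.xlsx", "b_2022-01-10.xlsx", "myxlsx.txt")
)))

# "\\.xlsx$" matches a literal ".xlsx" at the end of the name only,
# so "myxlsx.txt" is skipped; full.names = TRUE prepends the directory.
file_paths <- list.files(path = demo_dir, pattern = "\\.xlsx$", full.names = TRUE)
print(basename(file_paths))
```

With the real data, demo_dir would be replaced by main_path.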

Reference:

https://github.com/r-lib/fs/issues/281

https://github.com/r-lib/fs/issues/164


Solution

  • The problem occurs with versions of the fs package later than 1.5.0, so downgrading fs to 1.5.0 fixes it:

    devtools::install_version("fs", "1.5.0")
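
Before downgrading, it may be worth checking which version is installed. This is a minimal sketch: needs_downgrade() is a hypothetical helper, and the 1.5.0 threshold is taken from the answer above rather than verified independently.

```r
# Hypothetical helper: TRUE when the given version is later than 1.5.0.
needs_downgrade <- function(version) {
  package_version(version) > package_version("1.5.0")
}

if (requireNamespace("fs", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("fs"))
  message("Installed fs version: ", installed,
          "; later than 1.5.0: ", needs_downgrade(installed))
} else {
  message("fs is not installed")
}
```

The check only reports the version; it installs nothing.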