Is there a way to import a named Excel-table into R as a data.frame?
I typically have several named Excel-tables on a single worksheet, that I want to import as data.frames, without relying on static row - and column references for the location of the Excel-tables.
I have tried to set namedRegion
which is an available argument for several Excel-import functions, but that does not seem to work for named Excel-tables. I am currently using the openxlxs package, which has a function getTables()
that creates a variable with Excel-table names from a single worksheet, but not the data in the tables.
To get your named table is a little bit of work.
First you need to load the workbook.
library(openxlsx)
wb <- loadWorkbook("name_excel_file.xlsx")
Next you need to extract the name of your named table.
# get the name and the range
tables <- getTables(wb = wb,
sheet = 1)
If you have multiple named tables they are all in tables
. My named table is called Table1.
Next you to extract the column numbers and row numbers, which you will later use to extract the named table from the Excel file.
# get the range
table_range <- names(tables[tables == "Table1"])
table_range_refs <- strsplit(table_range, ":")[[1]]
# use a regex to extract out the row numbers
table_range_row_num <- gsub("[^0-9.]", "", table_range_refs)
# extract out the column numbers
table_range_col_num <- convertFromExcelRef(table_range_refs)
Now you re-read the Excel file with the cols
and rows
parameter.
# finally read it
my_df <- read.xlsx(xlsxFile = "name_excel_file.xlsx",
sheet = 1,
cols = table_range_col_num[1]:table_range_col_num[2],
rows = table_range_row_num[1]:table_range_row_num[2])
You end up with a data frame with only the content of your named table.
I used this a while ago. I found this code somewhere, but I don't know anymore from where.