Suppose I have this csv file:
asdf,qwer,asdf,qwer,qwer
1,2,3,4,5
If I use readr::read_csv("some.csv") to read it, I will obtain new column names for the duplicates based on the position of the column:
# A tibble: 1 × 5
asdf...1 qwer...2 asdf...3 qwer...4 qwer...5
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2 3 4 5
What could I do if I'd rather have suffixes based on the number of duplicates, with no modification for the first occurrence, like this:
# A tibble: 1 × 5
asdf qwer asdf_1 qwer_1 qwer_2
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2 3 4 5
It seems possible to use the name_repair argument of read_csv and provide a function.
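Since name_repair= can be a function, we can handle this programmatically. Fortunately, base::make.unique does most of the work: it leaves the first occurrence of each name unchanged and appends a counter to later duplicates. Its default separator is ".", so setting sep="_" gives your exact output.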
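# Repair function: keep the first occurrence, suffix later duplicates with _1, _2, ...
namefun <- function(nm) make.unique(nm, sep = "_")
# Inline CSV text standing in for some.csv (the embedded newline marks it as literal data)
txt <- 'asdf,qwer,asdf,qwer,qwer
1,2,3,4,5'
readr::read_csv(txt, name_repair = namefun)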
# Rows: 1 Columns: 5
# ── Column specification ───────────────────────────────────────────────────────────────────────────────────────────
# Delimiter: ","
# dbl (5): asdf, qwer, asdf_1, qwer_1, qwer_2
# ℹ Use `spec()` to retrieve the full column specification for this data.
# ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# # A tibble: 1 × 5
# asdf qwer asdf_1 qwer_1 qwer_2
# <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 2 3 4 5
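To check the repair function on its own, without reading a file, you can call it directly on a character vector of header names. This is a minimal base-R sketch; the vector nm is just the header from the example above typed in by hand, and repaired is a name chosen here for illustration.

```r
# Same repair function as above, repeated so the snippet runs on its own
namefun <- function(nm) make.unique(nm, sep = "_")

# Header names from the example CSV, typed in by hand
nm <- c("asdf", "qwer", "asdf", "qwer", "qwer")

repaired <- namefun(nm)
print(repaired)
# [1] "asdf"   "qwer"   "asdf_1" "qwer_1" "qwer_2"
```

Because namefun is an ordinary function of a character vector, you can swap in any other renaming scheme here and test it this way before handing it to name_repair.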