The names of my variables look like this:
df <- data.frame(var_NA = 1:10, var = 11:20, var_Level = 21:30, var_Total = 31:40)
Except I have lots of variables. The key feature is that for every "mother" variable var, there are many "child" variables with different names (like var_NA and var_Level). Some "mothers" have more "children" than others. One thing is fixed though: there is always a child with the suffix _NA.
What I want is to order columns like this:
_NA
child_NA
child
In my example, the outcome would be var, var_NA, var_Level, var_Total.
I've given up trying with select(ends_with()), relocate() and other commands. This is probably best done with regex, of which I am totally ignorant. Any ideas?
Have updated answer to correspond to the changes in the question.
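Create nms to be the names of df, except that any name ending in _NA is replaced with the same name ending in just _ so that it sorts earlier. Because var_ is a prefix of every other child name such as var_Level, it sorts before them and immediately after var. Note that the $ in _NA$ anchors the match to the end of the string, so _NA$ only matches a name ending in _NA.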
Now the sorted order of nms applied to the columns of df sorts the columns of df as desired.
nms <- sub("_NA$", "_", names(df))
df[order(nms)]
giving (continued after output):
   var var_NA var_Level var_Total
1   11      1        21        31
2   12      2        22        32
3   13      3        23        33
4   14      4        24        34
5   15      5        25        35
6   16      6        26        36
7   17      7        27        37
8   18      8        28        38
9   19      9        29        39
10  20     10        30        40
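As a rough check that the same two lines carry over to several "mother" variables, here is a small sketch. The data frame df2 and its column names (age, income and their children) are made up for illustration and do not come from the question; the columns are deliberately shuffled.

```r
# Hypothetical data: two mother variables, columns in shuffled order.
df2 <- data.frame(
  income_Total = 1:3,
  age_NA       = 4:6,
  income       = 7:9,
  age_Level    = 10:12,
  age          = 13:15,
  income_NA    = 16:18,
  income_Mean  = 19:21
)

# Replace a trailing "_NA" with "_" so that this child sorts right after its mother.
nms2 <- sub("_NA$", "_", names(df2))
nms2
# "income_Total" "age_" "income" "age_Level" "age" "income_" "income_Mean"

# Reorder the columns by the sorted order of the modified names.
df2_sorted <- df2[order(nms2)]
names(df2_sorted)
# "age" "age_NA" "age_Level" "income" "income_NA" "income_Mean" "income_Total"
```

Each mother comes first, followed by its _NA child, and the remaining children follow in alphabetical order according to the collation rules of the current locale.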
Note that the actual sort order will depend on the LC_COLLATE setting of the locale. For example, note below that numbers sort before letters in both the English and C locale examples; however, in the C locale all upper case letters come before all lower case but not in the English locale. In the above solution the var column will come first and the var_NA column will come second (as it corresponds to var_ in nms) in both locales but the actual order within the remaining names will be locale dependent.
Sys.getlocale() # shows locale being used including LC_COLLATE
## ..snip..
x <- c("0", "1", "2", "a", "b", "c", "A", "B", "C")
Sys.setlocale("LC_COLLATE", "en_US.utf8")
sort(x)
## [1] "0" "1" "2" "a" "A" "b" "B" "c" "C"
Sys.setlocale("LC_COLLATE", "C")
sort(x)
## [1] "0" "1" "2" "A" "B" "C" "a" "b" "c"
Sys.setlocale("LC_COLLATE", "") # set locale back to default
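If a result that does not depend on the session locale is preferred, one option (not part of the solution above, just a sketch) is to request the radix method, which for character vectors sorts in C-locale collation order. This assumes R 3.3.0 or later, where method = "radix" is available in order() and sort(); the name df_sorted is introduced here only for illustration.

```r
df <- data.frame(var_NA = 1:10, var = 11:20, var_Level = 21:30, var_Total = 31:40)

# Same substitution as before: a trailing "_NA" becomes "_".
nms <- sub("_NA$", "_", names(df))

# method = "radix" orders character vectors as in the C locale,
# whatever LC_COLLATE is currently set to.
df_sorted <- df[order(nms, method = "radix")]
names(df_sorted)
# "var" "var_NA" "var_Level" "var_Total"

# The same method applied to the example vector gives the C-locale order.
x <- c("0", "1", "2", "a", "b", "c", "A", "B", "C")
sort(x, method = "radix")
# "0" "1" "2" "A" "B" "C" "a" "b" "c"
```

The trade-off is that all upper case letters then sort before all lower case ones, which may or may not be the order wanted among the remaining children.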