I have a data frame as shown below. What I want to do is, for example, the Aston Martin car is 4 years old, so I have to pull the relevant data from the year4_6 column. Or the Honda car is 7 years old, so I should use the year7_11 data.
I tried to write an if-else statement but couldn't get it to work. What kind of code can I write for this? Should I use a package?
hub.cc  hub.make      hub.age  hub.year1_3  hub.year4_6  hub.year7_11
1300    Aston Martin  4        480          335          189
1800    Toyota        1        1352         1059         624
2000    Honda         7        2129         1642         965
1600    Honda         9        768          576          335
You can construct the column_name string using the paste() function and then select that column.
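The issue here is that the suffixes _3, _6 and _11 are inconvenient, so I will use the starts_with() helper from the tidyverse packages. Note that this only finds a column when the age equals the first year of a bracket (1, 4 or 7); an age of 9 gives hub.year9, which matches nothing.
For row 3: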
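library(tidyverse)
# Age of the car in row 3 (7 in the example data)
age <- df[3, "hub.age"]
# Column-name prefix, e.g. "hub.year7"
column_name <- paste0("hub.year", age)
# Keep row 3 and the column whose name starts with that prefix
data <- df %>%
  slice(3) %>%
  select(starts_with(column_name))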
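For ages that fall inside a bracket rather than on its first year (9, say), one option is to map each age to its bracket first. The following is only a sketch in base R: the data frame is rebuilt by hand from the table in the question, and bracket_start, bracket_cols, idx and the new value column are names made up for illustration.

```r
# Example data rebuilt from the question (column types are assumed).
df <- data.frame(
  hub.cc       = c(1300, 1800, 2000, 1600),
  hub.make     = c("Aston Martin", "Toyota", "Honda", "Honda"),
  hub.age      = c(4, 1, 7, 9),
  hub.year1_3  = c(480, 1352, 2129, 768),
  hub.year4_6  = c(335, 1059, 1642, 576),
  hub.year7_11 = c(189, 624, 965, 335)
)

# First year of each age bracket, in the same order as the columns.
bracket_start <- c(1, 4, 7)
bracket_cols  <- c("hub.year1_3", "hub.year4_6", "hub.year7_11")

# findInterval() gives the index of the bracket each age falls into.
# Ages below 1 give 0 and ages above 11 land in the last bracket,
# so both cases need separate handling if they can occur.
idx <- findInterval(df$hub.age, bracket_start)

# Matrix indexing: pick one (row, column) cell per row.
df$value <- as.matrix(df[bracket_cols])[cbind(seq_len(nrow(df)), idx)]
```

This handles all rows at once, so no loop over row numbers is needed.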
If you drop the suffixes (so your columns are named hub.year1, hub.year4, hub.year7), it simplifies a lot. Then you can use the following base R syntax for column referencing: df[, 'column_name'].
For row 3 again:
# Age of the car in row 3
age <- df[3, "hub.age"]
# Exact column name, e.g. "hub.year7"
column_name <- paste0("hub.year", age)
# Value in row 3 of that column
data <- df[3, column_name]
If the data is small, you can just iterate over the data frame with a for loop, replacing 3 with i, to get the data for all rows. If the data is big, you might want to use something like mutate() from the tidyverse.
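As a rough illustration of the mutate() route, here is a sketch that spells out the age brackets with case_when() from dplyr. The data frame is rebuilt by hand from the table in the question, and the value column name is made up; ages outside 1 to 11 would come out as NA.

```r
library(dplyr)

# Example data rebuilt from the question (column types are assumed).
df <- data.frame(
  hub.cc       = c(1300, 1800, 2000, 1600),
  hub.make     = c("Aston Martin", "Toyota", "Honda", "Honda"),
  hub.age      = c(4, 1, 7, 9),
  hub.year1_3  = c(480, 1352, 2129, 768),
  hub.year4_6  = c(335, 1059, 1642, 576),
  hub.year7_11 = c(189, 624, 965, 335)
)

# For each row, take the value from the column matching the age bracket.
df <- df %>%
  mutate(
    value = case_when(
      hub.age >= 1 & hub.age <= 3  ~ hub.year1_3,
      hub.age >= 4 & hub.age <= 6  ~ hub.year4_6,
      hub.age >= 7 & hub.age <= 11 ~ hub.year7_11
    )
  )
```

Writing the brackets out keeps the rule readable, at the cost of editing the code whenever a bracket is added or renamed.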
Note: I did not test the code above since you did not provide the data in a convenient, reproducible way using dput(). If the answer does not meet your expectations, please clarify and provide the data, e.g. with dput().