I currently have this loop to trim rows from a dataset (df_2) based on ranges of indices. The start and end indices of the sections to include are taken from 2 columns in df_3, and the result is a new data frame (df).
for(i in 1:nrow(df_3)){
if (i==1) df <- df_2[df_3$start[i]:df_3$end[i],]
else df <- rbind(df, df_2[df_3$start[i]:df_3$end[i],])
}
Each section has a value associated with it, which is contained in column 3 of df_3. I want to create a new column in df that repeats the values associated with that section.
Would really appreciate some assistance here. Feel free to ask for clarification; I was as succinct as I could make it!
As suggested by Joran - here are some examples
DF
index new_column
0
1
2
3
4
5
6
7
8
9
10
DF_3
start end new_column_values
0 3 1
4 6 2
7 10 3
If I understand your question correctly, you might be able to use cut as follows:
DF$new_column <- cut(DF$index,
breaks = c(DF_3$start[1], DF_3$end),
include.lowest = TRUE,
labels = DF_3$new_column_values)
DF
index new_column
1 0 1
2 1 1
3 2 1
4 3 1
5 4 2
6 5 2
7 6 2
8 7 3
9 8 3
10 9 3
11 10 3
In this, I'm trying to make use of the available information. We are basically creating a factor for DF$index, and the factor levels are determined by ranges found in another data.frame. Thus, for cut, I've set breaks to be a vector comprising the first start value and all the end values, and I've set the "labels" to be the values from the "new_column_values" variable.
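To make the break points concrete, here is a small sketch using the example data from the question. The names `breaks` and `intervals` are only illustrative. It also shows the intervals cut builds when no labels are supplied, which makes clear that this approach assumes the ranges are contiguous (each start follows directly after the previous end).

```r
# Example data copied from the question
DF_3 <- data.frame(start = c(0, 4, 7),
                   end = c(3, 6, 10),
                   new_column_values = c(1, 2, 3))

# The first start value followed by every end value
breaks <- c(DF_3$start[1], DF_3$end)
breaks

# Without labels, cut() names each level after the interval it covers
intervals <- levels(cut(0:10, breaks = breaks, include.lowest = TRUE))
intervals
```

If there were gaps between the ranges, index values falling in a gap would still be assigned to one of these intervals, so a different approach would be needed in that case.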
Note that the resulting "new_column" is not (in the current form) a numeric variable, but a factor.
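If a numeric column is needed, one common way to convert it is to go through character first. This is a minimal sketch on the example data from the question, assuming the labels are all valid numbers:

```r
# Example data copied from the question
DF <- data.frame(index = 0:10)
DF_3 <- data.frame(start = c(0, 4, 7),
                   end = c(3, 6, 10),
                   new_column_values = c(1, 2, 3))

DF$new_column <- cut(DF$index,
                     breaks = c(DF_3$start[1], DF_3$end),
                     include.lowest = TRUE,
                     labels = DF_3$new_column_values)

# as.numeric() on a factor returns the internal level codes (1, 2, 3, ...),
# so convert to character first to recover the actual label values
DF$new_column <- as.numeric(as.character(DF$new_column))
str(DF)
```

In this particular example the level codes and the labels happen to coincide, but with values such as 10, 20, 30 calling as.numeric directly on the factor would silently return 1, 2, 3 instead.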