I have run code that saves the names of the variables that are character and binary in the object binvar:
print(binvar)
[1] "x1" "x2" "x3" "x4"
[5] "x5"
I want to use this to select columns in my original data frame df, in order to run a for loop that converts "Yes" and "No" to 1 and 0.
for(i in 1:length(binvar)){
  for(j in 1:length(nrow(df))){
    if(df[[binvar[i]]][j]=="Yes"){
      df[[binvar[i]]][j]<-1
    }
    else if(df[[binvar[i]]][j]=="No"){
      df[[binvar[i]]][j]<-0
    }
  }
}
The problem is that the loop doesn't iterate over all elements of the selected columns, but only over the first element of each column. I thought the iterator j would go over all elements.
How can I solve this?
We don't need a nested loop in R. The == is an elementwise comparison operator. Subset the dataset ('df') based on the vector of column names ('binvar'), i.e. df[binvar], create a logical matrix (== "Yes"), coerce the logical to binary (+), and assign (<-) the result back to the subset of columns (assuming there are only 'Yes', 'No' and possibly NA):
df[binvar] <- +(df[binvar] == "Yes")
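As a minimal sketch of how that one-liner behaves, here it is on a small made-up data frame. The data and the id column are hypothetical, invented only for illustration; only the binvar idea comes from the question.

```r
# Hypothetical toy data, not the OP's real data frame
df <- data.frame(
  id = 1:3,
  x1 = c("Yes", "No", NA),
  x2 = c("No", "No", "Yes"),
  stringsAsFactors = FALSE
)
binvar <- c("x1", "x2")

# df[binvar] == "Yes" compares every cell at once and gives a logical matrix;
# the unary + turns TRUE/FALSE into 1/0, and NA stays NA
df[binvar] <- +(df[binvar] == "Yes")

print(df)
```

Columns not named in binvar (here id) are left untouched, which is why subsetting on both sides of the assignment is convenient.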
In the OP's nested loop, the second for expression is the problematic one: nrow returns a single value, so its length is 1, and 1:1 is always just 1.
for(j in 1:length(nrow(df)))
It should be
for(j in seq_len(nrow(df)))
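If the loop is kept, a corrected version might look like the sketch below. The toy data is hypothetical and assumes there are no NA values, since if() would stop with an error on an NA comparison. Note also that assigning 1 or 0 into a character column stores the strings "1" and "0", so a final conversion step is added here.

```r
# Hypothetical toy data without missing values
df <- data.frame(
  x1 = c("Yes", "No", "Yes"),
  x2 = c("No", "No", "Yes"),
  stringsAsFactors = FALSE
)
binvar <- c("x1", "x2")

for (i in seq_along(binvar)) {
  for (j in seq_len(nrow(df))) {  # visits every row, not only the first
    if (df[[binvar[i]]][j] == "Yes") {
      df[[binvar[i]]][j] <- 1
    } else if (df[[binvar[i]]][j] == "No") {
      df[[binvar[i]]][j] <- 0
    }
  }
}

# The columns are still character ("1"/"0") at this point; convert them
df[binvar] <- lapply(df[binvar], as.integer)

print(df)
```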
A reproducible example using the built-in iris data shows this:
> data(iris)
> nrow(iris)
[1] 150
> length(nrow(iris))
[1] 1
> 1:length(nrow(iris))
[1] 1