I have a data frame with multiple columns, and I am using a for loop to apply a mathematical operation whose results are stored in new columns. The data frame is named "F39". The code I have written is as follows:
for (i in 2:nrow(F39)) {
  #calculating distance from distance formula (both in x and y)
  F39$distance[i] <- sqrt((F39$X..cm.[i]-F39$X..cm.[i-1])^2 + (F39$Y..cm.[i]-F39$Y..cm.[i-1])^2)
  #calculating fish speed in x and y
  F39$fishspeed[i] <- F39$distance[i]/(0.02)
  #assigning 0 as the starting fish speed
  F39$fishspeed[1] <- 0
  #assigning positive and negative signs to the velocity
  F39$fishspeed[i] <- ifelse(F39$X..cm.[i]-F39$X..cm.[i-1] < 0,F39$fishspeed[i],-F39$fishspeed[i])
}
However, it gives me the following error:
Error in `$<-.data.frame`(`*tmp*`, "distance", value = c(NA, 0.194077783375631 :
  replacement has 2 rows, data has 4837
There are 4837 rows in my data frame. I apply the same code to many other data frames and it works, but for this one and a few others it fails.
I have added the .CSV file with data in google drive: Link to csv file
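Your data.frame is missing the column "distance", so R cannot store a single value in it with the syntax F39$distance[i] <- ... On the first iteration (i = 2) that assignment builds a brand-new vector of length 2, c(NA, value), and tries to use it as the whole column. Because 2 does not divide 4837, the vector cannot be recycled to the number of rows and you get the error. In a data frame with an even number of rows the same length-2 vector is silently recycled, which is probably why the code seems to work for some of your other data frames.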
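The behaviour can be reproduced on toy data. This is a minimal sketch, not your actual data: the data frames `odd` and `even` and their values are made up for illustration, and the single assignment stands in for the first loop iteration (i = 2).

```r
# Hypothetical toy data frames, neither of which has a "distance" column
odd  <- data.frame(x = c(1, 2, 4))       # 3 rows (odd, like 4837)
even <- data.frame(x = c(1, 2, 4, 7))    # 4 rows (even)

# Same kind of assignment as the first loop iteration (i = 2).
# With 3 rows the length-2 replacement cannot be recycled, so it fails.
msg <- tryCatch({
  odd$distance[2] <- 1
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# With 4 rows the length-2 vector c(NA, 1) is recycled without any warning.
even$distance[2] <- 1
print(even$distance)
```

Note that in the even case the column is created, but it starts out filled with a repeated c(NA, value) pattern rather than with NA, so creating the column explicitly is safer either way.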
The solution is to create the column first and then run the loop, e.g.
F39 <- read.csv("C:/Users/kupzig.HYDROLOGY/Downloads/Fish39.csv")
names(F39) #-> no distance as column name
F39$fishspeed[1] <- 0 #assigning 0 as the starting fish speed
F39$distance <- NA #create the distance column
for (i in 2:nrow(F39)) {
  #calculating distance from distance formula (both in x and y)
  F39$distance[i] <- sqrt((F39$X..cm.[i]-F39$X..cm.[i-1])^2 + (F39$Y..cm.[i]-F39$Y..cm.[i-1])^2)
  #calculating fish speed in x and y
  F39$fishspeed[i] <- F39$distance[i]/(0.02)
  #assigning positive and negative signs to the velocity
  F39$fishspeed[i] <- ifelse(F39$X..cm.[i]-F39$X..cm.[i-1] < 0,F39$fishspeed[i],-F39$fishspeed[i])
}
Note that any operation that does not depend on i, or on an earlier step that itself depends on i, is better placed outside the loop. This will save you computation time in the future.
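Taking that idea one step further, the whole calculation can be written without a loop by using diff(), which returns the differences between consecutive elements. The following is only a sketch under assumptions: the small data frame is made up to stand in for the CSV, `dt`, `dx`, `dy` and `speed` are helper names introduced here, and the sign rule is copied unchanged from the loop.

```r
# Made-up stand-in for F39 (illustrative values, not the real CSV)
F39 <- data.frame(X..cm. = c(0, 3, 3, 1, 1),
                  Y..cm. = c(0, 4, 4, 4, 2))

dt <- 0.02  # time step between rows, as used in the original loop

# Differences between consecutive rows; each has nrow(F39) - 1 elements
dx <- diff(F39$X..cm.)
dy <- diff(F39$Y..cm.)

# The first row has no predecessor, so distance is padded with NA there
F39$distance <- c(NA, sqrt(dx^2 + dy^2))

# Unsigned speed for rows 2..n
speed <- sqrt(dx^2 + dy^2) / dt

# Same sign rule as the loop: keep the sign when x decreases, negate otherwise;
# the starting fish speed is set to 0
F39$fishspeed <- c(0, ifelse(dx < 0, speed, -speed))

print(F39)
```

Because every column is assigned as a complete vector of length nrow(F39), the "replacement has 2 rows" problem cannot occur with this form.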