I'm implementing a loop in R. Let me simplify things as much as I can.
Suppose I have the following data frame:
# country year key potential
# 1 FRA 2010 FRA2010 0
# 2 FRA 2011 FRA2011 0
# 3 FRA 2012 FRA2012 0
# 4 FRA 2013 FRA2013 1
# 5 ITA 2010 ITA2010 1
# 6 ITA 2011 ITA2011 1
# 7 ITA 2012 ITA2012 0
# 8 ITA 2013 ITA2013 1
# 9 USA 2010 USA2010 0
# 10 USA 2011 USA2011 0
# 11 USA 2012 USA2012 1
# 12 USA 2013 USA2013 1
Then I take the unique keys for which potential == 1:
unique <- unique(df$key[df$potential == 1])
Next, I want the mean year for each country over the rows where potential == 1. I also want the minimum year per country under the same condition.
Here is my attempt:
for (i in unique) {
  mean_year <- mean(df$year[df$key == i], na.rm = TRUE)
  date <- min(df$year[df$key == i], na.rm = TRUE)
}
After the loop, mean_year and date each hold a single value. Instead, I need one value per country for both mean_year and date.
For mean_year I should get: 2013 for FRA, 2011.33 for ITA, and 2012.5 for USA.
The same logic should apply to date.
df <- structure(list(country = c("FRA", "FRA", "FRA", "FRA", "ITA",
"ITA", "ITA", "ITA", "USA", "USA", "USA", "USA"), year = c(2010L,
2011L, 2012L, 2013L, 2010L, 2011L, 2012L, 2013L, 2010L, 2011L,
2012L, 2013L), key = structure(1:12, levels = c("FRA2010", "FRA2011",
"FRA2012", "FRA2013", "ITA2010", "ITA2011", "ITA2012", "ITA2013",
"USA2010", "USA2011", "USA2012", "USA2013"), class = "factor"),
potential = c(0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1)), row.names = c(NA,
-12L), class = "data.frame")
I wouldn't use a loop for calculations like this, since the tidyverse handles them much more elegantly. But since you were quite insistent, here is a for loop example that I think shows a reasonable approach to hand-crafted looping.
df <- structure(list(
country = c(
"FRA", "FRA", "FRA", "FRA", "ITA",
"ITA", "ITA", "ITA", "USA", "USA", "USA", "USA"
), year = c(
2010L,
2011L, 2012L, 2013L, 2010L, 2011L, 2012L, 2013L, 2010L, 2011L,
2012L, 2013L
), key = structure(1:12, levels = c(
"FRA2010", "FRA2011",
"FRA2012", "FRA2013", "ITA2010", "ITA2011", "ITA2012", "ITA2013",
"USA2010", "USA2011", "USA2012", "USA2013"
), class = "factor"),
potential = c(0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1)
), row.names = c(
NA,
-12L
), class = "data.frame")
# basic prep, you say you only want to consider potential == 1
# so simply shorten the data and then no need to think on it more
(sub_df <- subset(df, potential == 1))
# your description says to loop over countries ; keys seem irrelevant
(countrycodes <- unique(sub_df$country))
(lc <- length(countrycodes))
# making an empty structure of the desired size to contain the results
(res <- data.frame(
  country = character(lc),
  mean = numeric(lc),
  min = numeric(lc)
))
# the loop
for (i in seq_len(lc)) {
  ctry <- countrycodes[i]
  years <- sub_df$year[sub_df$country == ctry]
  res[i, ] <- data.frame(
    country = ctry,
    mean = mean(years),
    min = min(years)
  )
}
res
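
For comparison, the tidyverse route mentioned at the top of this answer could look roughly like the sketch below. It assumes the dplyr package is installed. The output column names mean_year and date are borrowed from the question, the name res_dplyr is made up for illustration, and the data frame is rebuilt compactly (without the key column, which is not needed) only to keep the block self-contained.

```r
library(dplyr)  # assumption: dplyr is installed

# Same data as in the question, rebuilt compactly; the key column is left out
df <- data.frame(
  country   = rep(c("FRA", "ITA", "USA"), each = 4),
  year      = rep(2010:2013, times = 3),
  potential = c(0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1)
)

# Keep only rows with potential == 1, then summarise once per country
res_dplyr <- df %>%
  filter(potential == 1) %>%
  group_by(country) %>%
  summarise(mean_year = mean(year), date = min(year))

res_dplyr
```

This yields one row per country, so no pre-allocated result structure or explicit index is needed. The grouping step plays the role of the loop over countrycodes.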