
Select repeated cases from a variable


I am working with a database called Visitas, which records hospital visits. The variable codeep holds the patient code. Because patients come to the hospital frequently, most of them have more than one measurement. Each measurement is stored in the variable result, in a row that carries that patient's code in codeep.

What I want is to get, for each patient, all the positions (row indices) where that patient's code appears in codeep, so that I can compute the mean of result for each patient.

As an example, here is what I want for just one patient. I used which(), which returns a vector with the positions where this code appears in codeep. Now I would like to do this automatically for all patients.

To do this I tried a loop, but it doesn't work. Maybe the problem is my code.

which(Visitas[,'codeep'] == 6208)

# One loop

for (i in Visitas[, 'codeep']) {
    Visitas_TRT[i] <- which(Visitas$codeep[i] == Visitas$codeep)
} 

# Double loop

for (i in Visitas[, 'codeep']) {
  for (j in Visitas[, 'codeep']) {
    Visitas_TRT <- which(Visitas$codeep[i] == Visitas$codeep[j])
  }
} 

Any ideas?

This is the head of the dataset:


Solution

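  • It seems you're looking for ave(), which lets you create a variable holding the mean of result for each codeep. ave() returns a vector as long as its input, so every row gets the mean of its own patient's group.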

    Visitas <- transform(Visitas, result.M=ave(result, codeep, FUN=mean))
    Visitas
    #    codeep today result result.M
    # 1       1     1    6.4 5.866667
    # 2       2     1    4.4 6.066667
    # 3       3     1    5.4 4.633333
    # 4       4     1    5.6 5.766667
    # 5       5     1    5.4 5.066667
    # 6       1     2    4.9 5.866667
    # 7       2     2    6.5 6.066667
    # 8       3     2    4.9 4.633333
    # 9       4     2    7.0 5.766667
    # 10      5     2    4.9 5.066667
    # 11      1     3    6.3 5.866667
    # 12      2     3    7.3 6.066667
    # 13      3     3    3.6 4.633333
    # 14      4     3    4.7 5.766667
    # 15      5     3    4.9 5.066667
    

    Data:

    Visitas <- expand.grid(codeep=1:5, today=1:3)
    set.seed(42)
    Visitas$result <- round(rnorm(nrow(Visitas), 5), 1)
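
If the row positions themselves are needed, or a single mean per patient rather than one value per row, a minimal base R sketch could look like the following. It reuses the toy data from the answer; the names positions, means and means_df are made up for illustration and are not part of the original answer.

```r
# Toy data, built the same way as in the answer above
Visitas <- expand.grid(codeep = 1:5, today = 1:3)
set.seed(42)
Visitas$result <- round(rnorm(nrow(Visitas), 5), 1)

# Row positions for every patient at once:
# a named list with one integer vector per codeep value
positions <- split(seq_len(nrow(Visitas)), Visitas$codeep)
positions[["1"]]  # same rows as which(Visitas$codeep == 1)

# One mean per patient, as a named 1-d array ...
means <- tapply(Visitas$result, Visitas$codeep, mean)

# ... or as a data frame with one row per patient
means_df <- aggregate(result ~ codeep, data = Visitas, FUN = mean)
```

split() does for all codes what which() does for a single one, so no loop is required. tapply() and aggregate() skip the positions altogether and go straight to the per-patient means, which is probably the more direct route when the mean is the only goal.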