This is my first project in R, after just having learned Java.
I have a (large) data set that I have imported from a CSV file into a data frame.
I have identified the two relevant columns for this question: the first has the name of the patient, and the second records the patient's answer about their level of swelling.
The level of swelling is relative, i.e. better, worse, or about the same.
Not all patients have the same number of observations.
I am having difficulty converting these relative values into numerical values that can be used as part of a larger analysis.
Below is pseudocode for what I think could be an appropriate solution:
for row in 'patientname'
    patientcounter = dtfr1[row, 'patientname'];
    if dtfr1[row, 'patientname'] == patientcounter
        if dtfr1[row, 'Does.you.swelling.seem.better.or.worse'] == 'better'
            conditioncounter--;
            dtfr1[row, 'Does.you.swelling.seem.better.or.worse'] = conditioncounter;
        elseif dtfr1[row, 'Does.you.swelling.seem.better.or.worse'] == 'worse'
            conditioncounter++;
            dtfr1[row, 'Does.you.swelling.seem.better.or.worse'] = conditioncounter;
        else
            dtfr1[row, 'Does.you.swelling.seem.better.or.worse'] = conditioncounter;
    if dtfr1[row, 'patientname'] != patientcounter
        patientcounter = dtfr1[row, 'patientname'];
What would your advice be for a good solution to this problem? Thanks!
If I'm understanding correctly, you want the difference in the counts of worse and better, by patient? If so, something like this would work.
# Simulated data (no seed is set, so your values will differ)
dtfr1 <- data.frame(patient = sample(letters[1:3], 100, replace=TRUE),
                    condition = sample(c("better", "worse"), 100, replace=TRUE))
head(dtfr1)
#   patient condition
# 1       a     worse
# 2       b    better
# 3       b     worse
# 4       a    better
# 5       c     worse
# 6       a    better
better_count <- tapply(dtfr1$condition, dtfr1$patient, function(x) sum(x == "better"))
worse_count <- tapply(dtfr1$condition, dtfr1$patient, function(x) sum(x == "worse"))
worse_count - better_count
#  a  b  c
#  5  0 -1
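If the goal is instead a numeric value on every row (the running counter the pseudocode in the question seems to describe), one possible sketch is to map each answer to a step and take a cumulative sum within each patient. The small data frame, the shortened column names `patient` and `condition`, the new columns `step` and `score`, and the choice of -1 / 0 / +1 for better / about the same / worse are all assumptions made for illustration, not something taken from the original data.

```r
# Hypothetical example data; column names are shortened for illustration.
dtfr1 <- data.frame(
  patient   = c("a", "a", "a", "b", "b", "c"),
  condition = c("worse", "better", "about the same", "better", "better", "worse"),
  stringsAsFactors = FALSE
)

# Assumed scoring: better = -1, about the same = 0, worse = +1.
steps <- c("better" = -1, "about the same" = 0, "worse" = 1)

# Look up the step for each row; as.character() guards against factor columns,
# which would otherwise be indexed by their integer codes.
dtfr1$step <- unname(steps[as.character(dtfr1$condition)])

# Running total within each patient, in the order the rows appear.
dtfr1$score <- ave(dtfr1$step, dtfr1$patient, FUN = cumsum)

# Net change per patient (worse minus better), the same quantity as the
# tapply() difference computed earlier.
net_change <- tapply(dtfr1$step, dtfr1$patient, sum)

print(dtfr1)
print(net_change)
```

Because `ave()` works group by group, patients with different numbers of observations need no special handling. Any answer that is not one of the three names in `steps` would come back as `NA`, which is worth checking for on real data.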