I have a question about calculating percentages by item and time bin. The experiment is as follows:
I conducted an eye-tracking experiment. Participants were asked to describe pictures consisting of two areas of interest (AOIs; I call them Agent and Patient). Their eye movements (fixations on the two AOIs) were recorded over time while they planned their formulation. I built a dataset containing the time information and AOIs, shown below (the time from picture onset was divided into consecutive time bins of 40 ms each).
Stimulus Participant AOIs time_bin
1 M1 agent 1
1 M1 patient 2
1 M1 patient 3
1 M1 agent 4
...
1 M2 agent 1
1 M2 agent 2
1 M2 agent 3
1 M2 patient 4
...
1 M3 agent 1
1 M3 agent 2
1 M3 agent 3
1 M3 patient 4
...
2 M1 agent 1
2 M1 agent 2
2 M1 patient 3
2 M1 patient 4
I would like to create a table containing the proportion of fixations on one AOI (e.g. agent) for each stimulus and each time bin. It would look like this:
Stimulus time_bin percentage
1 1 20%
1 2 40%
1 3 55%
1 4 60%
...
2 1 30%
2 2 35%
2 3 40%
2 4 45%
I calculate the percentage because I want to run a multilevel analysis (Growth Curve Analysis) of the relationship between the dependent variable, agent fixation proportion, and the independent variable, time_bin, with stimulus as a random effect.
I hope my question is clear despite my limited English.
If you have an idea or a suggestion, that would be a great help!
Using the tidyverse package ecosystem you could try:
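library(tidyverse)

df %>%
  # Flag each row: 1 if the fixation is on the agent, 0 otherwise
  mutate(percentage = as.integer(AOIs == "agent")) %>%
  # The mean of the 0/1 flag per stimulus and time bin is the agent proportion
  group_by(Stimulus, time_bin) %>%
  summarise(percentage = mean(percentage))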
Note that this will give you ratios in the [0, 1] interval. To get percentage values you still have to multiply them by 100 and append "%".
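As a rough sketch of that conversion step, the snippet below rebuilds only the rows shown in the question (the elided "..." rows are left out, so the numbers will not match the example output table) and adds a formatted percentage column. The column names proportion and percentage and the rounding to whole percent are assumptions made for illustration.

```r
library(dplyr)

# Toy data: only the rows visible in the question; elided rows are omitted.
df <- data.frame(
  Stimulus    = c(rep(1, 12), rep(2, 4)),
  Participant = c(rep(c("M1", "M2", "M3"), each = 4), rep("M1", 4)),
  AOIs        = c("agent", "patient", "patient", "agent",
                  "agent", "agent", "agent", "patient",
                  "agent", "agent", "agent", "patient",
                  "agent", "agent", "patient", "patient"),
  time_bin    = rep(1:4, times = 4)
)

result <- df %>%
  group_by(Stimulus, time_bin) %>%
  # Share of rows in each stimulus/time-bin cell whose AOI is "agent"
  summarise(proportion = mean(AOIs == "agent"), .groups = "drop") %>%
  # Display-only column: whole-number percentage with a "%" suffix (assumed format)
  mutate(percentage = paste0(round(100 * proportion), "%"))

print(result)
```

The percentage column is a character string, so it is only suitable for display. For a model such as the growth curve analysis mentioned in the question, the numeric proportion column is probably the one to keep.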