I have data on the arrival times of species to food. I want to determine which levels of breed occur before the breed_jackals and breed_hyenas levels for each carcass, using the got.here value, which is their arrival time.
I only want the order, so in the first case, for carcass_336, I'd get one value for the jackals, which would be breed_eagles.
For the second carcass, carcass_338, I'd have 2 levels for the hyena, breed_lappets and breed_eagles, in that order, and 3 levels for the jackal because the hyena arrives before it, i.e. breed_lappets, breed_eagles and breed_hyenas.
I thought arrivals$breed[arrivals$mycarcass=="carcass_336"] would work, but it gives me all the levels.
Ideally I'd also like to pick out which level occurs directly before the jackals and hyenas, using the minimum got.here for each. E.g. for carcass_338 it would be breed_eagles for breed_hyenas. Again, the got.here value will be useful, I think, because I've used it to extract the shortest arrival times for each carcass for another purpose with:
arrivals[ arrivals$got.here == ave(arrivals$got.here, arrivals$mycarcass, FUN=min), ]
Here's my data:
arrivals <- read.table(header=T, text="
who breed got.here mycarcass
167 breed_eagles 102 carcass_336
183 breed_eagles 108 carcass_336
181 breed_eagles 271 carcass_336
134 breed_eagles 284 carcass_336
191 breed_eagles 311 carcass_336
283 breed_jackals 5419 carcass_336
118 breed_lappets 200 carcass_338
198 breed_eagles 219 carcass_338
151 breed_eagles 256 carcass_338
206 breed_hyenas 1759 carcass_338
294 breed_jackals 7948 carcass_338
235 breed_hyenas 10988 carcass_338
215 breed_hyenas 13629 carcass_338
290 breed_jackals 17013 carcass_338")
The expected output I'd like would be derived from this and would be the frequencies of these occurrences, e.g. for jackals:
preceding_breed frequency
breed_eagles 1
breed_lappets 0
breed_hyenas 1
Here is one way to count the arrivals by species prior to the jackal's arrival. There is probably a cleaner method. For clarity, I'm only going to show the solution for jackals, but getting the results for hyenas would be straightforward.
# for each carcass, calculate the first jackal arrival
first_jackals <- aggregate(got.here ~ mycarcass,
  data = arrivals[arrivals$breed == "breed_jackals", ], FUN = min)
# tabulate the number of other animals arriving before the jackal
# (breed is converted to a factor so every carcass reports the same set of
# breeds; since R 4.0, read.table no longer creates factors by default)
breeds <- factor(arrivals$breed)
beat_jackals <- sapply(unique(arrivals$mycarcass), function(i) {
  table(breeds[arrivals$mycarcass == i &
    arrivals$got.here < first_jackals$got.here[first_jackals$mycarcass == i]])})
This returns a matrix with the counts for each breed, including hyenas and jackals. Now we drop the jackals row (the hyenas stay, since they can arrive before the jackals) and add carcass names to the columns:
# drop unwanted breeds
beat_jackals <-
  beat_jackals[row.names(beat_jackals) != "breed_jackals", ]
# add carcass names to the columns
colnames(beat_jackals) <- unique(arrivals$mycarcass)
Because sapply processed the carcasses in the same order as unique(arrivals$mycarcass), we don't have to worry about misalignment.
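Since the hyena case follows the same pattern, the counting step can be wrapped in a function that takes the target breed as an argument. The following is only a sketch under a few assumptions: `count_before` is a hypothetical helper name (not part of the answer above), breed is read as character, and carcasses that the target breed never reached are simply left out of the result.

```r
# Sample data from the question, with breed kept as character.
arrivals <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
who breed got.here mycarcass
167 breed_eagles 102 carcass_336
183 breed_eagles 108 carcass_336
181 breed_eagles 271 carcass_336
134 breed_eagles 284 carcass_336
191 breed_eagles 311 carcass_336
283 breed_jackals 5419 carcass_336
118 breed_lappets 200 carcass_338
198 breed_eagles 219 carcass_338
151 breed_eagles 256 carcass_338
206 breed_hyenas 1759 carcass_338
294 breed_jackals 7948 carcass_338
235 breed_hyenas 10988 carcass_338
215 breed_hyenas 13629 carcass_338
290 breed_jackals 17013 carcass_338")

# Hypothetical helper: per carcass, count the arrivals of each breed that
# happened before the first arrival of `target`.
count_before <- function(df, target) {
  breeds <- sort(unique(df$breed))
  # keep only the carcasses that the target breed actually reached
  groups <- Filter(function(d) any(d$breed == target),
                   split(df, df$mycarcass))
  sapply(groups, function(d) {
    first <- min(d$got.here[d$breed == target])
    # fixed levels so every carcass reports the same rows
    table(factor(d$breed[d$got.here < first], levels = breeds))
  })
}

jackal_counts <- count_before(arrivals, "breed_jackals")
hyena_counts  <- count_before(arrivals, "breed_hyenas")
```

With the sample data, `hyena_counts` has a single column because no hyena reached carcass_336. Whether such carcasses should be dropped or reported as missing depends on the analysis, so treat that choice as an assumption of this sketch.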
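To get the order of arrival by breed at each carcass, you can use the following:
# assumes rows are sorted by got.here within each carcass, as in the sample data;
# simplify = FALSE keeps the result a list even if every carcass has the same number of breeds
arrival_order <- sapply(unique(arrivals$mycarcass), function(i) {
  unique(arrivals[arrivals$mycarcass == i, "breed"])}, simplify = FALSE)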
This will allow you to pull out the breed that arrived immediately prior to the jackal:
sapply(arrival_order, function(i) i[(which(i=="breed_jackals"))-1])
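To get from the preceding breed to the frequency table requested in the question, the per-carcass results can be tabulated against a fixed set of levels. This is a minimal sketch, not the only way to do it: `preceding_breed` is a hypothetical helper name, breed is assumed to be character, and the rows are sorted by got.here explicitly rather than relying on the input order.

```r
# Sample data from the question, with breed kept as character.
arrivals <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
who breed got.here mycarcass
167 breed_eagles 102 carcass_336
183 breed_eagles 108 carcass_336
181 breed_eagles 271 carcass_336
134 breed_eagles 284 carcass_336
191 breed_eagles 311 carcass_336
283 breed_jackals 5419 carcass_336
118 breed_lappets 200 carcass_338
198 breed_eagles 219 carcass_338
151 breed_eagles 256 carcass_338
206 breed_hyenas 1759 carcass_338
294 breed_jackals 7948 carcass_338
235 breed_hyenas 10988 carcass_338
215 breed_hyenas 13629 carcass_338
290 breed_jackals 17013 carcass_338")

# Hypothetical helper: for each carcass, the breed whose first arrival came
# immediately before the first arrival of `target`.
preceding_breed <- function(df, target) {
  df <- df[order(df$mycarcass, df$got.here), ]  # sort by arrival time
  sapply(split(df, df$mycarcass), function(d) {
    ord <- unique(d$breed)     # breeds in order of first arrival
    pos <- match(target, ord)  # NA if the target never arrived
    if (is.na(pos) || pos == 1) NA_character_ else ord[pos - 1]
  })
}

prev_jackal <- preceding_breed(arrivals, "breed_jackals")
prev_hyena  <- preceding_breed(arrivals, "breed_hyenas")

# Frequency of each breed directly preceding the jackals; fixing the levels
# keeps breeds with a count of zero in the table.
other_breeds <- setdiff(unique(arrivals$breed), "breed_jackals")
freq <- table(factor(prev_jackal, levels = other_breeds))
```

For the sample data, `freq` gives 1 for breed_eagles, 0 for breed_lappets and 1 for breed_hyenas, matching the expected output in the question. A carcass where the target never arrived, or arrived first, yields NA and is not counted.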