r, distribution, exponential, data-fitting

Fitting Exponential Distribution to Task Duration Counts


In my dataset, ants switch between one state (in this case a resting state) and all other states over a period of time. I am attempting to fit an exponential distribution to the number of times an ant rests for a given duration (for instance, an ant may rest for 5 seconds on 10 occasions, for 6 seconds on 5 occasions, and so on). Although the distribution of durations looks exponential, I can't fit a single-parameter exponential distribution (where the one parameter is the rate) to the data. Is this possible with my dataset, or do I need to use a two-parameter exponential distribution?

I am attempting to fit the data to the following equation (where lambda is rate):

lambda * exp(-lambda * x).

However, it doesn't seem mathematically possible to fit this to either the counts or the probability density of my data. In R, I attempt to fit the data with the following code:

 fit = nls(newdata$x.counts ~ (b*exp(b*newdata$x.mids)), start = 
 list(x.counts = 1, x.mids = 1, b = 1)) 

When I do this, though, I get the following message:

 Error in parse(text= x, keep.source = FALSE): 
 <text>:2:0: unexpected end of input
 1: ~
    ^

I believe I am getting this because it's mathematically impossible to fit this particular equation to my data. Am I correct, or is there a way to transform the data or alter the equation so that it fits? I can make it fit with the equation lambda * exp(mu * x), where mu is another free parameter, but my goal is to keep the equation as simple as possible, so I would prefer to use the one-parameter version.

Here is the data, as I can't seem to find a way to attach it as a csv: https://docs.google.com/spreadsheets/d/1euqdgHfHoDmQKXHrtOLcn5x5o81zY1sr9Kq6NCbisYE/edit?usp=sharing


Solution

  • First, you have a typo in your formula: you forgot the - sign in

    (b*exp(b*newdata$x.mids))
    

    But this is not what is throwing the error. The start argument should be a list that initializes only the parameters being estimated (here b), not x.counts or x.mids, which are data.

    So the correct version would be:

    fit = nls(newdata$x.counts ~ b*exp(-b*newdata$x.mids), start = list(b = 1))
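
To try the corrected call without the original spreadsheet, here is a minimal sketch on simulated data. The simulated durations, the true rate of 0.2, the 1-second bins, and the x.density column are all assumptions made for illustration; only x.mids, x.counts, b, and the nls() call come from the question and answer. The counts are rescaled to a density before fitting, which is my own choice rather than part of the answer: b*exp(-b*x) integrates to 1, so a one-parameter curve can only match data that are on the density scale.

```r
# Hypothetical stand-in for the real data: 2000 simulated resting durations (seconds).
set.seed(42)
durations <- rexp(2000, rate = 0.2)

# Bin into 1-second intervals; hist() returns bin midpoints and counts.
breaks <- seq(0, ceiling(max(durations)), by = 1)
h <- hist(durations, breaks = breaks, plot = FALSE)
newdata <- data.frame(x.mids = h$mids, x.counts = h$counts)

# Rescale counts to a density (assumption: equal-width bins).
bin_width <- diff(h$breaks)[1]
newdata$x.density <- newdata$x.counts / (sum(newdata$x.counts) * bin_width)

# Only the parameter b goes in `start`; the data columns are found via `data`.
fit <- nls(x.density ~ b * exp(-b * x.mids),
           data  = newdata,
           start = list(b = 1 / mean(durations)))
coef(fit)
```

Starting b at 1 / mean(durations) is a convenience: it is the maximum-likelihood rate for exponential data, so it usually sits close to the least-squares solution. If you fit the raw counts instead, as in the answer, expect the single parameter to absorb the overall scale as well as the decay rate.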