Let's say you want to predict, as a probability, when the boat will next visit. You start taking observations at an arbitrary point in the boat's cycle. Each observation records only whether the boat is visible or not (assume the boat is always visible when it is at the right point in the cycle). In this world the cycle length is unknown, though the pattern repeats, and the visit duration is also unknown but always shorter than the cycle length. Also assume the cycle is a fixed natural phenomenon that probably won't change.
Case 1. In the first hour of observation you do not see a boat, so the predicted probability of a boat in the next hour is arbitrary. In the second hour we observe a boat, so we predict a high probability for hour 3. In hour 4 we observe no boat, so we can now establish that the boat is usually observable for 2 hours (hours 2 and 3). We keep making observations, and in hour 7 the boat is visible again. Only at this point do we know both the cycle length (5 hours) and the duration for which the boat is observable (2 hours).
Case 2. In the first hour of observation you see a boat, so the predicted probability is high for the next hour. In hour 4 you observe no boat. At this point the boat's visibility is at least 3 hours. We observe the boat again at hours 5, 6, 7 and 8, and no boat at hour 9. Only after hour 9 can we safely say the cycle is 5 hours and the visibility is 4 hours.
Case 3. In the first hour you see a boat. You go to sleep for 3 hours. In hour 5 you don't see a boat. You go to sleep for 3 hours. In hour 9 you see a boat. What's the probability of seeing a boat in hours 10, 11 and 12?
What algorithm can I use to solve this? I'm thinking a hidden Markov model might work, because there is an underlying phenomenon that is not directly observable. But in this case the phenomenon isn't completely known. In my particular case, I can initialize the algorithm with average cycle lengths. The real motivation for creating this algorithm is that the observations are few and far between. This program would be most valuable during the training phase, because if the cycle lengths and our position in the cycle were known, things would be trivial.
The following is roughly what could be output given 0, 1, 2 and 3 consecutive observations (X means an observation that saw the boat, O means no boat), using an average historical cycle length of 8 hours and a boat duration of 2 hours. Looking closely at the chart, you'll notice that there is a spread of increased probability around where the boat might return.
I'm not an expert on this kind of modelling, but I suggest you maintain competing theories.
Case 1:
Hour 1: No boat. So the length of the "off" phase is at least one, and the length of the "on" phase could be anything. We can write that as [1+, 0+], meaning [off, on] with "+" read as "at least". The length of the cycle is (1+) + (0+) = 1+.
Hour 2: Boat. The model is now [1+, 1+], which does not predict hour 3, but we have seen the boat as often as not, so we calculate a probability of 1/2. The length of the cycle is 2+.
Hour 3: No obs. The theory splits. If there had been no boat, we'd have [1+, 1] (and predict 1/3); if there had been a boat, we'd have [1+, 2+] (and predict 2/3). So our model is {[1+,1], [1+,2+]} and we predict 1/2.
Hour 4: No boat. We modify the theories: {[2+,1], [1+,2]} and predict 3/8.
Hour 5: No obs. The model bifurcates again:
[2+,1] -> [3+,1], [2,1]
[1+,2] -> [2+,2], [1,2]
Note that two of these theories claim to be complete (but make opposing predictions about hour 6). Prediction is 2/5 or 40%.
Hour 6: No obs. Incomplete theories bifurcate:
[3+,1] -> [4+,1], [3,1]
[2,1]
[2+,2] -> [3+,2], [2,2]
[1,2]
Prediction is 1/4.
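Hour 7: Boat. This demolishes three theories, vindicates one, and completes two:
[4+,1] -> [4,1] (completed)
[3,1] (demolished: it expects the boat at hour 6, not 7)
[2,1] (demolished: it expects the boat at hours 5 and 8)
[3+,2] -> [3,2] (completed)
[2,2] (vindicated: it expects the boat at hours 6 and 7)
[1,2] (demolished: it expects no boat at hour 7)
The survivors are [4,1], [3,2] and [2,2], so the period is 4 or 5 and visibility is 1 or 2 hours. Only [3,2] expects a boat at hour 8, so the prediction for hour 8 is 1/3.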
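If you'd rather let a program do the bookkeeping, here is a minimal sketch in Python. It is not quite the same procedure as the hand-worked one: rather than carrying open-ended theories like [3+,2], it brute-forces every fully specified (off, on, start) combination up to an assumed `max_cycle`, keeps those that agree with all observations, and counts each survivor equally. The function names and `max_cycle` are made up for illustration. Because it only counts complete theories, its intermediate probabilities need not match the hand-worked ones, but for the case 1 observations it agrees with the 1/3 prediction for hour 8.

```python
from fractions import Fraction


def boat_visible(hour, off, on, start):
    """True if the boat is visible at `hour`, given `on` visible hours
    followed by `off` hidden hours, with a visible phase beginning at `start`."""
    return (hour - start) % (off + on) < on


def consistent_theories(observations, max_cycle):
    """Enumerate every (off, on, start) that agrees with all observations.

    `observations` maps hour -> True (boat seen) or False (no boat).
    Hours without an entry are treated as "no observation".
    """
    survivors = []
    for cycle in range(2, max_cycle + 1):
        for on in range(1, cycle):        # the visit is shorter than the cycle
            off = cycle - on
            for start in range(cycle):    # every possible phase of the cycle
                if all(boat_visible(h, off, on, start) == seen
                       for h, seen in observations.items()):
                    survivors.append((off, on, start))
    return survivors


def predict(theories, hour):
    """Fraction of surviving theories that expect a boat at `hour`."""
    if not theories:
        raise ValueError("no theory is consistent with the observations")
    hits = sum(boat_visible(hour, off, on, start)
               for off, on, start in theories)
    return Fraction(hits, len(theories))


# Case 1 as worked through here: hours 3, 5 and 6 have no observation.
case1 = {1: False, 2: True, 4: False, 7: True}
theories = consistent_theories(case1, max_cycle=12)
print(sorted({(off, on) for off, on, _ in theories}))  # surviving [off, on] pairs
print(predict(theories, 8))  # -> 1/3
```

Weighting every survivor equally is the simplest choice; if you have historical cycle lengths you would probably want to weight the theories instead.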
Case 2:
Hour 1: Boat. [0+,1+]
Hour 2: No obs. [1+,1+], [0+,2+]. Predict 3/4.
Hour 3: No obs. [2+,1+], [1,1+], [1+,2+], [0+,3+]. Prob 2/3.
Hour 4: No boat. [3+,1+], [1,1], [2+,2+], [1+,3+]. Prob 5/8.
Hour 5: Boat. [3,1+], [1,1], [2,2+], [1,3+]. Prob 3/5.
Hour 6: Boat. [3,2+], [2,2+], [1,3+]. Prob 2/3.
Hour 7: Boat. [3,3+], [2,3+], [1,3+]. Prob 5/7.
Hour 8: Boat. [3,4+], [2,4+], [1,4+]. Prob 3/4.
Hour 9: No boat. [3,4], [2,4], [1,4]. Prob 1/3. Visibility is 4 hours, but period is unknown.
I won't work through case 3, but you get the idea.
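For what it's worth, case 3 can be attacked mechanically as well. The sketch below is one possible approach, not a definitive one: it enumerates every (cycle, on, start) hypothesis up to an assumed `max_cycle`, discards those that contradict an observation, and weights the rest by a prior centred on the historical averages from the question (cycle of 8 hours, visit of 2 hours). The Gaussian shape of the prior, the `spread` parameter, the uniform prior over phase and all function names are my assumptions, so treat the printed numbers as illustrative only.

```python
import math


def boat_visible(hour, cycle, on, start):
    """True if the boat is visible at `hour` for a cycle of length `cycle`
    whose visible phase lasts `on` hours and begins at hour `start`."""
    return (hour - start) % cycle < on


def posterior(observations, max_cycle, mean_cycle, mean_on, spread=2.0):
    """Return {(cycle, on, start): weight} over hypotheses consistent with
    the observations, with weights normalised to sum to 1."""
    weights = {}
    for cycle in range(2, max_cycle + 1):
        for on in range(1, cycle):
            # Assumed prior: Gaussian-shaped around the historical averages.
            dist2 = (cycle - mean_cycle) ** 2 + (on - mean_on) ** 2
            prior = math.exp(-dist2 / (2 * spread ** 2))
            for start in range(cycle):
                if all(boat_visible(h, cycle, on, start) == seen
                       for h, seen in observations.items()):
                    # Assumed uniform prior over the phase within a cycle.
                    weights[(cycle, on, start)] = prior / cycle
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the observations")
    return {hyp: w / total for hyp, w in weights.items()}


def predict(post, hour):
    """Posterior probability that the boat is visible at `hour`."""
    return sum(w for (cycle, on, start), w in post.items()
               if boat_visible(hour, cycle, on, start))


# Case 3: boat at hour 1, no boat at hour 5, boat at hour 9.
case3 = {1: True, 5: False, 9: True}
post = posterior(case3, max_cycle=24, mean_cycle=8, mean_on=2)
for hour in (10, 11, 12):
    print(hour, round(predict(post, hour), 3))
```

Observations are treated as exact here, which matches the assumption that the boat is always visible when it is present. If sightings could be missed, the hard filter would need to become a likelihood.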