I am trying to cluster customers' consumption behavior using time series techniques. Customers buy tokens and use them whenever they want (a maximum of 4 tokens per day). The image below shows a sample of the customer journey time series (x = days after first order, y = number of tokens consumed per day).
I tried clustering with derived variables (median delay between two events, standard deviation of the delays, total number of tokens, time between first and last consumption, mean number of tokens consumed per consumption event, ...). I used K-means, and this gave me some good results, but it wasn't enough to spot all the patterns in the data. I looked at some papers about the use of dynamic time warping in such cases, but I have never used such algorithms. Are there any materials (demos) on using such algorithms to cluster time series like these?
Yes.
There are many techniques that can be useful here.
The obvious approach from the literature would be hierarchical agglomerative clustering (HAC) with dynamic time warping (DTW) as the distance measure.
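As a rough illustration of that idea, here is a minimal sketch in Python using only NumPy and SciPy. The toy series, the helper names `dtw_distance` and `cluster_series`, the absolute-difference local cost, and the choice of average linkage are all assumptions made for the example, not something taken from your data. It computes a pairwise DTW distance matrix and passes it to SciPy's hierarchical clustering:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def dtw_distance(a, b):
    """Plain O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def cluster_series(series, n_clusters):
    """HAC on a precomputed DTW distance matrix (average linkage is an assumption)."""
    k = len(series)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    # linkage expects the condensed (upper-triangle) form of the matrix.
    Z = linkage(squareform(dist), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")


# Made-up daily token counts (0 to 4 per day); series may differ in length.
series = [
    [4, 4, 3, 4, 0, 0, 0, 0],     # heavy use early, then nothing
    [4, 3, 4, 4, 4, 0, 0],        # same pattern, slightly shifted
    [0, 1, 0, 0, 1, 0, 0, 1, 0],  # occasional single tokens
    [1, 0, 0, 1, 0, 0, 0, 1],     # same pattern, different timing
]
labels = cluster_series(series, n_clusters=2)
print(labels)
```

The pure-Python DTW loop is quadratic in series length and the distance matrix is quadratic in the number of customers, so for a large customer base you would likely want a dedicated DTW implementation, a warping window constraint, or clustering on a sample first. Plotting the dendrogram from `Z` is also a reasonable way to pick the number of clusters rather than fixing it up front.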