statistics, video-streaming, data-analysis

How to detect bad video streams


I'm trying to do some data analysis on my streaming video site. To help determine whether problems are on my side or the user's, I've started collecting the mean and standard deviation of each user's bandwidth for their streams. What I'm not sure about is how to determine what a normal stream should look like.

To figure out what a normal stream should look like I was thinking of finding the following:

  1. Mean of Means - What is normal bandwidth
  2. StdDev of Means - How much does the population's bandwidth vary
  3. Mean of StdDevs - What's the normal amount of variation
  4. StdDev of StdDevs - How much does the average StdDev vary

Do these statistics make sense?

Basically, I'm trying to detect bad streams by looking for things like low bandwidth or highly variable bandwidth. So, I figured I could find some baselines and then look for outliers.
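
As a rough illustration of the baseline-and-outlier idea, here is a minimal Python sketch. It assumes each stream has already been reduced to a (mean, stddev) pair. The function names, the kbps units, and the cutoff of k = 2 standard deviations are all hypothetical choices for the example, not something from the post.

```python
import statistics

def baseline(stream_stats):
    """Compute the four baseline statistics.

    stream_stats: list of (mean_bandwidth, stddev_bandwidth) tuples,
    one tuple per stream (units assumed to be kbps for illustration).
    """
    means = [m for m, _ in stream_stats]
    stds = [s for _, s in stream_stats]
    return {
        "mean_of_means": statistics.mean(means),
        "std_of_means": statistics.stdev(means),
        "mean_of_stds": statistics.mean(stds),
        "std_of_stds": statistics.stdev(stds),
    }

def flag_bad_streams(stream_stats, k=2.0):
    """Return (index, reasons) for streams that look like outliers.

    k is an assumed cutoff: a stream is flagged when its mean is more than
    k standard deviations below the typical mean, or its stddev is more
    than k standard deviations above the typical stddev.
    """
    b = baseline(stream_stats)
    low_cutoff = b["mean_of_means"] - k * b["std_of_means"]
    jitter_cutoff = b["mean_of_stds"] + k * b["std_of_stds"]
    flagged = []
    for i, (m, s) in enumerate(stream_stats):
        reasons = []
        if m < low_cutoff:
            reasons.append("low bandwidth")
        if s > jitter_cutoff:
            reasons.append("high variability")
        if reasons:
            flagged.append((i, reasons))
    return flagged

# Nine ordinary streams, one slow stream, one jittery stream.
streams = [(5000, 200)] * 9 + [(1000, 200), (5000, 2000)]
bad = flag_bad_streams(streams)
```

Note that the outliers themselves inflate the baseline standard deviations, so with few streams a simple cutoff like this can miss bad streams.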

Also, keeping all the data for each sample is not feasible, so I can only work with aggregate statistics. If there's anything else you would suggest I log that would be a great help as well.
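
One way to get a per-stream mean and standard deviation without storing every sample is Welford's online algorithm. The post does not say how the aggregates are currently collected, so this is only a sketch of one option, and the `RunningStats` class name is made up for the example.

```python
import math

class RunningStats:
    """Welford's online algorithm: running mean and variance in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        """Fold one bandwidth sample into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def stddev(self):
        """Sample standard deviation; reported as 0.0 for fewer than two samples."""
        if self.n < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.n - 1))

# Feed samples one at a time; only n, mean, and _m2 are kept.
rs = RunningStats()
for sample in [2, 4, 4, 4, 5, 5, 7, 9]:
    rs.add(sample)
```

Only three numbers per stream need to be kept, and the sample count `n` is worth logging alongside the mean and standard deviation.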


Solution

  • If the number of observations is large enough (say > 30), simply build a confidence interval for the mean (here with alpha = 0.05):

    CI = [sample_mean - 1.96 * sample_stdev / sqrt(n); sample_mean + 1.96 * sample_stdev / sqrt(n)]

    where n is the number of observations. This is a range that you can be 95% confident contains the true mean. For a different confidence level, look up z(alpha/2) in a statistical table and substitute its value for 1.96 (which corresponds to alpha = 0.05); a smaller alpha gives a wider interval.

    P.S. The following parameters don't make much sense to me:

      • Mean of StdDevs - What's the normal amount of variation
      • StdDev of StdDevs - How much does the average StdDev vary
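
The confidence interval formula from the first bullet can be sketched in Python as follows. The function names are hypothetical, and the example numbers are invented purely to show the arithmetic. `z_for_alpha` uses the standard library's `statistics.NormalDist` in place of a printed z table.

```python
import math
from statistics import NormalDist

def z_for_alpha(alpha):
    """Two-sided critical value z(alpha/2) of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_confidence_interval(sample_mean, sample_stdev, n, alpha=0.05):
    """Normal-approximation confidence interval for the mean.

    Assumes n is large enough (roughly > 30) for the approximation to hold.
    """
    half_width = z_for_alpha(alpha) * sample_stdev / math.sqrt(n)
    return (sample_mean - half_width, sample_mean + half_width)

# Invented example: 100 observations, mean 5000, standard deviation 400.
low, high = mean_confidence_interval(5000, 400, 100)
```

With these example inputs the half-width is about 1.96 * 400 / 10 = 78.4, so the interval is roughly 4921.6 to 5078.4.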