random, distribution, test-data

Which distributions can be used to generate job start times when no real observations are available?


I need to generate data containing the start time of each job (30 jobs in total). I have no way of obtaining real data, so how can I generate data that resembles a plausible real-world distribution? Which distribution would be a reasonable choice in this case?


Solution

  • A common technique used in simulation models where you don't have any data yet (e.g., data is very expensive, or it's a prospective system that does not even exist yet so where would you get the data from?) is to use a triangular distribution parameterized by subject matter experts (or your own best guesses) about the smallest, largest, and most common value you might see.

    A relatively new, but quite powerful extension to this would be to vary the parameter choices in a designed set of experiments to see how much it matters if your guesstimates are off. A well-designed experiment would allow you to assess and characterize how much your results change as a function of the parameter values.

    A more comprehensive variant would be to incorporate the distribution choice itself (triangular vs exponential vs anything else you think is plausible) into the design, to see whether that makes much of a difference. In the happy event that it doesn't, you can freely use a simple and convenient distribution choice such as the triangular; if it makes a big difference, you now know for certain that you should get your hands on real data ASAP, because without that data-based knowledge you're operating in a garbage-in-garbage-out mode. This also assumes that you control for, say, the first two moments as you switch between distribution choices, so that your experiments are testing the shape of the distribution rather than the effect of its mean and variance.

    If designed experiments tell you it doesn't much matter, that's wonderful news. If it does matter, you now know more about the system than you did before and know where to focus your efforts going forward.
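
As a rough illustration of the triangular approach described above, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption made for the example: the bounds (0, 120, and 480 minutes after some reference time), the seed, and the helper names `triangular_start_times` and `triangular_moments` are hypothetical placeholders for whatever your subject matter experts suggest. The loop at the end is only a crude one-factor sensitivity check, not a properly designed experiment.

```python
import random
import statistics


def triangular_start_times(n_jobs, low, mode, high, seed=None):
    """Draw n_jobs start times from a triangular distribution, sorted ascending."""
    rng = random.Random(seed)
    # Note the argument order of the stdlib call: (low, high, mode).
    return sorted(rng.triangular(low, high, mode) for _ in range(n_jobs))


def triangular_moments(low, mode, high):
    """Closed-form mean and variance of a triangular(low, mode, high) distribution."""
    mean = (low + mode + high) / 3.0
    var = (low**2 + mode**2 + high**2
           - low * mode - low * high - mode * high) / 18.0
    return mean, var


# Assumed expert guesses: smallest, most common, and largest start time (minutes).
LOW, MODE, HIGH = 0.0, 120.0, 480.0

start_times = triangular_start_times(30, LOW, MODE, HIGH, seed=42)

# Mean and variance to hold fixed if you later swap in another distribution
# family, so that only the shape changes between experiments.
target_mean, target_var = triangular_moments(LOW, MODE, HIGH)

# Crude sensitivity check: vary the guessed mode and watch a summary statistic.
for mode_guess in (60.0, 120.0, 180.0):
    sample = triangular_start_times(30, LOW, mode_guess, HIGH, seed=42)
    print(mode_guess, round(statistics.mean(sample), 1))
```

In a real study you would replace the sample mean with the output of your simulation model and vary all three parameters (and possibly the distribution family) according to an experimental design, rather than changing one guess at a time.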