matlab, probability, probability-theory

How to create a simulated censored dataset in MATLAB


I would like to write code that generates a censored dataset with a single censoring point (one limit of detection) and a varying percent censored. I have the following code to generate random numbers, but they are not censored:

n=input('Enter sample size:');
GM=input('Enter geometric mean:'); 
GSD=input('Enter geometric standard deviation:');
m=input('Enter desired number of dataset:');
x = lognrnd(log(GM), log(GSD),n,m);

I also have the following code, which creates a censored dataset from a known limit of detection (lod) and then calculates the percent censored, so that I have a dataset to work with:

c = (x < lod);   % c marks the values below the LOD
x(c) = lod;      % replace the censored values with the single LOD
100*mean(c, 1)   % percent censored in each dataset (column of x)

What I want to do instead is give the computer the desired percent censored and have it find the lod that corresponds to that percent. I can enter the lod manually, but that takes a very long time if I want to create datasets with 5-95 percent censored.

The goal is to create censored datasets with a range of percent censored for a simulation. I've been doing it one dataset at a time, and it's taking a very long time. Please let me know if this all makes sense.


Solution

  • If you have the Statistics Toolbox, you can use the PRCTILE function. When x is a matrix it works column by column, so lod has one value per dataset:

    pct = 10;
    lod = prctile(x, pct);
    

    or QUANTILE, which takes a fraction between 0 and 1 rather than a percent (it actually uses prctile inside):

    pct = 0.1;
    lod = quantile(x,pct);
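
To cover the whole 5-95 range in one run, the percentile lookup can be placed in a loop. The following is a minimal sketch, not a definitive implementation. It assumes the Statistics Toolbox (for lognrnd and prctile) and left-censoring, where values below the LOD are replaced by the LOD. The example values of n, GM, GSD and m, and the names pcts, lods, xc and achieved, are illustrative and not from the question.

```matlab
% Sketch: build one censored copy of x for each target percent censored.
% Assumes the Statistics Toolbox (lognrnd, prctile) and left-censoring.
rng(1);                                  % fixed seed so the run is repeatable
n = 1000; GM = 2; GSD = 1.5; m = 3;      % example inputs in place of input()
x = lognrnd(log(GM), log(GSD), n, m);    % n-by-m, one dataset per column

pcts     = 5:5:95;                       % desired percent censored
lods     = zeros(numel(pcts), m);        % LOD per target (row) and dataset (column)
achieved = zeros(numel(pcts), m);        % percent actually censored
xc       = cell(numel(pcts), 1);         % censored copies of x

for k = 1:numel(pcts)
    lods(k, :) = prctile(x, pcts(k));    % column-wise percentile gives each LOD
    L = repmat(lods(k, :), n, 1);        % expand the LODs to the size of x
    c = x < L;                           % values below their column's LOD
    xk = x;
    xk(c) = L(c);                        % replace censored values with the LOD
    xc{k} = xk;
    achieved(k, :) = 100 * mean(c, 1);   % percent censored in each dataset
end
```

After the loop, xc{k} holds the datasets censored at roughly pcts(k) percent, and achieved can be compared with pcts to check how close each column came. Because prctile interpolates between sorted values, the achieved percent can differ slightly from the target for small n.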