Running mean or binning of data


I have two columns as follows.

ABC =

4.1103   25.5932
5.0852   31.2679
6.0021   15.9020
5.8495   21.4804
4.3245   19.9674
5.9378   38.3452
6.9460    8.8233
7.4568   44.7429
5.7358   32.7608
5.3510   35.2645
5.1657   54.6566
5.1381   44.1870
4.1566  101.8947
5.7310   -3.0565
5.5496   28.3637
4.5672   -1.7736
4.5805   11.8384
4.7948   33.7640
3.9901    6.0607
4.4203   17.7308
4.2712   -1.5834
4.8808   -2.3123
5.9004   -0.4623
5.3929    1.1477
5.6594    6.9741
5.5114   11.3982
5.4715    5.9189
5.0021    6.2561
4.1576   10.3207
6.1025    3.4654
3.9960    6.6892
5.6938    3.8429
5.2416    7.7513
7.0922    2.6871
5.3277   14.0617
6.1350    4.0316
6.0211  -20.3587
6.7399   14.0224
5.0818  102.6360
5.6444   24.3167
6.2542   19.8522
6.2862   24.3430
5.6452   -6.4020
5.4561   14.7813
4.7934    9.4639
3.8523   32.0766
3.9878    8.5313
4.5232   42.0309
4.2489  -12.0325
6.0413   -5.5464
4.9334   -3.2520
4.1349   20.9038
4.2329   20.6303
4.2009   31.8840
4.0624   48.5402
4.7674   28.6595
4.0767    4.7767
4.0971   34.8460
3.8442   24.0209
5.2471   38.8815
6.0241   59.3785
6.9743    6.5027
7.8732    4.5422
4.3094   68.4340
4.5601   -4.2946
4.6140  109.4510
4.5862   71.8387
5.2210   66.1310
4.3835   32.7592
6.1432   36.3832
5.4624   13.7891
5.2129   40.1301
3.8987   67.2705
6.6328   15.0286
8.0786   -7.3078
4.8968   -6.7754
4.1200    4.5333
4.1098   -3.3204
4.0373   26.4890
3.8467   48.8121
7.7795   -2.3606
6.9553   21.3609
6.2635   24.4985
6.1518   -1.4200
4.9115   11.5784
5.5908   13.1351
7.0117   -2.8297
5.2193   38.6937
6.0786   16.9453
6.8229   14.0907
8.0385   13.6228
8.6596   -1.4478
6.3257    8.0361
6.9223  -14.2179
3.8337   15.5773
4.0039  -24.1494
4.6332   17.9308
6.3684   11.3398
5.8592    4.0367
6.9040   12.1495
7.8524   -0.0432
8.3545   10.8865
9.3946   20.4614
4.3015   25.9674
4.4782   21.9045
4.1994   39.2286
4.3499   22.1004
4.3652   33.6220
4.2026   -5.8153
5.1330    6.4996
5.3118   33.7835
4.2002   -3.1917
3.8285   32.1016
3.9485   21.6358
3.8688   21.7830
4.0494   24.7914
4.0869   10.6577
4.6699    8.4756
5.1199   11.1885
5.1831    8.6163
4.5560    8.2806
4.4886    4.8017
4.5618    5.9434
4.1135   12.8942
4.1377   22.1423

I split 'x' into equal-width bins and computed the mean 'yy' of the corresponding 'y' values in each bin, as shown below.

x=ABC(:,1);
y=ABC(:,2);
counter=1;
for i=min(x):0.3:max(x)
    bin = x>i & x<=i+0.3;            % logical mask of the points in (i, i+0.3]
    xbin(counter,1) = mean(x(bin));  % mean x of the bin
    yy(counter,1)   = mean(y(bin));  % mean y of the bin
    counter = counter+1;
end

plot(x,y,'ro'); hold on
plot(xbin,yy,'bo-'); 

Here a 'bin' covers a fixed range of 'x' (see the for loop). The output contains 'xbin', the mean of 'x' in each bin, and 'yy', the mean of the corresponding 'y' values. My concern is that each mean 'yy' should be computed from approximately the same number of data points. If a bin does not contain enough 'y' data points, its mean 'yy' should be NaN. Can someone please help with this? Thanks.


Solution

  • Count the number of 1s (true entries) in bin on each iteration of your for-loop. If that count is below a chosen threshold, assign NaN to yy:

    x=ABC(:,1);
    y=ABC(:,2);
    counter=1;
    
    nbinmin = 5; % this is the threshold
    
    for i=min(x):0.3:max(x)
        bin= x>i &  x<= i+0.3;
        xbin(counter,1)  = mean(x(bin));
    
        % check if the number of points in the bin (true entries) is below the threshold
        if nnz(bin) < nbinmin
            yy(counter,1)    = NaN;
        else
            yy(counter,1)    = mean(y(bin));
        end
        counter = counter+1;
    end
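
The same thresholding can probably also be done without a loop. The following is only a sketch, not a drop-in replacement: it assumes a MATLAB release that has histcounts (R2014b or later) and that x contains no NaN values. It uses the first 12 rows of ABC as toy data, so the threshold is lowered to 2 for illustration. Note that histcounts uses left-closed bins [edge, edge+0.3), whereas the loop above uses (i, i+0.3]. As a result the point at min(x) is counted here, while the strict x>i test in the loop leaves it out, so the two results can differ slightly at bin boundaries.

```matlab
% Toy data: the first 12 rows of ABC from the question.
x = [4.1103 5.0852 6.0021 5.8495 4.3245 5.9378 6.9460 7.4568 5.7358 5.3510 5.1657 5.1381]';
y = [25.5932 31.2679 15.9020 21.4804 19.9674 38.3452 8.8233 44.7429 32.7608 35.2645 54.6566 44.1870]';

width   = 0.3;  % bin width, as in the question
nbinmin = 2;    % assumed threshold, lowered because the toy sample is small

% Edges run one width past max(x) so that every point lands inside a bin.
edges = min(x):width:(max(x) + width);
nb    = numel(edges) - 1;

% counts(k) = number of points in bin k; idx(j) = bin index of x(j).
[counts, ~, idx] = histcounts(x, edges);
counts = counts(:);

% Per-bin sums divided by per-bin counts give the per-bin means.
% Empty bins evaluate to 0/0 = NaN, like mean([]) in the loop version.
xbin = accumarray(idx, x, [nb 1]) ./ counts;
yy   = accumarray(idx, y, [nb 1]) ./ counts;

% Bins with too few points are flagged as NaN.
yy(counts < nbinmin) = NaN;
```

With the full data set you would set x=ABC(:,1), y=ABC(:,2) and nbinmin=5 as above. The counts vector is also useful on its own for checking how evenly the points are spread across the bins.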