javascript, charts, d3.js, bigdata, date-arithmetic

How to reduce fragmentation data in d3?


I have a lot of values and want to visualize the fragmentation of their sum.

For example, I have the values 1, 1, 1, 2, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 8, 10, which sum to 86. Now I want to visualize how this 86 is built up from the individual values. The goal is to see whether it comes mostly from many small values or from a few big ones.

In reality I have about 20000 values.

So what I want is an area or line diagram where the x axis is linear from the smallest to the biggest number in the value set (1 to 10 in this example) and the y axis represents the part of the sum that is contributed by the values of that size.

For 1 to 10 it would be easy to simply draw a bar for each number, like this:

              #
              #
              #
              #
           #  #        #
           #  #        #
           #  #        #
           #  #        #
           #  #  #     #
           #  #  #     #
           #  #  #     #     #
           #  #  #     #     #
           #  #  #     #     #
           #  #  #  #  #     #
           #  #  #  #  #     #
           #  #  #  #  #     #
           #  #  #  #  #     #
  #        #  #  #  #  #     #
  #  #     #  #  #  #  #     #
  #  #     #  #  #  #  #     #
  1  2  3  4  5  6  7  8  9  10
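
The height of each bar in a chart like this is just the value multiplied by how often it occurs. A minimal sketch of that grouping in plain JavaScript, using the example values from above (the names `values`, `sumByValue`, and `total` are made up for illustration):

```javascript
// Example values from the question.
var values = [1, 1, 1, 2, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 8, 10];

// Map each distinct value to the part of the total it contributes.
var sumByValue = {};
values.forEach(function (value) {
  sumByValue[value] = (sumByValue[value] || 0) + value;
});

// Overall sum, to check that the bars add up to it.
var total = values.reduce(function (acc, value) { return acc + value; }, 0);

console.log(total);      // 86
console.log(sumByValue); // e.g. the entry for 5 is 20, the entry for 4 is 16
```

Values that never occur (3 and 9 here) simply have no entry, which corresponds to an empty column.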

But in my case the x axis is linear, and I have all kinds of values from 10 to 100000.

So I have more values and a wider x scale than there are pixels in the width. What is the best approach to calculate this diagram? My question is not how to actually paint the diagram, but how to reduce the values.

I could just take each pixel on the x axis, collect the values of my data that correspond to it, calculate their sum, and paint a line. But that seems both inefficient and inelegant. It could also produce a hard break if I have two columns with very big values and a single pixel-wide column with no data. It would be nice to have a way to visualize it in a more "flowing" way.

So is there a better way to calculate my diagram? I think my idea would distort the diagram. Is there a way to prevent this? And how can I add the "flow" between the values?

Thanks for your help!


Solution

  • You could split your data into intervals. Instead of showing a column for each value, show a column for each interval. Set the number of intervals depending on how much space you have available.

    Rough idea:

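    // Number of columns; choose it from the available width.
    var intervalCount = 20;
    
    var myValues = [0, 1, 1, 2, 500, 5000, 10000, 10001, 10002, 10002];
    
    var min = Math.min.apply(null, myValues);
    var max = Math.max.apply(null, myValues);
    // Fall back to 1 so the division below is safe when all values are equal.
    var intervalSize = (max - min) / intervalCount || 1;
    
    // One counter per interval, initialised to 0.
    var myUpdatedValues = [];
    for (var i = 0; i < intervalCount; i++) myUpdatedValues.push(0);
    
    myValues.forEach(function (value) {
      // Clamp the index so the maximum lands in the last interval, not one past it.
      var index = Math.min(Math.floor((value - min) / intervalSize), intervalCount - 1);
      myUpdatedValues[index]++;
    });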
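
The snippet above counts how many values fall into each interval. Since the question asks for the part of the sum per interval, one possible variation is to add the value itself to its interval instead of incrementing a counter, and then run a small moving average over the columns to soften hard breaks between neighbours. This is only a sketch under those assumptions: `binSums`, `smooth`, and the window `radius` are illustrative names and choices, not part of d3.

```javascript
// Sum the values per interval instead of counting them.
// `binCount` would typically be derived from the chart width in pixels.
function binSums(values, binCount) {
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  var binSize = (max - min) / binCount || 1; // fall back to 1 if all values are equal
  var sums = [];
  for (var i = 0; i < binCount; i++) sums.push(0);
  values.forEach(function (value) {
    // Clamp so the maximum value lands in the last interval.
    var index = Math.min(Math.floor((value - min) / binSize), binCount - 1);
    sums[index] += value;
  });
  return sums;
}

// Simple moving average: each column becomes the mean of itself and up to
// `radius` neighbours on each side (fewer at the edges).
function smooth(columns, radius) {
  return columns.map(function (_, i) {
    var from = Math.max(0, i - radius);
    var to = Math.min(columns.length - 1, i + radius);
    var sum = 0;
    for (var j = from; j <= to; j++) sum += columns[j];
    return sum / (to - from + 1);
  });
}

// Example values from the question, reduced to 3 columns.
var sample = [1, 1, 1, 2, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 8, 10];
var sums = binSums(sample, 3);  // [5, 48, 33], which still adds up to 86
var smoothed = smooth(sums, 1);
console.log(sums, smoothed);
```

Note that this kind of edge-truncated moving average changes the column heights, so the smoothed columns no longer add up exactly to the original sum. If that matters, plot the unsmoothed `sums` and let the line or area interpolation provide the "flow" instead.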