Tags: algorithm, math, graph

Choosing an attractive linear scale for a graph's Y Axis


I'm writing a bit of code to display a bar (or line) graph in our software. Everything's going fine. The thing that's got me stumped is labeling the Y axis.

The caller can tell me how finely they want the Y scale labeled, but I seem to be stuck on exactly what to label them in an "attractive" kind of way. I can't describe "attractive", and probably neither can you, but we know it when we see it, right?

So if the data points are:

   15, 234, 140, 65, 90

And the user asks for 10 labels on the Y axis, a little bit of finagling with paper and pencil comes up with:

  0, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250

So there's 10 there (not including 0), the last one extends just beyond the highest value (234 < 250), and it's a "nice" increment of 25 each. If they asked for 8 labels, an increment of 30 would have looked nice:

  0, 30, 60, 90, 120, 150, 180, 210, 240

Nine would have been tricky. Maybe just using either 8 or 10 and calling it close enough would be okay. And what to do when some of the points are negative?

I can see Excel tackles this problem nicely.

Does anyone know a general-purpose algorithm (even some brute force is okay) for solving this? I don't have to do it quickly, but it should look nice.


Solution

  • A long time ago I wrote a graph module that covered this nicely. Digging through the grey matter turns up the following:

    • Determine the lower and upper bound of the data. (Beware of the special case where lower bound = upper bound!)
    • Divide the range by the required number of ticks.
    • Round the tick range up to a nice amount.
    • Adjust the lower and upper bound accordingly.

    Let's take your example:

    15, 234, 140, 65, 90 with 10 ticks
    
    1. lower bound = 15
    2. upper bound = 234
    3. range = 234-15 = 219
    4. tick range = 21.9. This should be 25.0
    5. new lower bound = 25 * floor(15/25) = 0
    6. new upper bound = 25 * floor(1+234/25) = 250

    So the range = 0,25,50,...,225,250
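Snapping the bounds to multiples of the tick size (steps 5 and 6) could be sketched roughly like this. The class and method names are made up for illustration. It assumes rounding down for the lower bound and `floor(1 + max/tick)` for the upper bound, which reproduces the 0 and 250 of the example and also copes with negative data:

```java
// Hypothetical helper, not part of the original answer.
final class AxisBounds {
    // Largest multiple of tick that is <= dataMin (works for negative values too).
    static double lowerBound(double dataMin, double tick) {
        return tick * Math.floor(dataMin / tick);
    }

    // Smallest multiple of tick that is strictly greater than dataMax,
    // so the top label always extends just beyond the highest value.
    static double upperBound(double dataMax, double tick) {
        return tick * Math.floor(1 + dataMax / tick);
    }
}
```

With a tick of 25, `AxisBounds.lowerBound(15, 25)` and `AxisBounds.upperBound(234, 25)` give the 0 and 250 from the walkthrough. Plain `round` would not be safe here, since 15/25 = 0.6 rounds up to 1.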

    You can get the nice tick range with the following steps:

    1. divide by 10^x such that the result lies between 0.1 and 1.0 (including 0.1, excluding 1.0).
    2. translate accordingly:
      • 0.1 -> 0.1
      • <= 0.2 -> 0.2
      • <= 0.25 -> 0.25
      • <= 0.3 -> 0.3
      • <= 0.4 -> 0.4
      • <= 0.5 -> 0.5
      • <= 0.6 -> 0.6
      • <= 0.7 -> 0.7
      • <= 0.75 -> 0.75
      • <= 0.8 -> 0.8
      • <= 0.9 -> 0.9
      • <= 1.0 -> 1.0
    3. multiply by 10^x.

    In this case, 21.9 is divided by 10^2 to get 0.219. This is <= 0.25 so we now have 0.25. Multiplied by 10^2 this gives 25.
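The lookup-table rounding described above might look something like the following in Java. This is only a sketch: the class name, method name, and the error handling for non-positive input are my own assumptions, and the table values are copied from the list above.

```java
// Hypothetical helper illustrating the lookup-table approach.
final class NiceTick {
    // Candidate values from the table, in ascending order.
    private static final double[] STEPS =
        {0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9, 1.0};

    static double niceTick(double rawTick) {
        if (!(rawTick > 0) || Double.isInfinite(rawTick)) {
            throw new IllegalArgumentException("tick size must be positive and finite");
        }
        // Step 1: choose x so that rawTick / 10^x lies in [0.1, 1.0).
        int x = (int) Math.floor(Math.log10(rawTick)) + 1;
        double scale = Math.pow(10, x);
        double mantissa = rawTick / scale;
        // Step 2: take the first table entry that is >= the mantissa.
        for (double step : STEPS) {
            if (mantissa <= step) {
                // Step 3: multiply back by 10^x.
                return step * scale;
            }
        }
        // Only reachable through floating-point rounding right at a power of ten.
        return scale;
    }
}
```

For the two examples in this answer, `NiceTick.niceTick(21.9)` lands on 25 and `NiceTick.niceTick(27.375)` on 30 (up to floating-point rounding). Trimming the table, say to 0.1, 0.2, 0.25, 0.5 and 1.0, is an easy way to get more conservative steps.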

    Let's take a look at the same example with 8 ticks:

    15, 234, 140, 65, 90 with 8 ticks
    
    1. lower bound = 15
    2. upper bound = 234
    3. range = 234-15 = 219
    4. tick range = 27.375
      1. Divide by 10^2 for 0.27375, translates to 0.3, which gives (multiplied by 10^2) 30.
    5. new lower bound = 30 * floor(15/30) = 0
    6. new upper bound = 30 * floor(1+234/30) = 240

    Which gives the result you requested ;-).

    ------ Added by KD ------

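    Here's code that implements the rounding step without using lookup tables. Note that it rounds up to a single leading digit (21.9 becomes 30), so it never produces steps like 25 or 75: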

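    // Returns a "nice" tick size for the given data range and tick count.
    static double roundedTickRange(double range, int tickCount) {
        // tickCount includes the bottom tick, so there are tickCount - 1 segments.
        double unroundedTickSize = range / (tickCount - 1);
        // Exponent x such that unroundedTickSize / 10^x lies in (1, 10].
        double x = Math.ceil(Math.log10(unroundedTickSize) - 1);
        double pow10x = Math.pow(10, x);
        // Round up to the next whole multiple of 10^x.
        return Math.ceil(unroundedTickSize / pow10x) * pow10x;
    }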
    

    Generally speaking, the number of ticks includes the bottom tick, so the actual y-axis segments are one less than the number of ticks.
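Putting the pieces together, a rough end-to-end sketch that turns a data range and a tick count into the list of labels could look like this. It reuses the log10-based rounding from the snippet above and counts segments as ticks minus one. The class name, the handling of the lower bound = upper bound case (falling back to the magnitude of the value, or 1), and the bound snapping are my own assumptions, not something the original answer specifies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical end-to-end helper, not part of the original answer.
final class AxisLabels {
    static List<Double> labels(double dataMin, double dataMax, int tickCount) {
        if (tickCount < 2) {
            throw new IllegalArgumentException("need at least 2 ticks");
        }
        double range = dataMax - dataMin;
        if (range <= 0) {
            // Special case lower bound == upper bound: pick an arbitrary non-zero range.
            range = Math.max(Math.abs(dataMax), 1.0);
        }
        // The tick count includes the bottom tick, so divide by tickCount - 1.
        double unrounded = range / (tickCount - 1);
        double x = Math.ceil(Math.log10(unrounded) - 1);
        double pow10x = Math.pow(10, x);
        double tick = Math.ceil(unrounded / pow10x) * pow10x;

        // Snap the bounds outward to multiples of the tick size.
        long first = (long) Math.floor(dataMin / tick);
        long last = (long) Math.floor(1 + dataMax / tick);

        // Multiply by the index instead of adding repeatedly to avoid drift.
        List<Double> out = new ArrayList<>();
        for (long i = first; i <= last; i++) {
            out.add(i * tick);
        }
        return out;
    }
}
```

Because the tick size is rounded up, the number of labels that comes back can be smaller than the number requested: `AxisLabels.labels(15, 234, 11)` ends up with a step of 30 and labels running from 0 to 240. With a negative minimum such as -15, the first label drops to -30.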