Search code examples
excelapache-poixlsx

Excel Defined names and cell evaluation order causes unexpected behavior


I generate an excell sheet using Apache POI including 3 defined names used as the data for a graph.

This all works very nice except in a wierd edge case.

I have a column Categories where the user can enter text to categorize related data. In my example there are 5 rows and if the user inputs two different categories, no more and no less on the 5 rows then this anomaly happens for a few combinations but not all otherwise everything works as expected.

The Name NumberOfCat counts the number of different categories in the categories column (starting at F13 and ending at F17).

NumberOfCat = SUMPRODUCT(1/COUNTIF(Data!$F$13:$F$17;Data!$F$13:$F$17))

Then there is the name CategoryNames that should list all categories H13 - H17 contain working formulas so as to list one of the categores on every row or "" of there are no more unique categories.

CategoryNames = OFFSET(Data!$H$13;0;0;NumberOfCat;1)

If I enter the categories enter image description here everything works. If I change it to enter image description here It does not.

Now for some reason entering CategoryNames into a cell evaluates to only "Kat1" instead of "Kat1, Kat2". This while entering NumberOfCat into a cell still evaluates to 2 as it should.

Entering the CategoryNames formula into a cell and replacing NumberOfCat with 2 generates the correct result.

I suspect some very weird execution order is going on but have no idea how to proceed. I'm using Microsoft® Excel® for Microsoft 365 MSO (Version 2404 Build 16.0.17531.20152) 64-bit and the error reproduces on Excel for the web. If you want to have a look at the example file I can provide it. If any more context is needed please ask.


Solution

  • This seems to be a floating point problem in Excel. Especially the applying of the Excel rule, to take only 15 significant digits in floating point.

    =SUMPRODUCT(1/COUNTIF(Data!$F$13:$F$17;Data!$F$13:$F$17)) evaluates to exact 2.0 when the first three of the five cells in Data!$F$13:$F$17 contain the one equal value and the next two cells contain the other equal value. That is 1/3 + 1/3 + 1/3 + 1/2 + 1/2, which is 0.3333333333333333 + 0.3333333333333333 + 0.3333333333333333 + 0.5 + 0.5, which is 0.9999999999999999 + 1.0. Here Excel seems to apply it's rule, to take only 15 significant digits in floating point, to the sum of the first three summands first and thus stores the rounded value 1.0 + 1.0 = 2.0.

    Same formula evaluates to 1.9999999999999998 when the first two of the five cells in Data!$F$13:$F$17 contain the one equal value and the next three cells contain the other equal value. That is 1/2 + 1/2 + 1/3 + 1/3 + 1/3, which is 0.5 + 0.5 + 0.3333333333333333 + 0.3333333333333333 + 0.3333333333333333, which is 1.0 + 0.9999999999999999. Here Excel seems to apply it's rule, to take only 15 significant digits in floating point, only after the entire calculation. Thus it stores not rounded value 1.9999999999999998.

    When the result of NumberOfCat is not exactly 2, the spilling behavior of =OFFSET(Data!$H$13;0;0;NumberOfCat;1) appears to be different from the spilling behavior when the result of NumberOfCat is exactly 2.

    For me NumberOfCat being =ROUND(SUMPRODUCT(1/COUNTIF(Data!$F$13:$F$17;Data!$F$13:$F$17));0) works in combiantion with CategoryNames being =OFFSET(Data!$H$13;0;0;NumberOfCat;1).

    But as you are using Excel 365, there is UNIQUE function to get unique values from a cell range. NumberOfCat being =COUNTA(UNIQUE(Data!$F$13:$F$17)) will also work.