Search code examples
excelexcel-2007

Excel consolidate yields incorrect list of unique values


I have two columns that look like this:

Make/Model      Parc
Abarth*500      15
Abarth*500      19
Abarth*500      2
Abarth*500      14
Abarth*500      15
Abarth*500      25
Abarth*500      20
Abarth*500      17
Abarth*500C     12
Abarth*500C     16
Abarth*500C     23
Abarth*500C     18
Abarth*500C     1
Alfa Romeo*145  405
Alfa Romeo*145  431
Alfa Romeo*146  36
Alfa Romeo*146  80
Alfa Romeo*146  121
Alfa Romeo*146  72
Alfa Romeo*146  39

etc...

The sum of the values in the "Parc" column is 10,475,387.

I perform a consolidate on this table to remove the repetitions in "Make/Model" and sum the values in the "Parc" column for each "Make/Model."

enter image description here

Once I perform the consolidation, everything looks as expected:

Make/Model      Parc
Abarth*500      127
Abarth*500C     70
Alfa Romeo*145  836
Alfa Romeo*146  348
Alfa Romeo*147  3848
Alfa Romeo*155  29
Alfa Romeo*156  3148
Alfa Romeo*159  573
Alfa Romeo*164  111

etc...

So what's the problem? The new sum of the "Parc" column is 10,554,076! Verification against an alternate method indicates that the column of unique labels is incorrect in seemingly random places. I also do not understand how the total sum could have gone up. I have over 18,000 rows in this dataset so individually checking all of the data would be very time consuming.

You can find the full dataset here: Excel File

What have I tried?

I have tried removing all formatting. Also, these numbers are all integers and the option to "Automatically insert a decimal point" is not checked. I have also tried creating a new column of =INT() values to ensure they are all integers. My results are exactly the same in all cases.

I posted this question yesterday and it was put on hold because it could not be reproduced without the full dataset: Excel consolidate sum yields different column total

EDIT: As Mauricio Noris Freire has shown, you can create a unique item column using advanced filter, and then use SUMIF to solve this problem correctly. I still would like to know why the Excel data tool consolidate does not work as expected.


Solution

  • Your observations are correct. Both Consolidate and SUMIFS are being affected by the asterisks in your data. If you remove the asterisks, or, as I did, replace them with a pipe | character, your totals will match.

    You could also get a correct report by using a Pivot Table

    Examine, for example the output for the Mercedes*190

    In the raw data, we show these lines if we filter for that value:

    enter image description here

    In the Consolidation report, we see that the total for Mercedes*190 will include both that car, and also the Mercedes*A190. In other words the A190 is being counted twice, because of the wild card issue.

    enter image description here

    I suspect this is some kind of design flaw.