I have an array with states/regions in the UK. Some regions occur more than once in this list, so I have performed a COUNTIF
to determine the number of times that each one occur.
Now I need to run a QUERY
to list top 5 regions.
Generally, most occurrences are for the London area.
The issue is that in the regions there are 2 states that refer to the Greater London area - London and Greater London.
These two I need to merge and sum their values. There needs to be only one region - Greater London, and its value needs to hold the sum of London and Greater London.
This is the dataset I have:
+----------------+-------+
| State/Province | count |
+----------------+-------+
| Hampshire | 1 |
+----------------+-------+
| Kent | 2 |
+----------------+-------+
| West Lothian | 3 |
+----------------+-------+
| London | 4 |
+----------------+-------+
| Greater London | 5 |
+----------------+-------+
| Cheshire | 6 |
+----------------+-------+
I have managed to put together this QUERY
so far:
=QUERY(A1:B,"select A, max(B) group by A order by max(B) desc limit 5 label max(B) 'Number of occurrences'",1)
That gives me this output:
+----------------+-----------------------+
| State/Province | Number of occurrences |
+----------------+-----------------------+
| Cheshire | 6 |
+----------------+-----------------------+
| Greater London | 5 |
+----------------+-----------------------+
| London | 4 |
+----------------+-----------------------+
| West Lothian | 3 |
+----------------+-----------------------+
| Kent | 2 |
+----------------+-----------------------+
What I need is the Greater London and London entries to be merged under the name Greater London and their numbers of occurrences to be summed, providing this result:
+----------------+-----------------------+
| State/Province | Number of occurrences |
+----------------+-----------------------+
| Greater London | 9 |
+----------------+-----------------------+
| Cheshire | 6 |
+----------------+-----------------------+
| West Lothian | 3 |
+----------------+-----------------------+
| Kent | 2 |
+----------------+-----------------------+
| Hampshire | 1 |
+----------------+-----------------------+
Apologies for not sharing a sheet, but I have security restrictions that are not allowing me to share any link to sheet outside the firm.
=QUERY(QUERY(ARRAYFORMULA(
{SUBSTITUTE(IF(A1:A="London","♥",A1:A),"♥","Greater London"),B1:B}),
"select Col1, sum(Col2)
where Col1 is not null
group by Col1"),
"select Col1, max(Col2)
group by Col1
order by max(Col2) desc
limit 5
label max(Col2)'Number of occurrences'",1)