I have the following regional economic data containing estimated output by sector up until 2100, for a variety of growth scenarios.
I want to sum the value of each sector to get a regional output estimate. This requires one to aggregate the output column repeatedly every 46 cells, placing the result in another column (preferably with other information attached too, such as year/scenario ID etc).
As there are >1,000,000 cells it's not a job that can be done by hand. What is the easiest way to sum the regional output across all sectors, by scenario, year and region?
Here's some example data:
scenario year region_id region_name sector_id sector_name value/output
1A_1C_1C 2000 1 London 2 Mining & quarrying 642.2357178
1A_1C_1C 2000 1 London 3 Food, drink & tobacco 1979.741211
1A_1C_1C 2000 1 London 4 Textiles etc 586.706604
1A_1C_1C 2000 1 London 5 Wood & paper 318.1535339
1A_1C_1C 2000 1 London 6 Printing & recording 1749.062012
...
1A_1C_1C 2000 2 South East 1 Agriculture etc 771.9697876
1A_1C_1C 2000 2 South East 2 Mining & quarrying 542.2153931
1A_1C_1C 2000 2 South East 3 Food, drink & tob. 1990.234009
1A_1C_1C 2000 2 South East 4 Textiles etc 205.3581085
EDIT: Can the results please be reported by taking the sum of the output for each region from column 7, and placing them in column 8. Ideally column 9 would then include the specific region name and column 10 would include the specific scenario variant, like this:
Output Region Scenario Year
XXXXXX London 1A_1C_1C 2000
XXXXXX South East 1A_1C_1C 2000
...
XXXXXX London 1B_1C_1A 2000
XXXXXX South East 1B_1C_1A 2000
...
EDIT: Add 'year' to output example
I worked out how to do the numerical calculation in Excel. Here's the formula:
=SUM(OFFSET($E$2,(ROW()-2)*46,0,46,1))
This sums all values in E2 downwards, every 46 rows using the OFFSET function. The trick is in row() which returns the row number, takes 2 from this number (as I started on C2), and then multiplies this by 46, to shift the SUM function down the column.