Brightway2 - Unlinked and missing cfs when importing Simapro LCIA methods


I am importing the LCIA methods of the Swiss UVEK building database into Brightway with SimaProLCIACSVImporter().

Code:

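from bw2data import Database
from bw2io import SimaProLCIACSVImporter

# Import the UVEK LCIA methods and try to link their CFs to biosphere3
lcia = SimaProLCIACSVImporter(
    "C:\\Users\\...\\UVEK_Simapro_LCIA_2022.CSV",
    biosphere="biosphere3"
)
lcia.apply_strategies()
lcia.statistics()
print("size biosphere3: {0}".format(str(len(Database("biosphere3")))))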

Results:

Extracted 34 methods in 0.91 seconds
Applying strategy: normalize_units
Applying strategy: set_biosphere_type
Applying strategy: normalize_simapro_biosphere_categories
Applying strategy: normalize_simapro_biosphere_names
Applying strategy: set_biosphere_type
Applying strategy: drop_unspecified_subcategories
Applying strategy: normalize_biosphere_categories
Applying strategy: normalize_biosphere_names
Applying strategy: link_iterable_by_fields
Applying strategy: match_subcategories
Applied 10 strategies in 0.87 seconds
34 methods
18229 cfs
14312 unlinked cfs

size biosphere3: 4427

I then use add_missing_cfs() to add the missing flows to the biosphere3 database, so that the LCI datasets built on those flows can be imported easily afterwards.

Code:

lcia.add_missing_cfs()
lcia.statistics()
print("size biosphere3: {0}".format(str(len(Database("biosphere3")))))

Results:

Vacuuming database 
Writing activities to SQLite3 database:
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01
Title: Writing activities to SQLite3 database:
  Started: 06/13/2022 12:23:24
  Finished: 06/13/2022 12:23:25
  Total time elapsed: 00:00:01
  CPU %: 79.20
  Memory %: 1.85
Added 7156 new biosphere flows
34 methods
18229 cfs
14312 unlinked cfs

size biosphere3: 11583

The results show that the number of unlinked cfs is unchanged (~14000). New flows have been added to the database (~7000), but that number does not equal the number of unlinked cfs. Maybe I misunderstood unlinked flows and missing cfs...

Questions:

What is the relation between biosphere flows, unlinked cfs, and the missing cfs that have been added to the biosphere db?

What is the best way to "complete" the biosphere3 db with the missing flows defined in the imported LCIA methods, so that all the cfs are linked?


Solution

  • An excellent question, but unfortunately not one with an easy answer. This is something I am looking into, but it will take some time, as it needs to be done correctly.

    You probably already know this, but in case you don't - what is matching? We need to link the text attributes which identify a product, flow, or activity, with an object in our relational database. In theory, these attributes should match, and our job is easy. It becomes harder when people use inconsistent or incorrect attributes for what are supposed to be the same objects.
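
    To make the idea of matching concrete, here is a minimal sketch in plain Python. It does not use Brightway itself: the flow names, the codes, and the helpers cf, key, link and stats are all made up for illustration, and real importers compare more fields and apply normalization first. It links each characterization factor (cf) to a biosphere flow when name, categories and unit agree exactly, then counts the unlinked cfs. It also shows that the same unmatched flow is counted once per method that uses it, which may be one reason why the number of unlinked cfs can be larger than the number of distinct missing flows.

```python
# Hypothetical, simplified data: real Brightway flows carry more fields.
# Keys are (name, categories, unit); values are (database, code) pairs.
biosphere = {
    ("Carbon dioxide, fossil", ("air",), "kilogram"): ("biosphere3", "co2"),
    ("Methane, fossil", ("air",), "kilogram"): ("biosphere3", "ch4"),
}


def cf(name, amount):
    """Build one illustrative characterization factor record."""
    return {"name": name, "categories": ("air",), "unit": "kilogram", "amount": amount}


# Two made-up methods. "Methane fossil" lacks the comma, so it will not match.
methods = {
    "method A": [cf("Carbon dioxide, fossil", 1.0), cf("Methane fossil", 25.0)],
    "method B": [
        cf("Carbon dioxide, fossil", 1.0),
        cf("Methane fossil", 28.0),
        cf("Made-up flow X", 3.0),
    ],
}


def key(record):
    """Text attributes used to identify a flow in this sketch."""
    return (record["name"], tuple(record["categories"]), record["unit"])


def link(methods, biosphere):
    """Set 'input' on every cf whose attributes match a biosphere flow exactly."""
    for cfs in methods.values():
        for record in cfs:
            target = biosphere.get(key(record))
            if target is not None:
                record["input"] = target


def stats(methods):
    """Return (total cfs, unlinked cfs, distinct unlinked flows)."""
    all_cfs = [record for cfs in methods.values() for record in cfs]
    unlinked = [record for record in all_cfs if "input" not in record]
    distinct = {key(record) for record in unlinked}
    return len(all_cfs), len(unlinked), len(distinct)


link(methods, biosphere)
total, unlinked, distinct = stats(methods)
print(f"{total} cfs, {unlinked} unlinked cfs, {distinct} distinct unlinked flows")
# prints: 5 cfs, 3 unlinked cfs, 2 distinct unlinked flows
```

    In this toy case a single spelling difference leaves two cfs unlinked, one in each method, even though only one flow name is actually at fault.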

    Different players in the LCA world do their best to make their data and software easy to use, but sometimes this means that the different players make changes to things like names, location identifiers, etc. Moreover, there are different starting lists of names.

    The default data in Brightway (what gets installed when you call bw2io.bw2setup()) is from ecoinvent version 3.8. This isn't "correct", it is just a default. The biosphere3 database therefore uses the flow list of ecoinvent version 3, whereas UVEK is based on ecoinvent version 2.

    The UVEK database is self-contained and internally consistent, and its LCIA method characterization factors should match the flow names of the UVEK database itself (at least as long as they come from the same source, e.g. SimaPro CSV export). So the best way to use this LCI/LCIA in Brightway would be to use these data in their own set of Brightway databases.

    There will be a project to natively implement UVEK and its LCIA factors in Brightway, but this will only happen by the end of July (at the earliest).