Tags: python, api, sentinel, sentinelsat

InvalidChecksumError with the Python sentinelsat API, using GeoPandas and GeoJSON


I'm trying to find a way to automatically download satellite imagery to build a database. I started with the sentinelsat API, and there isn't much written about it because it is so specialized. My steps were: I used https://geojson.io to draw a polygon and download it as a GeoJSON file. I loaded that file into a GeoDataFrame (geopandas), which in hindsight served no real purpose. I then passed the polygon to the query (api.query), got the products, looped through them, checked whether each one was online, and tried to download it. For every online product I get the following error:

0  POLYGON ((-53.27854 -24.97081, -53.30223 -24.9...
Querying products: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 362/362 [00:06<00:00, 42.74product/s]
                                                                                  title  ...                                           geometry
f8184c3e-6760-461c-af2b-5cb3183f864d  S2A_MSIL2A_20211126T134211_N0301_R124_T22JBT_2...  ...  MULTIPOLYGON (((-52.89090 -25.39290, -52.87585...
f2a70ef8-b6ba-4612-8f3a-b4ec4d60ebc3  S2A_MSIL2A_20211126T134211_N0301_R124_T21JZN_2...  ...  MULTIPOLYGON (((-54.01898 -25.37484, -52.92975...
47d725db-7753-4407-9d67-90ec12053fac  S2B_MSIL2A_20211121T134209_N0301_R124_T21JZN_2...  ...  MULTIPOLYGON (((-54.01898 -25.37484, -52.92975...
b3a96909-c276-498e-ad77-4fc89675fa19  S2B_MSIL2A_20211121T134209_N0301_R124_T22JBT_2...  ...  MULTIPOLYGON (((-52.89090 -25.39290, -52.87585...
6ddcee95-edc5-4a33-b5a3-820dbbee3431  S2A_MSIL2A_20211116T134211_N0301_R124_T22JBT_2...  ...  MULTIPOLYGON (((-52.89090 -25.39290, -52.87585...
...                                                                                 ...  ...                                                ...
b5b00501-8c99-4a89-87c1-dc99c421cbd7  S2B_MSIL2A_20190615T134219_N0212_R124_T21JZN_2...  ...  MULTIPOLYGON (((-54.01898 -25.37484, -52.92975...
1cb316a6-025a-40e6-a051-6b3eb13a91d4  S2A_MSIL2A_20190610T134211_N0212_R124_T22JBT_2...  ...  MULTIPOLYGON (((-52.89090 -25.39290, -52.87585...
3b6181b8-459c-4a04-aa9e-ad8e1329a5e3  S2A_MSIL2A_20190610T134211_N0212_R124_T21JZN_2...  ...  MULTIPOLYGON (((-54.01898 -25.37484, -52.92975...
f8dddff3-e1da-4d7c-ae85-aa8aced8e337  S2B_MSIL2A_20190605T134219_N0212_R124_T21JZN_2...  ...  MULTIPOLYGON (((-54.01898 -25.37484, -52.92975...
d75470e8-bd80-42d0-9129-41db94aa292f  S2B_MSIL2A_20190605T134219_N0212_R124_T22JBT_2...  ...  MULTIPOLYGON (((-52.89090 -25.39290, -52.87585...

[362 rows x 41 columns]
Product ' 0 '
Product ' 1 '
Product: f2a70ef8-b6ba-4612-8f3a-b4ec4d60ebc3  is online.
Downloading S2A_MSIL2A_20211126T134211_N0301_R124_T21JZN_20211126T160417.zip:   0%|                                                                                                                 | 0.00/1.19G [00:00<?, ?B/s]
Traceback (most recent call last):
  File "c:\Users\phzoz\PythonProjects\Sentinel\main.py", line 32, in <module>
    api.download(product, directory_path="Data", checksum=True)
  File "C:\Users\phzoz\anaconda3\envs\sentinel\lib\site-packages\sentinelsat\sentinel.py", line 590, in download
    return downloader.download(id, directory_path)
  File "C:\Users\phzoz\anaconda3\envs\sentinel\lib\site-packages\sentinelsat\download.py", line 150, in download
    self._download_common(product_info, path, stop_event)
  File "C:\Users\phzoz\anaconda3\envs\sentinel\lib\site-packages\sentinelsat\download.py", line 229, in _download_common
    raise InvalidChecksumError("File corrupt: checksums do not match")
sentinelsat.exceptions.InvalidChecksumError: File corrupt: checksums do not match

Here's the code summary:

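import geopandas
from sentinelsat import SentinelAPI

# user and password hold the hub credentials (defined elsewhere)
api = SentinelAPI(user, password, "https://scihub.copernicus.eu/dhus")

# GeoJSON file exported from geojson.io
gjPath = "Data/files_geojson/map.geojson"

gdf = geopandas.read_file(gjPath)

print(gdf)

# Keep the last geometry in the file as the search footprint
footprint = None
for i in gdf["geometry"]:
    footprint = i
# Alternative, shorter date range: ("20190601", "20190626")
products = api.query(footprint, date=("20190601", "20211201"), platformname="Sentinel-2", processinglevel="Level-2A")

# Tabular view of the query results
productsGDF = api.to_geodataframe(products)

print(productsGDF)

# Iterating over products yields the product ids
for i, product in enumerate(products):
    print("Product '", i, "'")
    if api.is_online(product):
        print("Product:", str(product), " is online.")
        api.download(product, directory_path="Data", checksum=True)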

I don't understand whether I need to prompt the API to make a request first so that the product really becomes available, but the documentation seems to say you can just download it. I also don't really know what checksums are; apparently they are some sort of transfer verification.
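For context on the checksum part: a checksum is a short digest computed from a file's bytes, and a download is treated as corrupt when the digest of the local file differs from the one the server publishes. The sketch below illustrates the general idea with the standard library only. It is not sentinelsat's internal code; the use of MD5, the helper names md5_of_file and checksum_matches, and the demo file contents are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=2**20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's MD5 with the expected digest, ignoring letter case."""
    return md5_of_file(path) == expected_hex.lower()


# Hypothetical payload standing in for a downloaded product archive.
payload = b"pretend this is a Sentinel-2 zip"
expected = hashlib.md5(payload).hexdigest()

# Write the full payload to a throwaway file and verify it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name
intact = checksum_matches(demo_path, expected)

# Simulate an interrupted transfer by keeping only part of the payload.
with open(demo_path, "wb") as f:
    f.write(payload[:10])
truncated = checksum_matches(demo_path, expected)

os.remove(demo_path)
print(intact, truncated)  # True False
```

A mismatch like the second case is the kind of situation that would lead to an error such as InvalidChecksumError: the file on disk is not byte-for-byte what the server says it should be, which is consistent with an interrupted or unstable connection.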


Solution

  • The problem appears to be solved. I came back to it some days later and used the GeoJSON file differently (without the GeoDataFrame step), and downloads with the checksum enabled now complete normally. I did have to wait several days after making that change, though, because at first the downloads still ran at slow speeds, only now with the checksum enabled. I suppose it was a combination of the way I was using the data and the API, plus some connection problems at the time.
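As a rough illustration of skipping the GeoDataFrame step, the sketch below turns a GeoJSON polygon straight into a WKT string, which is the form of footprint that api.query accepts. sentinelsat also documents the helpers read_geojson and geojson_to_wkt for this purpose, and those are probably the simpler route in practice. The functions polygon_to_wkt and first_footprint here are hypothetical, use only the standard library, and handle plain Polygon features only; the sample coordinates are made up.

```python
import json


def polygon_to_wkt(geometry):
    """Convert a GeoJSON Polygon geometry (a dict) to a WKT string."""
    if geometry["type"] != "Polygon":
        raise ValueError("only Polygon geometries are handled in this sketch")
    rings = []
    for ring in geometry["coordinates"]:
        # GeoJSON positions are [lon, lat] with an optional elevation, which is dropped.
        points = ", ".join(f"{lon} {lat}" for lon, lat, *_ in ring)
        rings.append(f"({points})")
    return f"POLYGON ({', '.join(rings)})"


def first_footprint(geojson_text):
    """Return the WKT of the first feature in a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    return polygon_to_wkt(collection["features"][0]["geometry"])


# Made-up FeatureCollection in the shape that geojson.io exports.
sample = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-53.3, -25.0], [-53.2, -25.0], [-53.2, -24.9],
                         [-53.3, -24.9], [-53.3, -25.0]]]
      }
    }
  ]
}
"""

footprint = first_footprint(sample)
print(footprint)
```

The resulting string could then be passed as the first argument to api.query in place of the geometry taken from the GeoDataFrame.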