GDAL: reading rasters from a private Google Cloud Storage bucket


I want to read raster datasets from a private Google Cloud Storage bucket through their gs:// URIs using GDAL. According to the VSI documentation, one way of authenticating is to set GS_SECRET_ACCESS_KEY and GS_ACCESS_KEY_ID.

I have these credentials for a service account that has access to the target bucket, but when setting them as environment variables I get a 403 response.

Furthermore, trying to access a public dataset like

gdalinfo /vsigs/gcp-public-data-sentinel-2/tiles/32/V/LQ/S2B_MSIL1C_20181108T111249_N0207_R137_T32VLQ_20181108T113514.SAFE/MTD_MSIL1C.xml

also returns a 403 error with the credentials set up like this.

When going through the motions with gsutil config -a, I'm required to specify the project-id, which I don't have for these credentials.

I've tried a different set of credentials for which I do know the project-id, and with those everything suddenly works.

The only relevant discussions I've found ([1], [2]) don't mention anything about the project-id.

So, my questions are

  • with a credential setup like this, is specifying the corresponding project-id mandatory?

  • is there a way to set these credentials in env and not in .boto?


Solution

  • Looks like there was a mix-up with the credentials. Without (valid) credentials, reading public datasets is also not possible.

    One way of authenticating is using a valid GS_ACCESS_KEY_ID + GS_SECRET_ACCESS_KEY combination for a service account with sufficient permissions.

    These can be set as environment variables or stored in the .boto file.

    If stored in the .boto file, this can be done manually

    [Credentials]
    gs_access_key_id=<YOURKEY>
    gs_secret_access_key=<YOURSECRET>
    

    or by running gsutil config -a, in which case you will be asked for a project-id. At least for this purpose, it doesn't seem to matter whether this id is correct, as long as something is entered.
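
For the environment-variable route, here is a minimal sketch in Python that runs gdalinfo with the two variables set only for that one call. It assumes the gdalinfo command-line tool is on your PATH. The helper names (gdal_env_with_hmac, to_vsigs_path, run_gdalinfo) are made up for this example and are not part of GDAL.

```python
import os
import subprocess


def gdal_env_with_hmac(access_key_id, secret_access_key, base_env=None):
    """Return a copy of the environment with the two GS_* variables set.

    Hypothetical helper: it only builds a dict and does not touch os.environ.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GS_ACCESS_KEY_ID"] = access_key_id
    env["GS_SECRET_ACCESS_KEY"] = secret_access_key
    return env


def to_vsigs_path(uri):
    """Map gs://bucket/key to the /vsigs/bucket/key form used with gdalinfo."""
    prefix = "gs://"
    if uri.startswith(prefix):
        return "/vsigs/" + uri[len(prefix):]
    return uri  # already a /vsigs/ path (or something else): leave unchanged


def run_gdalinfo(uri, access_key_id, secret_access_key):
    """Run gdalinfo on the dataset and return (exit code, combined output)."""
    env = gdal_env_with_hmac(access_key_id, secret_access_key)
    result = subprocess.run(
        ["gdalinfo", to_vsigs_path(uri)],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

Calling run_gdalinfo("gs://<YOURBUCKET>/<PATH>", "<YOURKEY>", "<YOURSECRET>") with a valid key pair should print the dataset description. With a mixed-up pair, as in the question, expect the 403 to show up in the returned output instead.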