
Iterating over GEE ImageCollection Metadata


I need to iterate over a Google Earth Engine (GEE) ImageCollection, but I do not need or want to download the images. I only need each image's coordinates, which are stored in the image metadata.

I have not found a way to loop through the images in an ImageCollection: indexing it fails with an error saying the object is not subscriptable. Most of the online resources are about downloading the images. I need to iterate over hundreds of images, so downloading them first seems very inefficient when I do not need the images themselves. I am new to GEE and am using the Python API.

For example:

import ee
import pandas as pd

# Filter by date, region of interest and cloud cover

collection = ee.ImageCollection("LANDSAT/LC08/C02/T2")\
    .filterDate('2021-03-01','2021-03-31')\
    .filterBounds(roi)\
    .filterMetadata('CLOUD_COVER', 'less_than', 10)\
    .sort("CLOUD_COVER")

# Loop through the files in the image collection and extract the coordinates from the metadata 
# Save them as a pandas dataframe

for files in collection[0:5]:
    coordinates = files.get('system:footprint').getInfo()
    coords_df = pd.DataFrame.from_dict(coordinates)
    coords_only = coords_df['coordinates']
    print(coords_only)

I do not know if this is possible. Thanks.


Solution

  • Try this:

    collection \
        .toList(5) \
        .map(lambda image: ee.Image(image).get('system:footprint')) \
        .getInfo()
    

    toList(5) runs as a server-side operation and turns the first 5 images (or features, if it's a feature collection) into a list. Mapping over that list narrows each element to the one property we want, and getInfo() then retrieves the result to the client.

    You'll get a Python list of GeoJSON geometries, which you can then iterate over with an ordinary Python for loop.
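
Once the list is on the client, the remaining work is plain Python. The sketch below is one possible way to flatten the footprints into one row per vertex. It assumes each element is a GeoJSON-style dict with "type" and "coordinates" keys; the sample values and the helper name footprint_rows are made up for illustration and do not come from a real collection.

```python
# Hypothetical stand-in for the list returned by getInfo(): GeoJSON-style
# geometry dicts. The coordinate values are invented for illustration.
footprints = [
    {"type": "LinearRing",
     "coordinates": [[10.0, 50.0], [11.0, 50.0], [11.0, 51.0], [10.0, 50.0]]},
    {"type": "LinearRing",
     "coordinates": [[12.0, 52.0], [13.0, 52.0], [13.0, 53.0], [12.0, 52.0]]},
]


def footprint_rows(footprints):
    """Flatten footprints into one dict per vertex: image_index, lon, lat."""
    rows = []
    for image_index, geometry in enumerate(footprints):
        coords = geometry["coordinates"]
        # A GeoJSON Polygon nests its rings one level deeper than a
        # LinearRing; in that case use the outer ring only.
        if geometry.get("type") == "Polygon":
            coords = coords[0]
        for lon, lat in coords:
            rows.append({"image_index": image_index, "lon": lon, "lat": lat})
    return rows


rows = footprint_rows(footprints)
print(len(rows))
```

If pandas is available, passing the result to pd.DataFrame(rows) should give a table with image_index, lon and lat columns, which is close to what the question was trying to build.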