I need to iterate over a GEE ImageCollection. However, I do NOT need, or want, to download the images. I only need the image coordinates, which can be found in the image metadata.
I have not found a way of looping through the images in an ImageCollection, as it says it is not subscriptable. A lot of the online resources are about downloading the images. I need to iterate over hundreds of images, so downloading them first seems really inefficient when I don't need the images themselves. I am new to GEE and am using the Python API.
For example:
import ee
import pandas as pd
# Filter by date, region of interest and cloud cover
collection = ee.ImageCollection("LANDSAT/LC08/C02/T2")\
.filterDate('2021-03-01','2021-03-31')\
.filterBounds(roi)\
.filterMetadata('CLOUD_COVER', 'less_than', 10)\
.sort("CLOUD_COVER")
# Loop through the files in the image collection and extract the coordinates from the metadata
# Save them as a pandas dataframe
for files in collection[0:5]:
    coordinates = files.get('system:footprint').getInfo()
    coords_df = pd.DataFrame.from_dict(coordinates)
    coords_only = coords_df['coordinates']
    print(coords_only)
I do not know if this is possible. Thanks.
Try this:
collection \
.toList(5) \
.map(lambda image: ee.Image(image).get('system:footprint')) \
.getInfo()
toList(5) iterates over the collection as a server-side operation and turns the first 5 images (or features, if it's a feature collection) into a list that you can retrieve with getInfo(). Then we narrow to the exact data we want by map()ing the list.
You'll get a Python list of GeoJSON geometries, which you can then iterate over with your Python for loop.
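As a rough sketch of that last step, the list returned by getInfo() can be flattened into one row per vertex. The sample footprints below are hypothetical stand-ins, not real output, and the code assumes each footprint is a dict with a 'type' key and a 'coordinates' list of [lon, lat] pairs (Landsat footprints usually come back as a LinearRing, but check your own output). The helper name footprint_rows is made up for this illustration.

```python
# Hypothetical sample standing in for the list returned by getInfo();
# real footprints come from your collection and will differ.
footprints = [
    {"type": "LinearRing",
     "coordinates": [[10.0, 50.0], [11.0, 50.0], [11.0, 51.0], [10.0, 50.0]]},
    {"type": "LinearRing",
     "coordinates": [[20.0, 40.0], [21.0, 40.0], [21.0, 41.0], [20.0, 40.0]]},
]


def footprint_rows(footprints):
    """Flatten footprint geometries into one dict per vertex."""
    rows = []
    for image_index, geometry in enumerate(footprints):
        coords = geometry["coordinates"]
        # Assumption: a Polygon nests its rings one level deeper than a
        # LinearRing, so take the outer ring in that case.
        if geometry.get("type") == "Polygon":
            coords = coords[0]
        for lon, lat in coords:
            rows.append({"image": image_index, "lon": lon, "lat": lat})
    return rows


rows = footprint_rows(footprints)
for row in rows:
    print(row)
```

Since the question asks for a pandas DataFrame, passing the result to pd.DataFrame(rows) should give one with image, lon and lat columns.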