I have a raster file in WGS84 projection and I am trying to get the coordinates of random pixels within the raster GeoTIFF area down left in picture. At first, I calculate the coordinates of each pixel's centroid (in WGS84 again), then I pick 100 random of them and export them to a csv.
Problem: I expect points to be within the raster area (down left in picture) but they are way off of it. Is it a projection error or coordinates miscalculation? What is wrong in my code?
Here is the code
# Get coordinates for each pixel centroid
geotiff = gdal.Open(path)
gt = geotiff.GetGeoTransform()
column_numbers, row_numbers, band_numbers = geotiff.RasterXSize, geotiff.RasterYSize, geotiff.RasterCount
minx = gt[0]
miny = gt[3] + column_numbers*gt[4] + row_numbers*gt[5]
maxx = gt[0] + column_numbers*gt[1] + row_numbers*gt[2]
maxy = gt[3]
pixelWidth = gt[1]
pixelHeight = -gt[5]
lonPxSz = (maxy - miny) / row_numbers
latPxSz = (maxx - minx) / column_numbers
total = np.array(geotiff.ReadAsArray())
res = []
for i in range(row_numbers):
for j in range(column_numbers):
res.append([[i,j]] + [data[i][j] for data in total])
coords = pd.DataFrame(res, columns=['Pair', 'Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6'])
coords[['Lat', 'Lon']] = pd.DataFrame(coords['Pair'].tolist(), index=coords.index)
coords["Lat"] = (coords["Lat"] + 0.5) * 10 * latPxSz + miny
coords["Lon"] = (coords["Lon"] + 0.5) * 10 * lonPxSz + minx
coords = coords.sample(n = 100)
coords[['Lat', 'Lon']].to_csv("coords.csv", sep=";")
If you only want to pick 100 random points on the image:
from osgeo import gdal
import numpy as np
import pandas as pd
import random
path = "image.tif"
geotiff = gdal.Open(path)
gt = geotiff.GetGeoTransform()
column_numbers, row_numbers, band_numbers = geotiff.RasterXSize, geotiff.RasterYSize, geotiff.RasterCount
minx = gt[0]
miny = gt[3] + column_numbers * gt[4] + row_numbers * gt[5]
maxx = gt[0] + column_numbers * gt[1] + row_numbers * gt[2]
maxy = gt[3]
pixelWidth = gt[1]
pixelHeight = -gt[5]
halfPixelWidth = pixelWidth / 2
halfPixelHeight = pixelHeight / 2
rand_point_x = random.sample([i for i in range(column_numbers)], 100)
rand_point_y = random.sample([i for i in range(row_numbers)], 100)
rand_points = np.vstack((rand_point_y, rand_point_x)).T
coords = pd.DataFrame(rand_points, columns=['Lat', 'Lon'])
coords["Lat"] = miny + (coords["Lat"] * pixelHeight) + halfPixelHeight
coords["Lon"] = minx + (coords["Lon"] * pixelWidth) + halfPixelWidth
coords.to_csv("coords.csv", sep=',')
You may use the coordinates of these random points to retrieve pixel values afterward.