Is it possible to write data line by line using gdal's WriteArray, rather than creating and feeding it an entire array?
I've run into a MemoryError
when creating an numpy array of size (50539,98357). I suppose I could get around this by using PyTables, but I'd rather not complicate things
>>> import numpy
>>> cols = 50539
>>> rows = 98357
>>> a1 = np.zeros((cols,rows))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'np' is not defined
>>> import numpy as np
>>> a1 = np.zeros((cols,rows))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
Update: I ended up using Abudis's solution combined with a sparse matrix in which I saved the points, pulling out each row as a "dense" or standard matrix. Comments welcome.
dataset = driver.Create(filename, cols, rows, number_of_bands, band_type)
band = dataset.GetRasterBand(1)
offset = 1 # i.e. the number of rows to write with each iteration
# values, pixel_x, and pixel_y defined earlier in my script
data = scipy.sparse.csr_matrix((values,(pixel_y,pixel_x)),dtype=float)
# iterate data writing for each row in data sparse array
for i in range(data.shape[0]):
data_row = data[i,:].todense() # output row of sparse array as standard array
band.WriteArray(data_row,0,offset*i)
band.SetNoDataValue(NULL_VALUE)
band.FlushCache()
Not really a gdal
expert here, but looks like this worked. So, the trick is to use yoff
parameter of WriteArray method. This way it is possible to write data to file in chunks. We basically just set the offset while writing next chunk of data.
import numpy as np
import gdal
cols = 50539
rows = 10000
offset = 1000
dst_filename = 'test.tif'
format = 'GTiff'
driver = gdal.GetDriverByName(format)
dst_ds = driver.Create(dst_filename, cols, rows, 1, gdal.GDT_Byte)
for i in range(10):
# generate random integers from 1 to 10
a = np.random.random_integers(1, 10, size=(offset, cols))
# write data to band 1
dst_ds.GetRasterBand(1).WriteArray(a, 0, offset * i)
dst_ds = None
This yields a 50539 by 10000 pixels tiff-file. If you really need 98357 rows, I guess you can just set offset
value to 1 and rows
to 98357. Theoretically it should work, because it worked on smaller array (10 by 20 pixels).
EDIT: Don't forget to change range(10)
to something like range(98357)
or xrange(98357)