I know that, starting with version 2.2, GDAL has virtual filesystem drivers, like /vsizip/ for accessing zip archives.
Previously I've been able to read from zip archives, but I haven't managed to write to them (which should also be possible, I guess). I didn't find much documentation about this either.
This is what I tried:
import gdal
from zipfile import ZipFile
import numpy as np
with ZipFile('test.zip','a') as zfile:
    driver = gdal.GetDriverByName('GTiff')
    raster = driver.Create('/vsizip/test.zip/testraster.tif', 10, 10, 1, gdal.GDT_Float32)
    raster.GetRasterBand(1).WriteArray(np.random.random((10, 10)))
    raster = None
But I'm always receiving
AttributeError: 'NoneType' object has no attribute 'GetRasterBand'
EDIT:
Updated the snippet as per Gabriella Giordano's suggestion.
EDIT 2:
Gabriella pointed out that the GTiff driver needs both read and write access at the same time, which is (currently?) not supported.
The suggested workaround is to create a temporary file and then copy it into the archive.
I'm really interested in creating a raster file (it doesn't have to be a tiff) directly inside the archive. If I wanted to copy files, I might as well use zipfile or the like.
Any ideas?
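For reference, the temporary-file workaround mentioned above could be sketched roughly as follows. This is only a minimal sketch using the standard library: the helper name add_via_tempfile and the write_file callback are hypothetical, and the callback stands in for whatever actually produces the raster (for example a GDAL Create call on the temporary path).

```python
import os
import tempfile
import zipfile


def add_via_tempfile(zip_path, member_name, write_file):
    """Write a file with write_file(path), then store it in the zip archive.

    write_file is a hypothetical callback; with GDAL it would create the
    dataset at the given path and close it before returning.
    """
    suffix = os.path.splitext(member_name)[1]
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_path = os.path.join(tmpdir, "raster" + suffix)
        write_file(tmp_path)
        # Mode "a" creates the archive if it does not exist yet.
        with zipfile.ZipFile(zip_path, "a", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.write(tmp_path, arcname=member_name)
```

With GDAL, write_file would create the dataset at the temporary path and release it (raster = None) so that everything is flushed to disk before the file is copied into the archive.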
EDIT 3:
Full disclosure*, this is what I'm trying to achieve:
I have MODIS HDF files, and I want to save the value raster subdatasets directly into a zip archive.
Solution:
I've finally managed to do it with the excellent insight from Gabriella.
I ended up using gdal.Translate with this passed in as options:
gdal.TranslateOptions(creationOptions=['STREAMABLE_OUTPUT=TRUE'])
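A minimal sketch of how the whole call might look, assuming the GDAL Python bindings are installed. The helper names vsizip_path and translate_into_zip are hypothetical, and src stands for any dataset name GDAL can open (for instance a MODIS subdataset name); this has not been checked against every GDAL version.

```python
def vsizip_path(archive, member):
    """Build the /vsizip/ path for a member inside a zip archive."""
    return "/vsizip/{}/{}".format(archive, member)


def translate_into_zip(src, archive, member):
    """Copy src into the archive as a streamable GeoTIFF (sketch only)."""
    from osgeo import gdal  # imported lazily so the path helper works without GDAL

    options = gdal.TranslateOptions(
        format="GTiff",
        creationOptions=["STREAMABLE_OUTPUT=TRUE"],
    )
    out = gdal.Translate(vsizip_path(archive, member), src, options=options)
    if out is None:
        raise RuntimeError("gdal.Translate failed for " + member)
    out = None  # release the dataset so the output is flushed and closed
```

Something like translate_into_zip(subdataset_name, 'test.zip', 'testraster.tif') would then write the raster straight into the archive, without the surrounding ZipFile context manager from the original attempt.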
*I didn't think this was pertinent information, but Gabriella's answer makes me think it is.
I'm not really fluent in Python, but the dataset creation is probably failing because you are not setting the number of bands.
Try this:
raster = driver.Create('/vsizip/test.zip/testraster.tif',10,10, 1, gdal.GDT_Float32)
Replace 1 with the desired number of bands.
You can find the complete list of arguments of Create here.
EDIT:
Found the possible cause of the issue (and a possible workaround) here. As reported at this link, apparently the open mode of Create is incompatible with the vsizip virtual filesystem.
Edit 2
This is not a problem with the driver you use (e.g. GeoTIFF), but a problem with the virtual file system itself. The Create call tries to open a file in read/write mode because GDAL uses the file system as a cache for performance reasons, not only for storage. In this case, however, the vsizip virtual file system does not allow reading from and writing to a file at the same time, because it does not support random access.
The vsizip VFS does provide on-the-fly reads and writes, but only separately, because it compresses and decompresses the data for you in the background. This means that you cannot jump back and forth (as the random access required by read/write open mode would need), because what you write to disk (compressed data) is different from what you would eventually read back (uncompressed data).
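The same restriction can be observed, by analogy, with Python's own zipfile module. This is not GDAL's implementation, only a small illustration of the idea: a member that is being written into a compressed archive is a write-only, non-seekable stream.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    # Open a member for writing; data is compressed as it is written.
    with zf.open("testraster.bin", "w") as member:
        member.write(b"\x00" * 1024)
        can_seek = member.seekable()  # can we jump back and forth?
        can_read = member.readable()  # can we read back what we wrote?

print("seekable:", can_seek, "readable:", can_read)
```

A format driver that needs to go back and patch earlier parts of the file while writing cannot work on top of such a stream.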
Edit 3
I've found out that the GDAL command-line tools (like gdal_translate or gdalwarp) set 'STREAMABLE_OUTPUT=TRUE' automagically when working with virtual file systems. A streamable file format enforces a write-only policy that should be compatible with file systems that lack random access. You can set this option as an additional parameter for the Create call. Again, my limited Python capabilities prevent me from providing a working snippet.
Please consider, though, that streamable output also implies some limitations on the kinds of operations you can perform on the dataset, so this is probably not going to fix your problem, but it is useful to know anyway...