Tags: python, size, astropy, fits

Python: buffer size error when opening a large FITS data cube


I tried to open a large IDL-generated FITS data cube (159, 2, 4096, 4096):

In [37]: hdulist = fits.open('/randpath/randname1.fits')

In [38]: hdulist.info()
Filename: /randpath/randname1.fits
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU      11   (159, 2, 4096, 4096)   float32   

In [39]: scidata = hdulist[0].data

The following error occurred:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-d492d4e07eb1> in <module>()
----> 1 scidata = hdulist[0].data

/opt/local/anaconda/anaconda-2.2.0/lib/python2.7/site-packages/astropy/utils/decorators.py in __get__(self, obj, owner)
    513             return obj.__dict__[self._key]
    514         except KeyError:
--> 515             val = self.fget(obj)
    516             obj.__dict__[self._key] = val
    517             return val

/opt/local/anaconda/anaconda-2.2.0/lib/python2.7/site-packages/astropy/io/fits/hdu/image.py in data(self)
    206             return
    207 
--> 208         data = self._get_scaled_image_data(self._data_offset, self.shape)
    209         self._update_header_scale_info(data.dtype)
    210 

/opt/local/anaconda/anaconda-2.2.0/lib/python2.7/site-packages/astropy/io/fits/hdu/image.py in _get_scaled_image_data(self, offset, shape)
    619         code = BITPIX2DTYPE[self._orig_bitpix]
    620 
--> 621         raw_data = self._get_raw_data(shape, code, offset)
    622         raw_data.dtype = raw_data.dtype.newbyteorder('>')
    623 

/opt/local/anaconda/anaconda-2.2.0/lib/python2.7/site-packages/astropy/io/fits/hdu/base.py in _get_raw_data(self, shape, code, offset)
    566                               offset=offset)
    567         elif self._file:
--> 568             return self._file.readarray(offset=offset, dtype=code, shape=shape)
    569         else:
    570             return None

/opt/local/anaconda/anaconda-2.2.0/lib/python2.7/site-packages/astropy/io/fits/file.py in readarray(self, size, offset, dtype, shape)
    272 
    273             return np.ndarray(shape=shape, dtype=dtype, offset=offset,
--> 274                               buffer=self._mmap)
    275         else:
    276             count = reduce(lambda x, y: x * y, shape)

TypeError: buffer is too small for requested array

Opening the averaged array (2, 4096, 4096) works fine:

In [40]: hdulist2 = fits.open('/randpath/randname1avg.fits')

In [41]: hdulist2.info()
Filename: /randpath/randname1avg.fits
No.    Name         Type      Cards   Dimensions   Format
0    PRIMARY     PrimaryHDU      10   (2, 4096, 4096)   float32   

In [42]: scidata2 = hdulist2[0].data

Any ideas? It seems that the size matters for some reason. MATLAB cannot open the first FITS file either:

Warning: Seek failed, 'Offset is bad - after end-of-file or last character written.'.   File may be an
invalid FITS file or corrupt.  Output structure may not contain complete file information. 
> In fitsinfo>skipHduData (line 721)
  In fitsinfo (line 226)
  In fitsread (line 99) 
Error using fitsiolib
CFITSIO library error (108): error reading from FITS file

Error in matlab.io.fits.readImg (line 85)
imgdata = fitsiolib('read_subset',fptr,fpixel,lpixel,inc);

Error in fitsread>read_image_hdu (line 438)
    data = fits.readImg(fptr);

Error in fitsread (line 125)
        data = read_image_hdu(info,1,raw,pixelRegion);

IDL can open it, for unclear reasons. Is there a limit to the array size in the astropy.io workflow? Random matrices of the same size can be generated without problems. I'm currently working on a machine with 256 GB of RAM, so memory should not play a role, should it? Thanks for all the help!
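
For what it's worth, astropy.io.fits memory-maps the image data by default, so 256 GB of RAM should indeed not be the limiting factor. A minimal sketch (reusing the path from above, and assuming the file on disk is actually intact) that reads a single plane of the cube without ever materializing the full array:

from astropy.io import fits

# memmap=True is the default: the data are memory-mapped, not read into RAM
with fits.open('/randpath/randname1.fits', memmap=True) as hdulist:
    hdu = hdulist[0]
    # .section reads only the requested slice from disk, so the full
    # (159, 2, 4096, 4096) cube never has to fit in memory at once
    plane = hdu.section[0, 0, :, :]   # one 4096 x 4096 float32 plane (~64 MB)
    print(plane.shape)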

Update: The first time the hdulist is loaded, Python actually gives a more useful error message:

WARNING: File may have been truncated: actual file length (4160755136) is smaller than the expected size (21340624320) [astropy.io.fits.file]

Indeed, the file size is only 3.9 GB, in contrast to the expected ~20 GB. I'll have to double-check (I don't have much experience with IDL), but for some reason it (writefits) cannot create the FITS file properly.
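
For reference, the truncation can also be checked by hand by comparing the data size implied by the header with the size of the file on disk. A rough sketch (the header size and the 2880-byte FITS block padding are ignored, so the comparison is only approximate):

import os
import numpy as np
from astropy.io import fits

filename = '/randpath/randname1.fits'
header = fits.getheader(filename)

# bytes required by the data array according to the header:
# |BITPIX|/8 bytes per pixel times the product of all NAXISn
naxes = [header['NAXIS%d' % i] for i in range(1, header['NAXIS'] + 1)]
expected_data_bytes = abs(header['BITPIX']) // 8 * np.prod(naxes, dtype=np.int64)

actual_bytes = os.path.getsize(filename)
print('expected (data only): %d bytes' % expected_data_bytes)
print('actual file size:     %d bytes' % actual_bytes)
# for the cube above this reports roughly 21 GB expected vs about 4 GB on disk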

Update 2: Problem solved. IDL 6.2 (an old version installed on the machine) apparently cannot deal with such large files, whereas IDL 8.3 (which is also installed) can. I have no idea why, though.


Solution

  • The problem is related to IDL 6.2 and no longer appears in IDL 8.3. It can thus be avoided by using a current IDL version to generate the FITS file (a quick sanity check is sketched below).
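
After regenerating the file with a current IDL version, a quick sanity check (reusing the path from the question) is to reopen it with astropy and confirm that no truncation warning appears and that the data actually load:

from astropy.io import fits

with fits.open('/randpath/randname1.fits') as hdulist:
    hdulist.info()                # should list (159, 2, 4096, 4096) float32
    scidata = hdulist[0].data     # no "buffer is too small for requested array" error
    print(scidata.shape)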