I'm trying to read a CSV file. The issue is that it is too large, and I have had to use an error handler. Within the error handler, I have to call csv.field_size_limit(), which does not work even by itself: I keep receiving a 'limit must be an integer' error. From further research, I have found that this is probably an install error. I've installed all third-party tools using the Package Manager, so I am not sure what could be going wrong. Any ideas about how to correct this issue?
import sys
import csv

maxInt = sys.maxsize
decrement = True

while decrement:
    decrement = False
    try:
        csv.field_size_limit(maxInt)
    except OverflowError:
        maxInt = int(maxInt/10)
        decrement = True

with open("Data.csv", 'rb') as textfile:
    text = csv.reader(textfile, delimiter=" ", quotechar='|')
    for line in text:
        print ' '.join(line)
Short answer: I am guessing that you are on 64-bit Windows. If so, then try using sys.maxint instead of sys.maxsize. Actually, you will probably still run into problems, because I think that csv.field_size_limit() is going to try to preallocate memory of that size. You really want to estimate the actual field size that you need and maybe double it. Both sys.maxint and sys.maxsize are much too big for this.
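The "estimate and double" advice could be sketched roughly as follows. The helper name set_field_limit and the 1 MB estimate are made up for illustration and are not part of the csv module. The cap at the largest C long is an added safeguard based on the explanation in this answer, and the sketch is written to run on both Python 2 and Python 3.

```python
import csv
import ctypes

# Largest value a signed C long can hold on this platform.
C_LONG_MAX = 2 ** (ctypes.sizeof(ctypes.c_long) * 8 - 1) - 1


def set_field_limit(estimated_max_field):
    """Set the csv field limit to twice the estimate, capped at C_LONG_MAX.

    Hypothetical helper; returns the limit that was actually set.
    """
    # int() keeps the value a plain int on Python 2, where a long is rejected.
    limit = int(min(2 * estimated_max_field, C_LONG_MAX))
    csv.field_size_limit(limit)
    return limit


# Assumed estimate: the largest field is about 1 MB.
new_limit = set_field_limit(1000000)
print(new_limit)  # 2000000
```

Calling csv.field_size_limit() with no argument afterwards returns the limit currently in effect, which is a quick way to confirm the setting took.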
Long explanation: Python int objects store C long integers. On all relevant 32-bit platforms, both the size of a pointer or memory offset and C long integers are 32 bits. On most UNIXy 64-bit platforms, both the size of a pointer or memory offset and C long integers are 64 bits. However, 64-bit Windows decided to keep C long integers at 32 bits while bumping the pointer size up to 64 bits. sys.maxint represents the biggest Python int (and thus C long), while sys.maxsize is the biggest memory offset. Consequently, on 64-bit Windows, sys.maxsize is a Python long integer because the Python int type cannot hold a number of that size. I suspect that csv.field_size_limit() actually requires a number that fits into a bona fide Python int object. That's why you get the OverflowError and the 'limit must be an integer' errors.
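To check which situation applies on a given machine, something like the following may help. It is only a sketch: it uses ctypes to measure the C long and pointer widths directly, because sys.maxint exists only on Python 2, and the variable names are chosen for illustration.

```python
import ctypes
import sys

# Width in bits of a C long and of a pointer on this platform.
long_bits = ctypes.sizeof(ctypes.c_long) * 8
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8

# Largest value that fits in a signed C long.
c_long_max = 2 ** (long_bits - 1) - 1

print("C long: %d bits, pointer: %d bits" % (long_bits, pointer_bits))
print("largest C long: %d" % c_long_max)
print("sys.maxsize:    %d" % sys.maxsize)

# False here means sys.maxsize is too big to pass as a C long,
# which is the 64-bit Windows case described above.
print("sys.maxsize fits in a C long: %s" % (sys.maxsize <= c_long_max))
```

If the last line prints False, passing sys.maxsize to csv.field_size_limit() would be expected to fail, and a smaller value has to be used.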