I'd like to distinguish between None and empty strings ('') when going back and forth between Python data structures and their CSV representation using Python's csv module.
My issue is that when I run:
import csv, cStringIO
data = [['NULL/None value',None],
['empty string','']]
f = cStringIO.StringIO()
csv.writer(f).writerows(data)
f = cStringIO.StringIO(f.getvalue())
data2 = [e for e in csv.reader(f)]
print "input : ", data
print "output: ", data2
I get the following output:
input : [['NULL/None value', None], ['empty string', '']]
output: [['NULL/None value', ''], ['empty string', '']]
Of course, I could play with data and data2 to distinguish None and empty strings with things like:
data = [[d if d is not None else 'None' for d in row] for row in data]
data2 = [[d if d != 'None' else None for d in row] for row in data2]
But that would partly defeat the reason I am using the csv module (fast serialization and deserialization implemented in C, especially when dealing with large lists).
Is there a csv.Dialect or parameters to csv.writer and csv.reader that would enable them to distinguish between '' and None in this use case?
If not, would there be interest in a patch to csv.writer to enable this kind of round trip? (Possibly a Dialect.None_translate_to parameter defaulting to '' to ensure backward compatibility.)
The documentation suggests that what you want is not possible:
To make it as easy as possible to interface with modules which implement the DB API, the value None is written as the empty string.
This is in the documentation for the writer class, suggesting it is true for all dialects and is an intrinsic limitation of the csv module.
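Given that limitation, one way to keep None and '' apart is to translate None to a marker string around the csv calls. The following is only a minimal sketch, written for Python 3 (io.StringIO rather than cStringIO). NULL_SENTINEL, write_rows, and read_rows are hypothetical names that are not part of the csv module, and the sketch assumes the marker never appears in the real data.

```python
import csv
import io

# Hypothetical marker (an assumption, not a csv feature);
# pick a string that cannot occur in your real data.
NULL_SENTINEL = "\\N"


def write_rows(f, rows):
    """Write rows with csv.writer, replacing None with NULL_SENTINEL."""
    writer = csv.writer(f)
    # A generator keeps memory use flat for large inputs.
    writer.writerows(
        [NULL_SENTINEL if cell is None else cell for cell in row]
        for row in rows
    )


def read_rows(f):
    """Read rows with csv.reader, turning NULL_SENTINEL back into None."""
    return [
        [None if cell == NULL_SENTINEL else cell for cell in row]
        for row in csv.reader(f)
    ]


data = [["NULL/None value", None], ["empty string", ""]]

buf = io.StringIO(newline="")
write_rows(buf, data)
buf.seek(0)
data2 = read_rows(buf)  # None and '' survive the round trip
```

The per-cell translation runs in Python rather than C, so it carries the speed cost the question is worried about. Note also that csv.reader returns every other cell as a string, whatever type it had before writing.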
I for one would support changing this (along with various other limitations of the csv module), but it may be that people would want to offload this sort of work into a different library, and keep the CSV module simple (or at least as simple as it is).
If you need more powerful file-reading capabilities, you might want to look at the CSV reading functions in numpy, scipy, and pandas, which as I recall have more options.