Let's say I have a list of lists that contains a tab character:
mylist = [['line 1', '<a href="//<% serverNames[0].getHostname() %>:'],
['line 2', ' <% master.getConfiguration()>']]
When I save the list into a CSV file, the tab in the code at line 2 is written as \t.
line | code
-----------------------------------------------------
1 | <a href="//<% serverNames[0].getHostname() %>:
2 | \t <% master.getConfiguration()>
I need this as it is because I want to compare the code with other lists. So, I don't want to replace the tab with other characters such as spaces.
The code I have written:
import csv

with open('codelist.csv', 'w') as file:
    header = ['line', 'code']
    writers = csv.writer(file)
    # Write the header row first, then one row per entry in mylist.
    writers.writerow(header)
    for row in mylist:
        writers.writerow(row)
How can I solve this kind of problem?
I can't reproduce the exact error in either Python2 or Python3 but I have a guess about what might be going on.
According to the documentation for csv.writer: "All other non-string data are stringified with str() before being written."
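Note moreover that Python's str function appears to produce precisely the behavior you describe if you supply a string containing an actual tab character (strictly speaking, the escaped form shown is how the interactive prompt displays the result via repr):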
>>> str(' ')
'\t'
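As a small sketch of where the escaped form comes from, the snippet below compares str() and repr() on a string that contains a real tab. The sample string is modelled on the question and the variable names are only illustrative.

```python
# A string that starts with one real tab character.
s = "\t<% master.getConfiguration()>"

as_str = str(s)    # str() of a string returns the same text, tab intact
as_repr = repr(s)  # repr() gives the quoted, escaped form the prompt displays

print(as_str == s)       # is the text unchanged by str()?
print("\\t" in as_repr)  # does the repr contain a backslash followed by 't'?
```

If the file really contains a backslash followed by t, something along the way has probably applied repr (or an equivalent escaping step) to the field.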
Of course, what you have is string data, but the documentation above doesn't really say what "other" means. Here's what I found in the implementation of writerows in _csv.c:
if (PyUnicode_Check(field)) {
    append_ok = join_append(self, field, quoted);
    Py_DECREF(field);
}
else if (field == Py_None) {
    append_ok = join_append(self, NULL, quoted);
    Py_DECREF(field);
}
else {
    PyObject *str;
    str = PyObject_Str(field);
    Py_DECREF(field);
    if (str == NULL) {
        Py_DECREF(iter);
        return NULL;
    }
    append_ok = join_append(self, str, quoted);
    Py_DECREF(str);
}
So I suspect what's going on here is that somehow your list contains string data in a format that's not recognized as a unicode string, and which consequently fails the PyUnicode_Check branch in the test, gets sent through str (referred to as PyObject_Str in the C code), and consequently gets the escape sequence embedded.
So you might want to check how that data is getting into your lists.
Alternatively, maybe the source I'm looking at there doesn't correspond to the version of Python you're using, and you're using a version that, say, just runs everything through str.
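One way to check how the data looks before and after writing, as a rough sketch assuming Python 3: print the type and repr of every field, then write the rows to an in-memory buffer (used here in place of codelist.csv) and read them back. The sample data is modelled on the question, with a real tab at the start of the second code field.

```python
import csv
import io

# Sample data modelled on the question; '\t' here is a real tab character.
mylist = [
    ['line 1', '<a href="//<% serverNames[0].getHostname() %>:'],
    ['line 2', '\t<% master.getConfiguration()>'],
]

# The type and repr of each field show whether it is a plain str holding
# a real tab, or already holds a backslash followed by 't'.
for row in mylist:
    for field in row:
        print(type(field).__name__, repr(field))

# Write to an in-memory buffer instead of a file, then read it back.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['line', 'code'])
writer.writerows(mylist)

buffer.seek(0)
rows = list(csv.reader(buffer))

# Compare what came back with what went in (header row skipped).
print(rows[1:] == mylist)
```

If the fields read back are equal to the originals, the tab survived the round trip, and any \t you see is more likely coming from how the file or list is being displayed than from csv.writer.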