
Convert a tuple with quotes to a CSV-like string


How can I convert this tuple

text = ('John', '"n"', '"ABC 123\nDEF, 456GH\nijKl"\r\n', '"Johny\nIs\nHere"')

to a CSV-formatted string like this

out = '"John", """n""", """ABC 123\\nDEF, 456\\nijKL\\r\\n""", """Johny\\nIs\\nHere"""'

or, alternatively, to the same string without the special characters at the end of the third field

out = '"John", """n""", """ABC 123\\nDEF, 456\\nijKL""", """Johny\\nIs\\nHere"""'

I came up with this monster

out1 = ','.join(f'""{t}""' if t.startswith('"') and t.endswith('"')
                           else f'"{t}"' for t in text)
out2 = out1.replace('\n', '\\n').replace('\r', '\\r')

Solution

  • You can get pretty close to what you want with the csv and io modules from the standard library:

    • use csv to correctly encode the delimiters and handle the quoting rules; csv.writer only writes to a file-like object
    • use io.StringIO as that file-like object to get the resulting CSV as a string
    import csv
    import io
    
    f = io.StringIO()
    
    text = ("John", '"n"', '"ABC 123\nDEF, 456GH\nijKl"\r\n', '"Johny\nIs\nHere"')
    
    writer = csv.writer(f)
    writer.writerow(text)
    
    csv_str = f.getvalue()
    csv_repr = repr(csv_str)
    
    print("CSV_STR")
    print("=======")
    print(csv_str)
    
    print("CSV_REPR")
    print("========")
    print(csv_repr)
    

    and that prints:

    CSV_STR
    =======
    John,"""n""","""ABC 123
    DEF, 456GH
    ijKl""
    ","""Johny
    Is
    Here"""
    
    CSV_REPR
    ========
    'John,"""n""","""ABC 123\nDEF, 456GH\nijKl""\r\n","""Johny\nIs\nHere"""\r\n'
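As a quick sanity check that this output is valid CSV, the sketch below reads the string back with csv.reader and compares the result with the original tuple. The names buf and rows are only illustrative, and passing newline="" to io.StringIO is an assumption carried over from the csv documentation's advice for file objects.

```python
import csv
import io

text = ("John", '"n"', '"ABC 123\nDEF, 456GH\nijKl"\r\n', '"Johny\nIs\nHere"')

# Write the row to an in-memory buffer, as in the code above.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(text)

# Read the CSV text back; newline="" leaves embedded line endings untouched.
rows = list(csv.reader(io.StringIO(buf.getvalue(), newline="")))

# The single parsed row should equal the original fields,
# including the raw newlines and the trailing "\r\n" of the third field.
print(rows[0] == list(text))
```

The embedded newlines survive the round trip without any extra escaping, because each field that contains them is quoted.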
    
    • csv_str is what you'd see if you wrote the row directly to a file opened for writing; it is valid CSV

    • csv_repr is close to the out you showed us, but not quite the same. Your example included "doubly escaped" newlines (\\n) and carriage returns (\\r\\n). CSV doesn't need those characters escaped, because the entire field is quoted. If you do need that, you'll have to do it yourself with something like:

      csv_repr.replace(r"\r", r"\\r").replace(r"\n", r"\\n")
      

      but again, that's not necessary for valid CSV.

      Also, I don't know how to make the writer emit a space after each delimiter, like the spaces you show after "John" and after "n" in:

      out = '"John", """n""", ...'
      

      The reader can be configured to expect and ignore an initial space, with Dialect.skipinitialspace, but I don't see any options for the writer.
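If you really do need the exact layout from the question (every field quoted, a space after each comma, line endings escaped), one possible workaround is to let csv.writer quote each field on its own and join the pieces yourself. This is only a sketch: quote_field and to_spaced_csv are hypothetical helper names, csv.QUOTE_ALL is used so that "John" is quoted too, and stripping the trailing "\r\n" from each field is an assumption based on the second out variant in the question.

```python
import csv
import io


def quote_field(field):
    """Return one field as csv.writer would quote it, without the row terminator."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow([field])
    # With QUOTE_ALL the field always ends in a quote, so the last
    # character of the buffer is the "\n" terminator and can be dropped.
    return buf.getvalue()[:-1]


def to_spaced_csv(fields):
    """Hypothetical helper: join quoted fields with ', ' and escape line endings."""
    # Assumption: trailing line endings on a field are not wanted.
    quoted = (quote_field(f.rstrip("\r\n")) for f in fields)
    line = ", ".join(quoted)
    # Turn real newlines into the two-character sequences \n and \r.
    return line.replace("\r", "\\r").replace("\n", "\\n")


text = ("John", '"n"', '"ABC 123\nDEF, 456GH\nijKl"\r\n', '"Johny\nIs\nHere"')
out = to_spaced_csv(text)
print(out)
```

For the tuple in the question this gives the second out variant, apart from the 456GH and ijKl spellings, which differ between the question's input and its expected output. Keep in mind that the result is no longer plain CSV: a reader needs skipinitialspace=True to cope with the spaces, and it will return the escaped \n sequences literally rather than as newlines.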