python, pandas, python-click

Python click argument to define delimiter causes CSV error "delimiter" must be a 1-character string


I am trying to build a simple click command line application to read in a file with one type of delimiter and write out the same file with a different delimiter. I don't want to do something like find-and-replace, as there are possibly some properly escaped delimiters inside of columns that I do not want to touch.

I wrote a simple click-based CLI to do this, but I'm having some problems passing in the \t to create a tab-delimited file.

As the error below shows, the tab delimiter is not being passed properly into the pandas function that writes out the new file. Everything looks right when I print out the delimiters in the middle of the CLI, so I'm not sure what's going on here.

import click
import pandas as pd

@click.command()
@click.argument('filename')
@click.argument('in_delimiter')
@click.argument('out_delimiter')
def cli(filename, in_delimiter, out_delimiter):

    """
    Command line interface to change file delimiters
    """

    # read in CSV file
    df = pd.read_csv(filename, sep=in_delimiter)
    print(len(df))

    # write out CSV file
    df.to_csv('output.csv', sep=out_delimiter, index=False)
    print("transformation complete")


if __name__ == '__main__':
    cli()

This is how I'm passing my input and output delimiters into the CLI:

python cli.py data.csv "," "\t"

This is the error that is generated:

Traceback (most recent call last):
  File "cli.py", line 24, in <module>
    cli()
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "cli.py", line 19, in cli
    df.to_csv('output.csv', sep=out_delimiter, index=False)
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/core/frame.py", line 1745, in to_csv
    formatter.save()
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/io/formats/csvs.py", line 169, in save
    self.writer = UnicodeWriter(f, **writer_kwargs)
  File "/home/curtis/Program_Files/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/io/common.py", line 521, in UnicodeWriter
    return csv.writer(f, dialect=dialect, **kwds)
TypeError: "delimiter" must be a 1-character string

Solution

  • To process the escaped characters, you can use a callback like this:

    ### Code:

    import codecs
    
    def unescape(ctx, param, value):
        return codecs.getdecoder("unicode_escape")(value)[0]
    

    To use the callback you can do:

    @click.argument('escaped', callback=unescape)
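
Applied to the CLI from the question, the callback could be attached to both delimiter arguments. This is only a sketch: it keeps the question's argument names and hard-coded output.csv, and assumes click and pandas are installed.

```python
import codecs

import click
import pandas as pd


def unescape(ctx, param, value):
    """Click callback: decode backslash escapes typed on the command line."""
    return codecs.getdecoder("unicode_escape")(value)[0]


@click.command()
@click.argument('filename')
@click.argument('in_delimiter', callback=unescape)
@click.argument('out_delimiter', callback=unescape)
def cli(filename, in_delimiter, out_delimiter):
    """Command line interface to change file delimiters"""
    # read in CSV file using the (decoded) input delimiter
    df = pd.read_csv(filename, sep=in_delimiter)

    # write out CSV file using the (decoded) output delimiter
    df.to_csv('output.csv', sep=out_delimiter, index=False)
    click.echo("transformation complete")
```

With the question's `if __name__ == '__main__': cli()` guard appended, the original invocation `python cli.py data.csv "," "\t"` should then write a tab-delimited output.csv.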
    

    ### How does this work?

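    The callback decodes the passed-in string with the unicode_escape codec. The shell hands "\t" to Python as two characters (a backslash and a t), and the codec turns that sequence into a single tab character.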
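
The effect can be reproduced without click or pandas. The sketch below uses only the standard library; `write_row` is a hypothetical helper introduced for illustration, and the `csv` module stands in for the writer that pandas calls in the traceback.

```python
import codecs
import csv
import io

# What Python receives when "\t" is typed on the command line:
# a backslash followed by the letter t, not a tab.
raw = r"\t"


def write_row(delimiter):
    """Write one row with the given delimiter and return the text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter).writerow(["a", "b"])
    return buf.getvalue()


# The raw two-character string is rejected by the csv module.
try:
    write_row(raw)
    rejected = False
except TypeError:
    rejected = True

# Decoding with unicode_escape yields the single tab character.
decoded = codecs.getdecoder("unicode_escape")(raw)[0]
row = write_row(decoded)

print(len(raw), len(decoded), rejected, repr(row))
```

One caveat worth keeping in mind: unicode_escape is meant for ASCII escape sequences, so delimiters containing non-ASCII characters may not survive the round trip unchanged.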

    ### Test Code:

    import click
    # unescape() is the callback defined in the Code section above
    
    @click.command()
    @click.argument('escaped', callback=unescape)
    def cli(escaped):
        click.echo('len: {}, ord: {}'.format(len(escaped), ord(escaped)))
    
    
    if __name__ == "__main__":
        commands = (
            r'\t',
            r'\n',
            '\t',
            ',',
            '--help',
        )
    
        import sys, time
    
        time.sleep(1)
        print('Click Version: {}'.format(click.__version__))
        print('Python Version: {}'.format(sys.version))
        for cmd in commands:
            try:
                time.sleep(0.1)
                print('-----------')
                print('> ' + cmd)
                time.sleep(0.1)
                cli(cmd.split())
    
            except BaseException as exc:
                if str(exc) != '0' and \
                        not isinstance(exc, (click.ClickException, SystemExit)):
                    raise
    

    ### Results:

    Click Version: 6.7
    Python Version: 3.6.3 (v3.6.3:2c5fed8, Oct  3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)]
    -----------
    > \t
    len: 1, ord: 9
    -----------
    > \n
    len: 1, ord: 10
    -----------
    >   
    Usage: test.py [OPTIONS] ESCAPED
    
    Error: Missing argument "escaped".
    -----------
    > ,
    len: 1, ord: 44
    -----------
    > --help
    Usage: test.py [OPTIONS] ESCAPED
    
    Options:
      --help  Show this message and exit.