What is the difference between distutils.util.split_quoted and shlex.split?


The Python standard library provides both distutils.util.split_quoted and shlex.split.

Is there any situation in which distutils.util.split_quoted(s) gives a different result from shlex.split(s)?


Solution

  • Yes. The two functions disagree about what counts as whitespace: shlex hardcodes the four characters ' \t\r\n', whereas distutils builds its regex from string.whitespace, which also contains form feed ('\f') and vertical tab ('\v'). distutils therefore treats those two characters as separators, and shlex does not.

    formfeed:

    >>> import distutils.util, shlex
    >>> distutils.util.split_quoted('A\fB')
    ['A', 'B']
    >>> shlex.split('A\fB')
    ['A\x0cB']
    

    vertical tab:

    >>> distutils.util.split_quoted('A\vB')
    ['A', 'B']
    >>> shlex.split('A\vB')
    ['A\x0bB']
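
The difference between the two whitespace sets can also be computed directly. The following is a minimal sketch, not taken from the original answer. It deliberately avoids importing distutils, since that package was removed from the standard library in Python 3.12, and instead compares string.whitespace (the set the answer says distutils uses) against the whitespace attribute of a shlex lexer. The variable names shlex_ws and extra are illustrative only.

```python
import shlex
import string

# Characters a shlex lexer treats as token separators.
shlex_ws = set(shlex.shlex("", posix=True).whitespace)

# Characters in string.whitespace that shlex does not treat as separators.
extra = sorted(set(string.whitespace) - shlex_ws)
print([repr(c) for c in extra])

# shlex.split keeps each of these characters inside a single token.
for ch in extra:
    print(repr(ch), shlex.split(f"A{ch}B"))
```

On a standard CPython install this should report only the vertical tab and form feed characters, which matches the two cases shown above.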