Tags: python, python-2.7, character-encoding, python-multiprocessing

How to get pool.py to accept non-ASCII characters?


I am using Python 2.7.18. The idea is to use Python to gather songs from specified directories, then build and run the commands that pass them through a series of converters and sound processors. Some of my songs have accented characters in their names, and any song with a ? in the title has it replaced by ¿ (inverted question mark) in the file name.

My convert_song function works correctly when run directly, but when I run it in a Pool and the file name or directory contains a non-ASCII character, it fails with:

Traceback (most recent call last):
  File "C:\StreamLine.py", line 270, in <module>
    result = pool.map(convert_song, qTheStack)
  File "C:\Python27\lib\multiprocessing\pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Python27\lib\multiprocessing\pool.py", line 572, in get
    raise self._value
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbf' in position 27: ordinal not in range(128)
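
The traceback above ends at `raise self._value`: on Python 2.7, `pool.map` re-raises only the exception object, so the worker-side stack is not shown. One way to see where the error really starts is to catch it inside the worker and return the formatted traceback as text. The sketch below is only an illustration: `convert_song` here is a hypothetical stand-in that fails on non-ASCII input, and `convert_song_safe` is an invented helper name, not part of the original script.

```python
import traceback


def convert_song(path):
    # Hypothetical stand-in for the real converter: it fails the same way
    # whenever the path cannot be encoded as ASCII.
    path.encode("ascii")
    return "converted " + path


def convert_song_safe(path):
    """Run convert_song and return (path, result, error_text) instead of raising."""
    try:
        return (path, convert_song(path), None)
    except Exception:
        # format_exc() records the stack as seen inside the worker, which
        # would otherwise be lost when the parent re-raises the exception.
        return (path, None, traceback.format_exc())


if __name__ == "__main__":
    # Called directly here; the same function could be handed to pool.map
    # in place of convert_song.
    for path in [u"01 - Plain.flac", u"04 - Will You Still Love Me\u00bf.flac"]:
        _, result, error = convert_song_safe(path)
        print(result if error is None else error)
```

Because the wrapper is a plain top-level function that returns a tuple, it should pickle the same way the original function does.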

Here's my main where I set up the pool:

if __name__ == '__main__':
    print('Reading artists.')
    predir = 'G:\\Vault\\The Music\\'
    artistfile = open('C:\\Controls\\ArtistList.txt', 'r')
    artistlist = artistfile.readlines()
    dirs = []
    for artist in artistlist:
        dirs.append(predir + artist.strip())
    qTheStack = []
    for currentPath in dirs:
        for wFile in generate_next_file(currentPath):
            print(repr(wFile))
            #print(convert_song(wFile))
            qTheStack.append(wFile)
    print('List loaded.')
    pool = Pool(12)
    result = pool.map(convert_song, qTheStack)
    for item in result:
        print(item)

The print(repr(wFile)) output looks like this when run:

'G:\\Vault\\The Music\\Chicago\\1989 - Greatest Hits 1982-1989\\04 - Will You Still Love Me\xbf.flac'
'G:\\Vault\\The Music\\Chicago\\1989 - Greatest Hits 1982-1989\\06 - What Kind of Man Would I Be\xbf [Remix].flac'
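
For reference, the `\xbf` byte in these names and the `u'\xbf'` in the error message are the same character, ¿ (U+00BF). The sketch below, written in Python 3 syntax, decodes the first file name and then tries to encode it as ASCII. It assumes the file system reports names in Windows-1252 (`cp1252`), which the question does not state, and `first_unencodable` is a made-up helper name. The failure lands at index 27 of the bare file name, which matches "position 27" in the traceback. That suggests, but does not prove, that the failing string was the file name alone rather than the full path.

```python
def first_unencodable(text, codec="ascii"):
    """Return the index of the first character the codec rejects, or None."""
    try:
        text.encode(codec)
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first character that failed.
        return exc.start
    return None


# The raw name as bytes, the way a Python 2 str would hold it.
raw_name = b"04 - Will You Still Love Me\xbf.flac"

# Assumption: the bytes are Windows-1252 (cp1252) encoded.
text_name = raw_name.decode("cp1252")

position = first_unencodable(text_name)
print(position, hex(ord(text_name[position])))
```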

How can I get the built-in Pool from multiprocessing to accept my input?


Solution

  • Change to Python 3, dude.

    As much as I wanted an answer that stayed on Python 2.7, I tried Python 3 and it didn't disappoint. I did have to redo the obscure steps I had found for generating a file that runs a COM/DLL in Python, and I had to remove all the str.decode and encode calls throughout my script. After only one import change, I hit run and it ran as expected.
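
As a rough illustration of what the Python 3 version can look like, here is a minimal sketch. In Python 3, `str` is Unicode text, so the names pass through `pool.map` without any decode or encode calls. The `convert_song` body is a hypothetical stand-in that only builds an output name, and `convert_all` and the `.mp3` extension are assumptions made for the example, not the original script.

```python
import os
from multiprocessing import Pool


def convert_song(path):
    # Hypothetical stand-in for the real converter: it only builds the
    # output name, which is enough to show the text surviving the round trip.
    root, _ext = os.path.splitext(path)
    return root + ".mp3"


def convert_all(paths, workers=2):
    """Map convert_song over paths in a process pool and return the results."""
    with Pool(workers) as pool:
        return pool.map(convert_song, paths)


if __name__ == "__main__":
    songs = [
        "04 - Will You Still Love Me\u00bf.flac",
        "06 - What Kind of Man Would I Be\u00bf [Remix].flac",
    ]
    for converted in convert_all(songs):
        # ascii() keeps the output printable on consoles that cannot
        # display the character.
        print(ascii(converted))
```

The `if __name__ == "__main__":` guard matters on Windows, where worker processes re-import the main module.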