python bash mp3 m4a

Abstracting the conversion between ID3 tags, M4A tags, and FLAC tags


I'm looking for a resource in python or bash that will make it easy to take, for example, mp3 file X and m4a file Y and say "copy X's tags to Y".

Python's "mutagen" module is great for manipulating tags in general, but it has no abstract concept of an "artist field" that spans different types of tag; I want a library that handles all the fiddly bits and knows the field-name equivalences. For things that not all tag systems can express, I'm okay with information being lost or best-guessed.

(Use case: I encode lossless files to mp3, then go use the mp3s for listening. Every month or so, I want to be able to update the 'master' lossless files with whatever tag changes I've made to the mp3s. I'm tired of stubbing my toes on implementation differences among formats.)


Solution

  • I needed this exact thing, and I, too, quickly realized that mutagen does not abstract far enough away from the individual tag formats to do this kind of thing. Fortunately, the authors of mutagen needed such an abstraction for their media player, QuodLibet.

    I had to dig through the QuodLibet source to find out how to use it, but once I understood it, I wrote a utility called sequitur which is intended to be a command line equivalent to ExFalso (QuodLibet's tagging component). It uses this abstraction mechanism and provides some added abstraction and functionality.

    If you want to check out the source, here's a link to the latest tarball. The package is actually a set of three command line scripts and a module for interfacing with QL. If you want to install the whole thing, you can use:

    easy_install QLCLI
    

    One thing to keep in mind about exfalso/quodlibet (and consequently sequitur) is that they actually implement audio metadata properly, which means that all tags support multiple values (unless the file type prohibits it, and few do). So, doing something like:

    print qllib.AudioFile('foo.mp3')['artist']
    

    will not output a single string, but a list of strings like:

    [u'The First Artist', u'The Second Artist']
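
Because every tag comes back as a list, code that expects one string needs a small adapter. The following is only a sketch of that idea: `as_list` and `display` are hypothetical helper names, not part of qllib or QuodLibet, and the sketch is written for Python 3, whereas the snippets in this answer are Python 2.

```python
def as_list(value):
    """Return a tag value as a list of strings.

    Accepts None, a single string, or any iterable of strings.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def display(value, sep=", "):
    """Join a (possibly multi-valued) tag into one string for printing."""
    return sep.join(as_list(value))


artists = ["The First Artist", "The Second Artist"]
print(display(artists))
```

Copying tags between files is simpler if the values stay as lists end to end; joining is best left for display only, since a joined string cannot be split back reliably.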
    

    The way you might use it to copy tags would be something like:

    import os.path
    import qllib  # this is the module that comes with QLCLI
    
    def update_tags(mp3_fn, flac_fn):
        mp3 = qllib.AudioFile(mp3_fn)
        flac = qllib.AudioFile(flac_fn)
        # you can iterate over the tag names
        # they will be the same for all file types
        for tag_name in mp3:
            flac[tag_name] = mp3[tag_name]
        flac.write()
    
    mp3_filenames = ['foo.mp3', 'bar.mp3', 'baz.mp3']
    
    for mp3_fn in mp3_filenames:
        flac_fn = os.path.splitext(mp3_fn)[0] + '.flac'
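        # only update when the mp3 was modified after the flac; comparing with
        # != would always be true again once the flac has been rewritten
        if os.path.getmtime(mp3_fn) > os.path.getmtime(flac_fn):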
            update_tags(mp3_fn, flac_fn)
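
To make the "field-name equivalences" from the question concrete, here is a minimal sketch of the kind of lookup table such an abstraction layer needs. It is not QuodLibet's or mutagen's actual code: `FIELD_KEYS` and `translate_tags` are hypothetical names, the table covers only three fields, and plain dicts stand in for real tag objects. The keys themselves are the standard ID3 frame IDs, MP4 atom names, and Vorbis comment names (used by FLAC). Written for Python 3.

```python
# Canonical field name -> key used by each tag system.
# A real table would be much longer; this one is illustrative only.
FIELD_KEYS = {
    "artist": {"id3": "TPE1", "mp4": "\xa9ART", "vorbis": "artist"},
    "album": {"id3": "TALB", "mp4": "\xa9alb", "vorbis": "album"},
    "title": {"id3": "TIT2", "mp4": "\xa9nam", "vorbis": "title"},
}


def translate_tags(tags, src, dst):
    """Re-key a dict of raw tags from one tag system to another.

    `tags` maps format-specific keys to lists of strings. Keys with no
    known equivalent in the table are dropped, i.e. information is lost.
    """
    src_to_dst = {keys[src]: keys[dst] for keys in FIELD_KEYS.values()}
    return {
        src_to_dst[key]: list(values)
        for key, values in tags.items()
        if key in src_to_dst
    }


id3_tags = {
    "TPE1": ["The First Artist", "The Second Artist"],
    "TIT2": ["Foo"],
    "TXXX:mood": ["calm"],  # no equivalent in the table, so it is dropped
}
print(translate_tags(id3_tags, "id3", "vorbis"))
```

Values are kept as lists throughout so that multi-valued tags survive the translation; anything the table does not know about is silently discarded, which matches the "information being lost" trade-off the question accepts.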