
How to check or change encoding in USLT (ID3)?


This is not a duplicate. I've read them all.

I have a Nokia-N8-00. Its music player supports USLT (UnSynchronised Lyrics/Text). I use a tool called spotdl (https://github.com/Ritiek/Spotify-Downloader) that fetches song titles from Spotify, downloads the audio from other sources (generally YouTube), and merges in the metadata as well.

The problem is that the music downloaded by that tool shows lyrics on all my devices except the N8. Fortunately, I found a song with embedded lyrics that does work on my phone too. I then compared both files and found that, at the byte level, their USLT sections differ very little (apart from the lyrics themselves, since they are different songs). The differences are:

The one that is supported:

55 53 4C 54 00 00 0A 56 00 00 03 58 58 58

The one that isn't:

55 53 4C 54 00 00 07 38 00 00 01 58 58 58

(These sequences are the start of the "USLT" frame in each file.)

I think it's an encoding difference. If I am right, what encoding is present and in which one? If it's not encoding, what is it?

I know these sequences alone can't explain the situation, so here are the files I'm testing with: https://github.com/gaurav712/music.

I don't need USLT to work on the phone; I'm just curious about it because I want to write an implementation of it in C (I don't need language-specific help, though).


Solution

  • Here is what I got:

    55 53 4C 54
    

    Translates to:

    USLT
    

    So we got that right. Now, I believe we can merge that result with this answer:

    Frame ID       $xx xx xx xx (four characters)
    Size           $xx xx xx xx
    Flags          $xx xx
    Encoding       $xx
    Text
    

    (Taken from: ID3v2 Specification)

    (or see this: https://web.archive.org/web/20161022105303/http://id3.org/id3v2-chapters-1.0)
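
To make the layout concrete, here is a minimal C sketch that splits the 14 bytes from the question into those fields. The names `uslt_head` and `parse_uslt_head` are made up for illustration. One addition to the generic layout above: in a USLT frame the encoding byte is followed by a three-byte language code and then a content descriptor, which is why both dumps end in 58 58 58 ("XXX"). The size bytes are left undecoded here because how to read them depends on the tag version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the first fields of a USLT frame. */
struct uslt_head {
    char    id[5];        /* frame ID plus terminating NUL      */
    uint8_t size_raw[4];  /* size bytes, not decoded here       */
    uint8_t flags[2];     /* frame flags                        */
    uint8_t encoding;     /* text-encoding byte, $00..$03       */
    char    language[4];  /* 3-byte language code plus NUL      */
};

/* Fills *out from the first 14 bytes of a frame.
 * Returns 0 on success, -1 if the buffer is too short. */
int parse_uslt_head(const uint8_t *buf, size_t len, struct uslt_head *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 14)
        return -1;
    memcpy(out->id, buf, 4);
    out->id[4] = '\0';
    memcpy(out->size_raw, buf + 4, 4);
    memcpy(out->flags, buf + 8, 2);
    out->encoding = buf[10];
    memcpy(out->language, buf + 11, 3);
    out->language[3] = '\0';
    return 0;
}
```

Feeding it the "supported" sequence gives the ID "USLT", encoding byte 3 and language "XXX"; the other sequence differs only in the size bytes and the encoding byte (1).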

    Now, I couldn't get this from the source (because the site was down) but there is also this:

    Encoding flag explanation:
    • $00 ISO-8859-1 [ISO-8859-1]
    • $01 UTF-16 [UTF-16], with BOM
    • $02 UTF-16BE [UTF-16], without BOM
    • $03 UTF-8 [UTF-8]
    

    So, according to these findings (which I'm not too sure about), the one that is supported is UTF-8 encoded and the one not supported is UTF-16.
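
For a C implementation, the encoding list can be turned into a small lookup. The sketch below is only an illustration (the function names are made up). It also records two details worth knowing when reading the text that follows the encoding byte: strings in the two UTF-16 encodings are terminated by two zero bytes rather than one, and $02 and $03 are only defined from ID3v2.4 on, so an ID3v2.3 tag should only contain $00 or $01.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a name for an ID3v2 text-encoding byte, or "unknown". */
const char *id3_encoding_name(uint8_t enc)
{
    switch (enc) {
    case 0x00: return "ISO-8859-1";
    case 0x01: return "UTF-16";     /* with BOM                 */
    case 0x02: return "UTF-16BE";   /* without BOM, v2.4 only   */
    case 0x03: return "UTF-8";      /* v2.4 only                */
    default:   return "unknown";
    }
}

/* Width in bytes of the string terminator for a valid encoding byte:
 * the UTF-16 variants end strings with two zero bytes, the others with one. */
int id3_terminator_width(uint8_t enc)
{
    assert(enc <= 0x03);
    return (enc == 0x01 || enc == 0x02) ? 2 : 1;
}
```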

    EDIT

    I've downloaded and viewed your mp3 files for further inspection. Here are my new findings:

    First of all, we were correct about the encodings: the encoding byte is $03 (UTF-8) in the supported file and $01 (UTF-16) in the unsupported one.

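    Does this mean you can just turn '01' into '03' and it'll magically work? I doubt it. The text bytes would still be UTF-16, and it also depends on the driver: what if it sees a '\x00' byte and interprets it as the end of the string (that is, the end of the USLT payload)? To test this, you can try manually converting the encoding of the text in the file (by removing the extra bytes). Keep in mind that removing bytes changes the frame size and the tag size, so those fields have to be updated too.

    Secondly, running eyeD3 on Linux on both files, I found that: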
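
Rather than deleting bytes by hand, the text can be re-encoded properly. Below is a minimal sketch of a UTF-16 to UTF-8 converter in C; `utf16_bom_to_utf8` is a hypothetical helper, not part of any library. It assumes the input is one string that starts with a byte order mark, as the $01 encoding requires, so in a USLT frame it would be applied to the content descriptor and to the lyrics separately. It does not touch the frame or tag sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads one 16-bit code unit in the given byte order. */
static uint32_t read_u16(const uint8_t *p, int big_endian)
{
    return big_endian ? ((uint32_t)p[0] << 8 | p[1])
                      : ((uint32_t)p[1] << 8 | p[0]);
}

/* Converts UTF-16 text that starts with a BOM into UTF-8.
 * Returns the number of bytes written to out, or -1 on a missing BOM,
 * malformed input, or insufficient output space. */
long utf16_bom_to_utf8(const uint8_t *in, size_t in_len,
                       uint8_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    if (in_len < 2 || in_len % 2 != 0)
        return -1;

    int big_endian;
    if (in[0] == 0xFE && in[1] == 0xFF)
        big_endian = 1;
    else if (in[0] == 0xFF && in[1] == 0xFE)
        big_endian = 0;
    else
        return -1;                              /* no BOM */

    size_t o = 0;
    for (size_t i = 2; i < in_len; i += 2) {
        uint32_t cp = read_u16(in + i, big_endian);
        if (cp >= 0xD800 && cp <= 0xDBFF) {     /* high surrogate */
            if (i + 4 > in_len)
                return -1;
            uint32_t low = read_u16(in + i + 2, big_endian);
            if (low < 0xDC00 || low > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            i += 2;                             /* consumed two units */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                          /* stray low surrogate */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (need > out_cap - o)
            return -1;
        if (need == 1) {
            out[o++] = (uint8_t)cp;
        } else if (need == 2) {
            out[o++] = (uint8_t)(0xC0 | (cp >> 6));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (uint8_t)(0xE0 | (cp >> 12));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (uint8_t)(0xF0 | (cp >> 18));
            out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return (long)o;
}
```

For plain ASCII lyrics this amounts to dropping the BOM and every second byte, which is what "removing extra bytes" does by hand; for anything else the byte counts differ.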

    supported.mp3   -> ID3 v2.4
    unsupported.mp3 -> ID3 v2.3
    

    Perhaps that's an issue?
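
The version matters for a parser in at least one concrete way: the four size bytes of a frame are read differently. In ID3v2.4 the frame size is a "synchsafe" integer (only the low 7 bits of each byte count), while in ID3v2.3 it is a plain big-endian 32-bit integer. A small sketch with made-up function names, assuming the major version is taken from the fourth byte of the 10-byte "ID3" tag header:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the major version from a 10-byte ID3v2 tag header
 * (3 for ID3v2.3, 4 for ID3v2.4), or -1 if it is not an ID3v2 header. */
int id3_major_version(const uint8_t hdr[10])
{
    assert(hdr != NULL);
    if (memcmp(hdr, "ID3", 3) != 0)
        return -1;
    return hdr[3];
}

/* Decodes the 4 size bytes of a frame header for the given major version. */
uint32_t id3_frame_size(const uint8_t b[4], int major)
{
    assert(major == 3 || major == 4);
    if (major == 4)     /* synchsafe: 7 significant bits per byte */
        return (uint32_t)(b[0] & 0x7F) << 21 | (uint32_t)(b[1] & 0x7F) << 14
             | (uint32_t)(b[2] & 0x7F) << 7  | (uint32_t)(b[3] & 0x7F);
    /* ID3v2.3: plain big-endian 32-bit integer */
    return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16
         | (uint32_t)b[2] << 8  | (uint32_t)b[3];
}
```

With the versions reported above, the size bytes 00 00 0A 56 from the supported (v2.4) file decode to 1366, and 00 00 07 38 from the unsupported (v2.3) file decode to 1848.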

    Also, note that the location of the USLT frame is different in the two files.
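
A different offset is expected, since frames can appear in any order, so a parser should walk the frames instead of assuming a fixed position. The following is a rough sketch of that walk in C, with a hypothetical function name. It assumes the tag starts at the beginning of the buffer, has no extended header, and does not use unsynchronisation; a real implementation would have to handle those cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the byte offset of the first USLT frame inside an ID3v2.3 or
 * ID3v2.4 tag held in buf, or -1 if none is found. */
long find_uslt_offset(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 10 || memcmp(buf, "ID3", 3) != 0)
        return -1;
    int major = buf[3];
    if (major != 3 && major != 4)
        return -1;

    /* The tag size in the 10-byte header is synchsafe in both versions. */
    size_t tag_size = (size_t)(buf[6] & 0x7F) << 21 | (size_t)(buf[7] & 0x7F) << 14
                    | (size_t)(buf[8] & 0x7F) << 7  | (size_t)(buf[9] & 0x7F);
    size_t end = 10 + tag_size;
    if (end > len)
        end = len;

    size_t pos = 10;                    /* first frame follows the header */
    while (pos + 10 <= end && buf[pos] != 0) {   /* zero byte: padding */
        const uint8_t *s = buf + pos + 4;
        size_t size = (major == 4)
            ? ((size_t)(s[0] & 0x7F) << 21 | (size_t)(s[1] & 0x7F) << 14
               | (size_t)(s[2] & 0x7F) << 7 | (size_t)(s[3] & 0x7F))
            : ((size_t)s[0] << 24 | (size_t)s[1] << 16
               | (size_t)s[2] << 8 | (size_t)s[3]);
        if (memcmp(buf + pos, "USLT", 4) == 0)
            return (long)pos;
        if (size > end - pos - 10)      /* frame would run past the tag */
            return -1;
        pos += 10 + size;               /* skip frame header and body */
    }
    return -1;
}
```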

    On Linux, there are further tools that can give you extra information, if need be. A few examples:

    mp3info, id3info, id3tool, exiftool, eyeD3, lltag
    

    However, I think the main problem is the text encoding. I was able to recover the lyrics just fine using the above tools, although some of them give different answers because the ID3 versions differ, and so on.