
Correct coding of the ID3 v2.3 frame size field for the GEOB tag


I am confused about how the frame size bytes should be encoded and decoded in ID3 v2.3.0. As I read the (informal) ID3 v2.3.0 specification, the size of each frame is coded in 4 bytes, where the most significant bit of each byte is unused. The size would then be calculated as follows:

// b holds the 4 size bytes read from the frame header
byte MASK = (byte) 0x7F; // keeps the low 7 bits of each byte

int size = 0;

for (int i = 0; i < 4; i++) {
   size = size * 128 + (b[i] & MASK);
}

But when I used my parser on some MP3 files, quite a few of them had GEOB (general encapsulated object) frames whose size bytes were coded as a plain big-endian 32-bit integer.

After I "fixed" these bytes by re-coding them with the algorithm above, commercial software such as Windows 7 and Winamp was no longer able to properly display the subsequent frames (in several instances TIT2 came right after GEOB, so the song's title was not displayed although it was in the file).

I also found similar problems with MCDI (music CD identifier) and TALB (album/movie/show title) frames.

I read through the v2.3 spec and also Googled, but wasn't able to find any information about using a 32-bit integer as the size for these frames. Yet the common behavior of different commercial software seems to suggest that, for such fields, a 32-bit integer should be used as the size instead of 4 bytes masked by 0x7F.

So I am wondering whether anyone here has worked with ID3 v2.3 and could clarify this for me.


Solution

  • I believe I have found the answer. ID3 v2.3, despite being more commonly supported than v2.4, has a not-too-well-written (and informal) spec. The size in the tag header is coded as 4 bytes masked by 0x7F, but the frame sizes are in fact plain 32-bit integers; the spec just never spells this out clearly.

    The reason I usually ran into the problem with GEOB is that the two encodings produce identical bytes until the frame size exceeds 0x7F, and a GEOB frame usually does.
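
The difference between the two encodings can be sketched as follows. This is a minimal illustration rather than code from any ID3 library: the class name `Id3Sizes` and both method names are made up for this example. It decodes the same 4 bytes both ways, to show that the results agree up to 0x7F and diverge above it.

```java
// Hypothetical helper class; the names are illustrative, not from any ID3 library.
final class Id3Sizes {

    // 7 payload bits per byte: the coding used for the size in the tag header.
    static int decodeSevenBit(byte[] b) {
        int size = 0;
        for (int i = 0; i < 4; i++) {
            size = (size << 7) | (b[i] & 0x7F);
        }
        return size;
    }

    // Plain big-endian 32-bit integer: the coding the answer describes for v2.3 frame sizes.
    // Returned as long so that a set top bit does not produce a negative value.
    static long decodeBigEndian(byte[] b) {
        long size = 0;
        for (int i = 0; i < 4; i++) {
            size = (size << 8) | (b[i] & 0xFF);
        }
        return size;
    }

    public static void main(String[] args) {
        byte[] small = {0x00, 0x00, 0x00, 0x7F};
        byte[] large = {0x00, 0x00, 0x01, 0x00};

        // Both readings agree while the size fits in 7 bits.
        System.out.println(decodeSevenBit(small) + " vs " + decodeBigEndian(small)); // 127 vs 127

        // Above 0x7F the same bytes mean different sizes.
        System.out.println(decodeSevenBit(large) + " vs " + decodeBigEndian(large)); // 128 vs 256
    }
}
```

A parser that reads a v2.3 frame size with the 7-bit method would, in the second case, stop 128 bytes short and misread everything after that frame, which matches the symptom of a missing TIT2 after a large GEOB.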