
Reading the MP3 ID3v2 tag size


I am trying to read the size of an ID3v2 tag. My code is supposed to store the first header, which contains the identification, version, flags and size, in this struct. The code freads bytes 0 through 9 of the file and stores them here:

typedef struct
{
   uint32_t id:24; //"ID3"
   uint16_t version; // $04 00
   uint8_t flags; // %abcd0000
   uint32_t size; //4 * %0xxxxxxx
}__attribute__((__packed__))
ID3TAG;

I read it with:

fread(tag, sizeof(ID3TAG), 1, media);

I then pass the value of tag.size to this function, which undoes the synchsafe encoding of the size bits:

int unsynchsafe(uint32_t in)
{
    int out = 0, mask = 0x7F000000;

    while (mask) {
        out >>= 1;
        out |= in & mask;
        mask >>= 8;
    }

    return out;
}

However, the value returned by unsynchsafe cannot be the correct tag size. I got 248627840. I double-checked using ExifTool and it was not correct. I would really appreciate any kind of help.


Solution

  • The problem that you are having has to do with endianness. I am assuming that you are working on an x86 system, or on another system that is little-endian. The ID3 documentation states that:

    The byteorder in multibyte numbers is most significant byte first (e.g. $12345678 would be encoded $12 34 56 78).

    So the size is stored in the file as a big-endian number. After you read the bytes of the file into your struct, you need to convert this byte order to little-endian before stripping out the four zero bits (the top bit of each of the four bytes) to obtain the final 28-bit representation of the size. This is also why you had to compare tag->id with 0x334449 instead of 0x494433: the bytes stored in tag->id were accessed as a multibyte value and interpreted in little-endian order.
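
    One way to sidestep the byte-order question entirely is to decode the four size bytes one at a time, in file order. The following is only a sketch under that assumption; synchsafe_decode is a hypothetical helper name, not something from your code or from the ID3 documentation:

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): decode the four
 * synchsafe size bytes of an ID3v2 header, given in file order
 * (most significant byte first). Each byte contributes its low 7 bits;
 * the top bit of each byte is masked off. */
uint32_t synchsafe_decode(const uint8_t b[4])
{
    return ((uint32_t)(b[0] & 0x7F) << 21) |
           ((uint32_t)(b[1] & 0x7F) << 14) |
           ((uint32_t)(b[2] & 0x7F) << 7)  |
            (uint32_t)(b[3] & 0x7F);
}
```

    Because the bytes are combined by position rather than read as a native uint32_t, the result should be the same on little-endian and big-endian hosts. For example, the size bytes $00 00 02 01 decode to (2 << 7) | 1 = 257.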

    Here are the changes I made to make this work. I changed your struct a little, using arrays of uint8_t to get the correct number of bytes. I also used memcmp() to validate tag->id. I made liberal use of unsigned and unsigned long types, to avoid bit-shifting woes. The conversion to little-endian is primitive, and assumes 8 bit bytes.

    This is the entire file that you linked to in the first post, with my changes. I changed the mp3 file to something I had that I could test on.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>  // for memcmp()
    
    /**
     ** TAG is always present at the beginning of an ID3v2 MP3 file
     ** Constant size 10 bytes
     **/
    
    typedef struct
    {
        uint8_t id[3];       //"ID3"
        uint8_t version[2];  // $04 00
        uint8_t flags;       // %abcd0000
        uint32_t size;        //4 * %0xxxxxxx
    }__attribute__((__packed__))
    ID3TAG;
    
    unsigned int unsynchsafe(uint32_t be_in)
    {
        unsigned int out = 0ul, mask = 0x7F000000ul;
        unsigned int in = 0ul;
    
        /* be_in is now big endian */
        /* convert to little endian */
        in = ((be_in >> 24) | ((be_in >> 8) & 0xFF00ul) |
              ((be_in << 8) & 0xFF0000ul) | (be_in << 24));
    
        while (mask) {
            out >>= 1;
            out |= (in & mask);
            mask >>= 8;
        }
    
        return out;
    }
    
    /**
     ** Makes sure the file is supported and prints the decoded tag size
     **/
    int mp3Header(FILE* media, ID3TAG* tag)
    {
        unsigned int tag_size;
    
        /* a short read means there is no complete 10-byte header */
        if (fread(tag, sizeof(ID3TAG), 1, media) != 1)
        {
            return -1;
        }
    
        if(memcmp ((tag->id), "ID3", 3))
        {
            return -1;
        }
    
        tag_size = unsynchsafe(tag->size);
        printf("tag_size = %u\n", tag_size);
    
        return 0;   
    }
    
    // main function
    int main(void)
    {
        // opens the file in binary mode ("rb"), since MP3 data is not text
        FILE* media = fopen("cognicast-049-carin-meier.mp3", "rb");
    
        //checks if the file exists
        if(media == NULL)
        {
            printf("Couldn't read file\n");
            return -1;
        } 
    
        ID3TAG mp3_tag;
        // check for the format of the file
        if(mp3Header(media, &mp3_tag) != 0)
        {
            printf("Unsupported File Format\n");
            fclose(media);
            return -2;      
        }
        fclose(media);
    
        return 0;
    }
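
    If you would rather not depend on the compiler-specific __attribute__((__packed__)), a possible variant reads the header into a plain 10-byte buffer and picks the fields out by offset. This is a sketch only; read_id3v2_size is a hypothetical name, and it assumes the stream is positioned at the start of the file:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the original program.
 * Reads the 10-byte ID3v2 header from the current position of `media`
 * and stores the decoded tag size in *size_out.
 * Returns 0 on success, -1 on a short read or a missing "ID3" marker. */
int read_id3v2_size(FILE *media, uint32_t *size_out)
{
    uint8_t hdr[10];

    if (fread(hdr, 1, sizeof hdr, media) != sizeof hdr)
        return -1;
    if (memcmp(hdr, "ID3", 3) != 0)
        return -1;

    /* Bytes 6..9 hold the size, most significant byte first,
     * with 7 useful bits per byte. */
    *size_out = ((uint32_t)(hdr[6] & 0x7F) << 21) |
                ((uint32_t)(hdr[7] & 0x7F) << 14) |
                ((uint32_t)(hdr[8] & 0x7F) << 7)  |
                 (uint32_t)(hdr[9] & 0x7F);
    return 0;
}
```

    A caller would open the file with fopen(path, "rb"), pass the FILE* together with the address of a uint32_t, and check the return value before using the size.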
    

    Incidentally, there is already a function that does this conversion, although it comes from POSIX rather than from the C Standard Library. ntohl() is available through the netinet/in.h header file (POSIX also declares it in arpa/inet.h), and it converts a uint32_t number from network byte order (which is big-endian) to host byte order. If your system is big-endian, the function returns the input value unchanged. But if your system is little-endian, the input is converted to a little-endian representation. This is useful for passing data between computers using different byte-ordering conventions. There are also the related functions htonl(), htons(), and ntohs().

    The above code can be changed (for the better) to use ntohl() by replacing my primitive conversion code with:

    #include <netinet/in.h>  // for ntohl()
    ...
    /* convert to host-byte-order (little-endian for x86) */
    in = ntohl(be_in);
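
    For context, here is roughly what the whole conversion could look like with ntohl() in place. This is a sketch that assumes a POSIX system; unsynchsafe_raw is a hypothetical variant that takes the four raw size bytes instead of the struct field, so it can be tried without a file:

```c
#include <arpa/inet.h>  /* ntohl(); POSIX, so not available everywhere */
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of unsynchsafe() that takes the four raw size
 * bytes exactly as they appear in the file. */
uint32_t unsynchsafe_raw(const uint8_t raw[4])
{
    uint32_t be_in, in, out = 0, mask = 0x7F000000u;

    /* Same bit pattern that fread() would have placed in tag->size. */
    memcpy(&be_in, raw, sizeof be_in);

    /* Convert from big-endian (file order) to host byte order. */
    in = ntohl(be_in);

    /* Drop the zero top bit of each byte, packing 4 x 7 bits into 28. */
    while (mask) {
        out >>= 1;
        out |= in & mask;
        mask >>= 8;
    }
    return out;
}
```

    Since ntohl() is a no-op on big-endian hosts, the same source should give the same result on either kind of machine.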