I'm trying to read an MP3 file in C++ and show the ID3 information that the file contains. The problem is that when I read the frame header, the content size it holds is wrong: instead of an integer of 10 bytes, it gives me 167772160 bytes. http://id3.org/id3v2.3.0#ID3v2_frame_overview
struct Header {
    char tag[3];
    char ver;
    char rev;
    char flags;
    uint8_t hSize[4];
};
struct ContentFrame
{
    char id[4];
    uint32_t contentSize;
    char flags[2];
};
int ID3_sync_safe_to_int(uint8_t* sync_safe)
{
    uint32_t byte0 = sync_safe[0];
    uint32_t byte1 = sync_safe[1];
    uint32_t byte2 = sync_safe[2];
    uint32_t byte3 = sync_safe[3];
    return byte0 << 21 | byte1 << 14 | byte2 << 7 | byte3;
}
const int FRAMESIZE = 10;
The code above is used to translate the binary data to ASCII data. Inside main:
Header header;
ContentFrame contentFrame;
ifstream file(argv[1], fstream::binary);
//Read header
file.read((char*)&header, FRAMESIZE);
//This will print out 699 which is the correct filesize
cout << "Size: " << ID3_sync_safe_to_int(header.hSize) << endl << endl;
//Read frame header
file.read((char*)&contentFrame, FRAMESIZE);
//This should print out the frame size.
cout << "Frame size: " << int(contentFrame.contentSize) << endl;
I have written a program for this task in Perl and it works fine; there, unpack is used like this:
my($tag, $ver, $rev, $flags, $size) = unpack("Z3 C C C N", "header");
my($frameID, $FrameContentSize, $frameFlags) = unpack("Z4 N C2", "content");
sync_safe_to_int is also used there to get the header size right, but the content size is simply printed without any conversion.
N An unsigned long (32-bit) in "network" (big-endian) order.
C An unsigned char (octet) value.
Z A null-terminated (ASCIZ) string, will be null padded.
The output from my program:
Header content
Tag: ID3
Ver: 3
Rev: 0
Flags: 0
Size: 699
WRONG Output!
Frame content
ID: TPE1
size: 167772160
Flags:
Correct output from Perl!
Frame content
ID: TPE1
size: 10
Flags: 0
contentFrame.contentSize is defined as uint32_t, but printed as (signed) int.
Also, as the document states, multibyte numbers are big-endian:
The bitorder in ID3v2 is most significant bit first (MSB). The byteorder in multibyte numbers is most significant byte first (e.g. $12345678 would be encoded $12 34 56 78).
No conversion is done for contentFrame.contentSize, however. Those bytes should be reversed too, as in ID3_sync_safe_to_int(), but this time shifted in multiples of 8 instead of 7 (or use ntohl() - network-to-host order).
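As a minimal sketch of the shift-by-8 variant, the four size bytes can be combined by hand. The helper name read_be32 is hypothetical (it is not part of the original code), and it assumes the caller passes a pointer to the four size bytes exactly as they appear in the file:

```cpp
#include <cstdint>

// Hypothetical helper: decode a plain (not sync-safe) 32-bit big-endian
// integer from four bytes. It never reinterprets the bytes as a native
// integer, so the result does not depend on the host byte order.
inline std::uint32_t read_be32(const std::uint8_t* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}
```

For the bytes 00 00 00 0A this yields 10. Read directly into a uint32_t on a little-endian machine, the same bytes are interpreted as 0x0A000000, which is 167772160.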
You say that you get 1677772160 instead of 18, but even with manipulation of the bits/bytes for the above, they don't seem to make sense. Are you sure those are the right numbers? On top of your post you have other values:
Instead of giving me a low integer under 100 bytes it gives me around 140000 bytes.
Did you have a look at the bytes in memory after calling file.read((char*)&contentFrame, FRAMESIZE);? However, if your ID shows TPE1, the position should be OK. I just wonder if the numbers you provided are the correct ones, because they don't make sense.
Update with ntohl() conversion:
//Read frame header
file.read((char*)&contentFrame, FRAMESIZE);
uint32_t frame_size = ntohl(contentFrame.contentSize);
cout << "Frame size: " << frame_size << endl;
ntohl() will work on LE systems and on BE systems (on BE systems it will simply do nothing).
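A self-contained sketch of the ntohl() route, which also avoids depending on how the compiler lays out the struct. It assumes a POSIX system, where ntohl() is declared in <arpa/inet.h> (on Windows it is declared in <winsock2.h>), and the function name frame_size_from_header is made up for illustration:

```cpp
#include <arpa/inet.h>  // ntohl(); POSIX header, an assumption about the platform
#include <cstdint>
#include <cstring>

// Hypothetical helper: raw points at the 10 bytes of an ID3v2.3 frame
// header (4 bytes ID, 4 bytes big-endian size, 2 bytes flags).
inline std::uint32_t frame_size_from_header(const unsigned char* raw)
{
    std::uint32_t be_size;
    // Copy the four size bytes exactly as they are stored in the file.
    std::memcpy(&be_size, raw + 4, sizeof be_size);
    // Convert from big-endian (network) order to the host's order.
    return ntohl(be_size);
}
```

Passing the ten bytes read from the file for the TPE1 frame in the question (size bytes 00 00 00 0A) should give 10 on both little-endian and big-endian hosts.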