I'm writing a program to read and process WAV files for a digital signal processing class project, and I have two test files. I can read the RIFF, fmt, and data chunks properly. Both files have fmt Chunk Size: 16, but File B has this stray chunk of hex between the fmt and data chunks.
I'm certain it's not random data. I speculated that it holds some metadata about the file, so I converted the song title Colors to hex and found that 43 6f 6c 6f 72 73 is within that stray chunk. I feel this is not a coincidence. All the sites I've visited only mention a 2-byte variable that gives the size of extra parameters at the end of the fmt chunk. This can't be the case for File B if both fmt chunks claim to have only 16 bytes.
I'm speculating that there are other chunks present in File B. I haven't found anything about these optional(?) chunks. I need help to know what other sub-chunks I can look for in a WAV file. I simply don't know the tags of the other chunks that can be present in a WAV file.
File A ("i ran so far away.wav") contains this header. I downloaded this file from the Internet.
5249 4646 24c0 c900 5741 5645 666d 7420
1000 0000 0100 0100 2256 0000 44ac 0000
0200 1000 6461 7461 00c0 c900
File B ("Colors.wav") contains this header. This is a file I downloaded from a .mp3 to .wav converter.
5249 4646 7c32 4a02 5741 5645 666d 7420
1000 0000 0100 0200 44ac 0000 10b1 0200
0400 1000 4c49 5354 5000 0000 494e 464f
4941 5254 0500 0000 466c 6f77 0000 494e
414d 0700 0000 436f 6c6f 7273 0000 4950
5244 0f00 0000 436f 6465 2047 6561 7373
204f 5031 0000 4953 4654 0e00 0000 4c61
7666 3537 2e32 362e 3130 3000 6461 7461
0032 4a02
If it's helpful, below is output from the program I wrote.
File A
File Descriptor: RIFF
RIFF Chunk Size: 13221924
File Format: WAVE
fmt Chunk Descriptor: fmt
fmt Chunk Size: 16
Audio Format: 1
Number of Channels: 1
Sampling Rate: 22050
Byte Rate: 44100
Block Align: 2
Bits Per Sample: 16
Data Chunk Descriptor: data
Data Chunk Size: 13221888
File B
File Descriptor: RIFF
RIFF Chunk Size: 38417020
File Format: WAVE
fmt Chunk Descriptor: fmt
fmt Chunk Size: 16
Audio Format: 1
Number of Channels: 2
Sampling Rate: 44100
Byte Rate: 176400
Block Align: 4
Bits Per Sample: 16
Data Chunk Descriptor: data
Data Chunk Size: 38416896
The RIFF file specification allows any chunk ID a program wants, with the caveat that it might conflict with another program if the same chunk ID is used for a different purpose. When writing a program to deal with RIFF files, you are NOT required to understand every chunk type, because that would be impossible. You must, however, write your reader in such a way that it can skip over unrecognized chunk IDs.
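As a rough illustration of skipping unknown chunks, here is a minimal Python sketch. The function names `iter_chunks` and `list_wav_chunks` are made up for this example, and it assumes a seekable binary stream positioned at the start of the file. Each chunk is a 4-byte ID followed by a little-endian 32-bit size, and a chunk with an odd size is followed by one pad byte that the size field does not count.

```python
import struct

def iter_chunks(stream, end):
    """Yield (chunk_id, size, payload_offset) for each chunk before offset `end`."""
    while stream.tell() + 8 <= end:
        chunk_id, size = struct.unpack("<4sI", stream.read(8))
        offset = stream.tell()
        yield chunk_id, size, offset
        # Skip the payload whether or not we understand it, plus the pad
        # byte that follows an odd-sized chunk.
        stream.seek(offset + size + (size & 1))

def list_wav_chunks(stream):
    """Return every top-level chunk of a RIFF/WAVE stream without decoding it."""
    riff, riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    # The RIFF size counts everything after the 8-byte RIFF header.
    return list(iter_chunks(stream, 8 + riff_size))
```

Run against the File B header from the question, this reports three chunks: fmt (16 bytes), LIST (80 bytes), and data (38416896 bytes). A reader that only cares about fmt and data can simply ignore any ID it does not recognize.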
The file you are looking at has a predefined, optional 'INFO' chunk in it (stored as a LIST chunk whose list type is INFO). If you dump the ASCII from the hex you posted, you'll find:
INFO
IART Flow
INAM Colors
IPRD Code Geass OP1
ISFT Lavf57.26.100
This chunk ID is covered on the Wikipedia page for RIFF: https://en.wikipedia.org/wiki/Resource_Interchange_File_Format#Use_of_the_INFO_chunk
or here: http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
and it's also covered in the RIFF specification. Sorry, I don't have a link.
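If you do want to read the tags rather than skip them, the INFO listing above can be reproduced with a short Python sketch. The function name `parse_info_list` is hypothetical, and decoding the text as Latin-1 is an assumption, since the files in the question only show ASCII values. The input is the payload of the LIST chunk, that is, the 80 bytes that start with INFO in File B.

```python
import struct

def parse_info_list(payload):
    """Decode the payload of a LIST chunk of type INFO into a {tag: text} dict."""
    if payload[:4] != b"INFO":
        raise ValueError("LIST chunk is not an INFO list")
    tags = {}
    pos = 4  # sub-chunks start right after the 4-byte list type
    while pos + 8 <= len(payload):
        tag, size = struct.unpack_from("<4sI", payload, pos)
        raw = payload[pos + 8 : pos + 8 + size]
        # Values are NUL-terminated strings; keep only the text before the NUL.
        # Latin-1 is an assumed encoding, not something the file declares.
        tags[tag.decode("ascii")] = raw.split(b"\x00", 1)[0].decode("latin-1")
        # Odd-sized sub-chunks are followed by one uncounted pad byte.
        pos += 8 + size + (size & 1)
    return tags
```

For File B this yields IART = Flow, INAM = Colors, IPRD = Code Geass OP1, and ISFT = Lavf57.26.100. Note the pad byte after the odd-sized IART, INAM, and IPRD values; without it the parser would land one byte off on the next tag.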