I've never been able to understand how audio data is stored. However, I'd like to know a way to find the pitch of PCM data. Let's say, for example, that I recorded a single key being struck on a piano, in 16-bit mono PCM format at a given sample rate. How could I find the frequency, in hertz, of the audio? Simple code to get the average frequency works for me, but a more detailed explanation of how to better understand the format would be ideal.
Thanks!
PCM audio is not stored as a series of pitches. To figure out the pitch, you need a Fast Fourier Transform, or FFT. See https://stackoverflow.com/search?q=pitch+detection; there are dozens of posts about this already.
Think of an audio waveform. PCM encoding is simply sampling that wave a certain number of times per second, and using a specific number of bits per sample.
Image from http://en.wikipedia.org/wiki/Pulse-code_modulation
16-bit mono PCM at 44.1 kHz means that 44,100 times per second, a 16-bit value (2 bytes) is stored that represents the amplitude of the waveform at the instant the sample was taken. A 44.1 kHz sample rate is fast enough to store frequencies up to just under 22.05 kHz, which is half the sample rate (see Nyquist frequency).
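To make the storage format concrete, here is a minimal sketch that builds raw 16-bit mono PCM bytes for a test tone and decodes them back into integer samples. It assumes signed little-endian samples (the usual layout in WAV files, but check your own data), and the names `synth_pcm16` and `decode_pcm16` are made up for this example.

```python
import math
import struct

SAMPLE_RATE = 44100  # samples per second (assumed for this sketch)

def synth_pcm16(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Build raw 16-bit signed little-endian mono PCM bytes for a sine tone."""
    samples = [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_samples)
    ]
    # "<" = little-endian, "h" = signed 16-bit integer, one per sample
    return struct.pack("<%dh" % n_samples, *samples)

def decode_pcm16(raw):
    """Turn raw 16-bit signed little-endian mono PCM bytes into a list of ints."""
    n = len(raw) // 2  # 2 bytes per sample
    return list(struct.unpack("<%dh" % n, raw[: n * 2]))

raw = synth_pcm16(440.0, 441)   # 441 samples = 10 ms at 44.1 kHz
samples = decode_pcm16(raw)
print(len(raw), "bytes ->", len(samples), "samples")
```

Each decoded value is just the height of the waveform at one instant, somewhere between -32768 and 32767. Nothing in the data says "440 Hz"; that has to be computed from how the values change over time.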
An FFT turns those samples from the time domain to the frequency domain. That is, you can find the level of each frequency over a particular period of time. The more bands you look at, the more computationally intensive it is.
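As a rough illustration of that idea, the sketch below runs a small pure-Python radix-2 FFT over a block of samples and reports the frequency of the strongest band. The function names, the Hann window, and the 4096-sample block size are choices made for this example, not the only way to do it. In practice you would normally use an FFT library instead of hand-rolled code.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def dominant_frequency(samples, sample_rate):
    """Return the centre frequency (Hz) of the strongest FFT band."""
    n = len(samples)
    # A Hann window reduces leakage from cutting the signal into a block.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    spectrum = fft(windowed)
    # Only the first half is useful for real input; skip band 0 (DC offset).
    peak = max(range(1, n // 2), key=lambda k: abs(spectrum[k]))
    return peak * sample_rate / n  # each band is sample_rate / n Hz wide

# Test signal: a 440 Hz sine, 4096 samples at 44.1 kHz.
rate = 44100
tone = [math.sin(2 * math.pi * 440.0 * i / rate) for i in range(4096)]
print(dominant_frequency(tone, rate))
```

The answer is only accurate to one band width (`sample_rate / n`, about 10.8 Hz here), so a longer block gives finer resolution at the cost of more computation. Also note that for a real piano recording the strongest band can be a harmonic rather than the fundamental, so the peak is a starting point and not a guaranteed pitch.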