Background: I am using JLayer to play an MP3 file. I am attempting to analyze the varying amplitude/audio levels in the MP3. With my analysis, I would like to determine the duration of the silence at the beginning and end of the MP3. In addition, as the MP3 is being played, I would like a graph to display the audio level (like a visual soundwave).
Problem: For effective analysis, I need to be able to analyze raw PCM data. Currently, I am analyzing the byte[] retrieved through AudioInputStream and sent to SourceDataLine. PCM is short[] not byte[], which means I am not getting the full data.
I am using Root-Mean Square (RMS) to determine volume levels.
The playback code where the byte[] is processed:
    AudioInputStream in = null;
    AudioFile af = null; //Custom class which holds some data about mp3.
    SourceDataLine line = null;
    // Set current audio file.
    af = musicPlaylist.get(0);
    line = (SourceDataLine) AudioSystem.getLine(af.getLineInfo());
    line.open(af.getAudioFormat());
    line.start();
    in = getAudioInputStream(af.getAudioFormat(), af.getAudioStream());
    int bR = playbackBufferSize;
    final byte[] buffer = new byte[bR];
    int n = 0;
    while (playMedia) {
        if ((n = in.read(buffer, 0, buffer.length)) == -1) {
            break;
        }
        if (line != null) {
            line.write(buffer, 0, n);
            int amp = (int) Math
                    .ceil((rmsAudioLevel(decode(buffer)) / 32767) * 100);
            mainScreen.setAmpDisplayLevel(amp, String.valueOf(amp));
            mainScreen.updateGraph(amp);
        }
    }
Essentially: How do I decode the PCM data on-the-spot as I play the MP3, so that I may show volume levels and therefore detect silence?
First off, you ARE getting all the PCM data in buffer[], but you probably have to assemble the bytes into PCM samples. Your audio format will tell you how many bits per sample are being used. The most common is 16-bit, but sometimes 24- or 32-bit data shows up. With 16-bit data, you combine two contiguous bytes to build a short. The order of the two bytes depends on whether the format is little-endian or big-endian. In the "Related" column on the right of this page there is a link, "how to get PCM data from a wav file"; that link or a similar one should give you an example of the code you will need.
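The byte assembly described above can be sketched roughly as follows. This is a minimal illustration, assuming 16-bit signed PCM; the class name `PcmDecoder` and the method `toSamples` are hypothetical names chosen for the example and are not part of JLayer or the Java Sound API.

```java
public class PcmDecoder {
    /**
     * Converts the first n bytes of a buffer into 16-bit samples.
     * Assumption: the data is 16-bit signed PCM; other sample sizes
     * (8-, 24-, 32-bit) need different handling.
     */
    static short[] toSamples(byte[] buffer, int n, boolean bigEndian) {
        short[] samples = new short[n / 2]; // a trailing odd byte is ignored
        for (int i = 0; i < samples.length; i++) {
            byte first = buffer[2 * i];
            byte second = buffer[2 * i + 1];
            // Pick which byte is the high-order one based on endianness.
            int hi = bigEndian ? first : second;
            int lo = bigEndian ? second : first;
            // Mask the low byte so its sign bit does not leak into the high bits.
            samples[i] = (short) ((hi << 8) | (lo & 0xFF));
        }
        return samples;
    }
}
```

In the playback loop from the question, the endianness flag could presumably come from `af.getAudioFormat().isBigEndian()`, and `n` (the number of bytes actually read) should be passed rather than `buffer.length`, since the last read may not fill the buffer. For stereo data the resulting samples are interleaved, left then right.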
Second issue: I don't think doing RMS on separate buffer[] arrays is exactly correct, though I could be wrong on this. I'm thinking it's more like a moving average, where the calculation at the beginning of one buffer[] should include some of the data from the end of the previous buffer[]. Does the formula require that you "go back" or "average over" N frames? If so, you will want to keep the previous buffer[] handy for situations where the N frames span two buffers. And you will be iterating through the current buffer[] one "frame" at a time (or handing buffer[] to a subroutine that in effect does this).
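One possible way to get the moving-average behaviour is to keep the last N samples in a small circular buffer that outlives each read, instead of holding on to the whole previous buffer[]. The sketch below is an assumption about how this could be done, not the only approach; `SlidingRms` and its members are hypothetical names, and it treats the input as a single stream of 16-bit samples.

```java
public class SlidingRms {
    private final short[] window; // circular buffer with the most recent samples
    private long sumSquares;      // exact running sum of squares over the window
    private int pos;              // slot that the next sample will overwrite
    private int count;            // number of samples seen, capped at window size

    public SlidingRms(int windowSize) {
        window = new short[windowSize];
    }

    /** Adds one sample and returns the RMS over the samples currently in the window. */
    public double add(short sample) {
        long old = window[pos]; // zero until the window has filled up
        sumSquares += (long) sample * sample - old * old;
        window[pos] = sample;
        pos = (pos + 1) % window.length;
        if (count < window.length) {
            count++;
        }
        return Math.sqrt((double) sumSquares / count);
    }
}
```

Because the object keeps its state between calls, a window that starts near the end of one buffer[] carries over naturally into the next one. Dividing the returned value by 32767 using floating-point arithmetic gives a level between 0 and 1, which can then be scaled for display or compared with a silence threshold.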