Tags: java, arrays, byte, mp3, decoding

How can I make an array of frequencies accurately depict a decoded mp3 file?


I am using mp3spi and Tritonus, and this code only needs to handle 192 kbps mp3 files. The problem I am facing is that the first second of decoded values consists almost entirely of:

0,0,0,0 or 255,255,255,255

I suspect I am not skipping the header correctly, in which case the values are not a true depiction of the mp3 at that specific millisecond. Does anyone see anything wrong with the way I'm skipping the header, or with how I'm adding the bytes to the array?

In other words, I want the array at position [0] to correspond to the mp3 at 00:00:00, and the array at position [44100] to correspond to the song at exactly 1 second in.

This is the code I use to read the bytes from the mp3 file and add them to the ArrayList bytes.

import javax.sound.sampled.*;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;


public class ReadMP3 {


private ArrayList<Integer> bytes = new ArrayList<>();
private AudioFormat decodedFormat;

public ReadMP3() throws UnsupportedAudioFileException, IOException {

    String filename = new ReadFiles().getFile();
    File file = new File(filename);
    AudioInputStream in = AudioSystem.getAudioInputStream(file);
    AudioInputStream din = null;
    AudioFormat baseFormat = in.getFormat();
    AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
            baseFormat.getSampleRate(),
            16,
            baseFormat.getChannels(),
            baseFormat.getChannels() * 2,
            baseFormat.getSampleRate(),
            false);
    din = AudioSystem.getAudioInputStream(decodedFormat, in);
    this.decodedFormat = decodedFormat;

    int i = 0;
    while(true){
        int currentByte = din.read();
        if (currentByte == -1) {break;}
        bytes.add(i, currentByte);
        i++;
    }
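    din.close();
    in.close();
}

//accessors used by AnalyzeMP3
public ArrayList<Integer> getBytes() {
    return bytes;
}

public AudioFormat getDecodedFormat() {
    return decodedFormat;
}

}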
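
As a side note on the read loop above: reading one byte at a time into an ArrayList<Integer> boxes every byte, and the AudioInputStream documentation describes the single-byte read() as intended for a frame size of one byte. A minimal sketch of a buffered alternative follows. The class name StreamReadSketch and the 4096-byte buffer size are illustrative choices, not part of the original code, and the method works on any InputStream, including the decoded stream din.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the original code.
final class StreamReadSketch {

    /** Reads the whole stream into a byte array using a fixed-size buffer. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // 4096 is a multiple of the 4-byte frame size (16-bit stereo).
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

Note that a Java byte is signed (-128 to 127), so `data[i] & 0xFF` is needed to get the 0 to 255 values shown in the question.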

This is the second part of my code, where I put 4 bytes into each index of the array, so that array.length / 44100 is equal to the length of the song in seconds. This means each array[i] holds the 4 bytes for one position in the song (what I call 1hz), and array[0] up to array[44100] is the first second of the song.

public class AnalyzeMP3 {


//adds 4 bytes to offset[i], where each i represents 1hz, 
//and 44100hz=1sec

public static int[][] calculate(ReadMP3 mp3) {

    //calculates and prints how long the song is
    double seconds = mp3.getBytes().size() /
            mp3.getDecodedFormat().getFrameRate() / 4;
    System.out.println("Length of song: " + (int)seconds + "s");

    //adds 4 values to i through the whole song
    int[][] offset  = new int[mp3.getBytes().size()/4][4];
    for(int i = 0; i < mp3.getBytes().size()/4; i++) {
        for(int j = 0; j < 4; j++) {
            offset[i][j] = mp3.getBytes().get(i+j);
        }
    }

    return offset;
}

}

Solution

  • Thanks Brad and VC.One for making me realize my own mistakes. To begin with I had to add the correct values to the PCM-signed encoding like this:

    AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                (float)44.1,       //samplerate
                16,                //sampleSizeInBits
                2,                 //channels
                626,               //frameSize
                (float)38.4615385, //frameRate
                false);            //bigEndian
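
    For comparison, uncompressed 16-bit PCM is conventionally described with the sample rate in Hz, a frame size of channels × 2 bytes, and a frame rate equal to the sample rate. The sketch below builds such a format, assuming a 44.1 kHz stereo source. The class and method names are hypothetical, and whether mp3spi treats this format differently from the one above has not been checked here.

```java
import javax.sound.sampled.AudioFormat;

// Hypothetical helper, shown only to illustrate the conventional PCM parameters.
final class PcmFormatSketch {

    /** Builds a signed 16-bit little-endian stereo PCM format for the given sample rate in Hz. */
    static AudioFormat stereo16(float sampleRateHz) {
        int channels = 2;
        int bytesPerSample = 2; // 16 bits
        return new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                sampleRateHz,              // sample rate in Hz, e.g. 44100
                16,                        // sample size in bits
                channels,                  // channels
                channels * bytesPerSample, // frame size in bytes: 4
                sampleRateHz,              // frame rate: one frame per sample period
                false);                    // little-endian
    }
}
```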
    

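    Then I needed to accurately represent the 2 channels in an array. How I did it above in the class AnalyzeMP3 is wrong: indexing with get(i+j) makes consecutive rows overlap by three bytes and only ever reaches the first quarter of the data. Each byte should be read exactly once, like this: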

        //adds 4 values to i through the whole song
        int[][] offset  = new int[mp3.getBytes().size()/4][4];
        int counter = 0;
        for(int i = 0; i < mp3.getBytes().size()/4; i++) {
            for(int j = 0; j < 4; j++) {
                offset[i][j] = mp3.getBytes().get(counter);
                counter++;
            }
        }
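
    Each row of 4 bytes can then be turned into amplitudes. The sketch below assumes the usual layout for signed 16-bit little-endian stereo PCM (left low byte, left high byte, right low byte, right high byte); the class and method names are hypothetical. Under that assumption 0,0 decodes to 0 and 255,255 decodes to -1, so rows of 0,0,0,0 or 255,255,255,255 at the start of a song would simply be near-silence rather than header bytes.

```java
// Hypothetical helper: turns the 0..255 byte values of one row into signed samples.
final class PcmFrameSketch {

    /** Combines a low and a high byte (each 0..255) into a signed 16-bit little-endian sample. */
    static short toSample(int low, int high) {
        return (short) ((high << 8) | (low & 0xFF));
    }

    /** Converts one 4-byte stereo frame {L-low, L-high, R-low, R-high} into {left, right}. */
    static short[] toStereo(int[] frame) {
        return new short[] {
            toSample(frame[0], frame[1]),
            toSample(frame[2], frame[3])
        };
    }
}
```

    For example, PcmFrameSketch.toStereo(offset[i]) would give the left and right amplitude for row i.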
    

    After making these changes the array has 4351104 entries, and 4351104 / 44100 is equal to the song length in seconds (about 98.7 s). There is no header or anything else I have to skip: the array is now an accurate representation of the whole song, with 44100 entries (one 4-byte stereo frame each) per second. It can easily be sliced to represent 10ms as 441 entries, etc.
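
    The arithmetic above can be sketched as a pair of helpers that map between time and array index. The names are hypothetical, and the sample rate is assumed to be an integer number of frames per second, such as 44100.

```java
// Hypothetical helper for converting between playback time and array index.
final class FrameTimeSketch {

    /** Index of the frame that starts at the given time in milliseconds. */
    static int frameIndexAtMillis(long millis, int sampleRate) {
        return (int) (millis * sampleRate / 1000);
    }

    /** Song length in seconds for a given number of frames. */
    static double durationSeconds(long frameCount, int sampleRate) {
        return frameCount / (double) sampleRate;
    }
}
```

    With 44100 frames per second, frameIndexAtMillis(1000, 44100) is 44100 and frameIndexAtMillis(10, 44100) is 441, matching the figures above.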