My main goal is to stream audio from one device to another device on the LAN. I plan to do this by reading the mp3 file into a byte[] (which I already have working) and sending it as a UDP packet to the second device to play it there (I'm mentioning this in case it is already the wrong approach). I'm currently stuck on playing my byte arrays. I read my file with the decode(path, startMs, maxMs) function shown below.
At the moment I am able to hear the audio, but after every tick (the portions in which I read the file) I hear nothing for a few ms, which makes for a bad listening experience. I thought this had to do with the buffer size and tried playing around with it, but that didn't really change anything; neither did adding AudioTrack.WRITE_NON_BLOCKING. I also thought about putting every for() loop in a different thread, but that doesn't work at all (which makes sense). I also tried reading the whole file first and putting the byte[]s into an ArrayList, in case the problem was caused by slow file reading, but the experience is still the same. It might also help to know that Log.e("DEBUG", "Length " + data.length); is only printed once per tick, which means writing also only happens once per tick (which probably is the issue). How can I get rid of these empty parts in my song?
Here is the code that runs when the button is clicked:
song.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        Thread thrd = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    int tick = 1000;
                    int max = 9000;
                    int sampleRate = 44100;
                    int bufSize = AudioTrack.getMinBufferSize(sampleRate * 4,
                            AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);
                    byte[] data = decode(path, 0, tick);
                    AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC,
                            44100, AudioFormat.CHANNEL_OUT_STEREO,
                            AudioFormat.ENCODING_PCM_16BIT, bufSize,
                            AudioTrack.MODE_STREAM, AudioTrack.WRITE_NON_BLOCKING);
                    track.play();
                    track.write(data, 0, data.length);
                    Log.e("DEBUG", "Length " + data.length);
                    for (int i = tick; i < max; i += tick) {
                        data = decode(path, i, tick);
                        track.write(data, 0, data.length);
                        Log.e("DEBUG", "Length " + data.length);
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        });
        thrd.start();
    }
});
My decode() function (based on this tutorial), using JLayer 1.0.1:
public static byte[] decode(String path, int startMs, int maxMs)
        throws IOException {
    ByteArrayOutputStream outStream = new ByteArrayOutputStream(1024);
    float totalMs = 0;
    boolean seeking = true;
    File file = new File(path);
    InputStream inputStream = new BufferedInputStream(new FileInputStream(file), 8 * 1024);
    try {
        Bitstream bitstream = new Bitstream(inputStream);
        Decoder decoder = new Decoder();
        boolean done = false;
        while (!done) {
            Header frameHeader = bitstream.readFrame();
            if (frameHeader == null) {
                done = true;
            } else {
                totalMs += frameHeader.ms_per_frame();
                if (totalMs >= startMs) {
                    seeking = false;
                }
                if (!seeking) {
                    SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
                    if (output.getSampleFrequency() != 44100
                            || output.getChannelCount() != 2) {
                        Log.w("ERROR", "mono or non-44100 MP3 not supported");
                    }
                    short[] pcm = output.getBuffer();
                    for (short s : pcm) {
                        outStream.write(s & 0xff);
                        outStream.write((s >> 8) & 0xff);
                    }
                }
                if (totalMs >= (startMs + maxMs)) {
                    done = true;
                }
            }
            bitstream.closeFrame();
        }
    } catch (BitstreamException e) {
        throw new IOException("Bitstream error: " + e);
    } catch (DecoderException e) {
        Log.w("ERROR", "Decoder error", e);
    } finally {
        inputStream.close();
    }
    return outStream.toByteArray();
}
I don't think the decode() function is the problem, as the byte[] it returns looks fine. Maybe the reading process could still be optimized, though: later, when I actually stream the audio and read chunks of only about 10 ms, opening and closing the file for every chunk might become an issue.
The root cause of this turned out to be that you were using the decode() function in a way for which it wasn't specifically designed. Even though it appears decode() will let you decode any portion of the .mp3 stream in a random-access way, in practice the first few ms of the returned audio are always silence, whether you start at the beginning of the song or in the middle. This silence was causing the "gaps". Apparently the decode() function was intended more for re-starting playback at a random location, for instance after a user "seek".
decode() behaves that way because, in order to decode the Nth block of compressed data, the decoder needs both block N-1 and block N. The decompressed data that corresponds to block N will be good, but the data for block N-1 will have this "fade in" sound. This is a general feature of .mp3 decoders, and I know it happens for AAC as well. Meanwhile, decoding blocks N+1, N+2, N+3, and so on is no problem, because in each case the decoder already has the previous block.
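(Purely to illustrate that point, and not the fix I actually suggest below: if you really wanted to keep random-access seeking with the original decode(), you could ask it for a slightly earlier start and throw away the warm-up samples that contain the fade-in. This is only a sketch; the 50 ms pre-roll is an arbitrary guess, and it assumes the 44.1 kHz, 16-bit stereo PCM that the decode() above produces.)
// Hypothetical pre-roll sketch, not the fix used below.
// Assumes the same path/startMs/maxMs values you would have passed to decode().
int preRollMs = 50;                               // arbitrary warm-up length to discard
int from = Math.max(0, startMs - preRollMs);      // start a bit earlier than requested
byte[] withWarmup = decode(path, from, maxMs + (startMs - from));
int bytesPerMs = 44100 * 2 * 2 / 1000;            // 44.1 kHz * 2 channels * 2 bytes per sample
int skip = Math.min(withWarmup.length, (startMs - from) * bytesPerMs);
skip -= skip % 4;                                 // keep whole 16-bit stereo frames
byte[] clean = java.util.Arrays.copyOfRange(withWarmup, skip, withWarmup.length);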
One solution is to change the decode() function:
private Decoder decoder;
private float totalMs;
private Bitstream bitstream;
private InputStream inputStream;

// Call this once, when it is time to start a new song:
private void startNewSong(String path) throws IOException {
    decoder = new Decoder();
    totalMs = 0;
    File file = new File(path);
    inputStream = new BufferedInputStream(new FileInputStream(file), 8 * 1024);
    bitstream = new Bitstream(inputStream);
}

private byte[] decode(String path, int startMs, int maxMs)
        throws IOException {
    ByteArrayOutputStream outStream = new ByteArrayOutputStream(1024);
    try {
        boolean done = false;
        while (!done) {
            Header frameHeader = bitstream.readFrame();
            if (frameHeader == null) {
                done = true;
                inputStream.close(); // Note this change. Now the song is done. You can also clean up the decoder here.
            } else {
                totalMs += frameHeader.ms_per_frame();
                SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
                if (output.getSampleFrequency() != 44100
                        || output.getChannelCount() != 2) {
                    Log.w("ERROR", "mono or non-44100 MP3 not supported");
                }
                short[] pcm = output.getBuffer();
                for (short s : pcm) {
                    outStream.write(s & 0xff);
                    outStream.write((s >> 8) & 0xff);
                }
                if (totalMs >= (startMs + maxMs)) {
                    done = true;
                }
            }
            bitstream.closeFrame();
        }
    } catch (BitstreamException e) {
        throw new IOException("Bitstream error: " + e);
    } catch (DecoderException e) {
        Log.w("ERROR", "Decoder error", e);
    }
    return outStream.toByteArray();
}
That code is a bit rough and ready, and it could use some improvement, but the general approach is this: instead of random access, decode() works like an FSM, decoding a little more of the song on each call; it reads a little more of the file and sends a few more chunks to the decoder. Because the decoder (and bitstream) state is preserved between calls to decode(), there is no need to go hunting for block N-1.
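To make the calling side concrete, here is a usage sketch of my own (not part of the code above): it assumes startNewSong() and the modified decode() live in the same class, reuses the tick and max values from your question, and sets the AudioTrack up for 44.1 kHz, 16-bit stereo PCM.
// Hypothetical usage sketch of startNewSong()/decode() feeding an AudioTrack.
int tick = 1000;   // decode roughly one second per call, as in the question
int max = 9000;
int bufSize = AudioTrack.getMinBufferSize(44100,
        AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
        AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT,
        bufSize, AudioTrack.MODE_STREAM);
startNewSong(path);                       // open the file, create one Decoder/Bitstream
track.play();
for (int i = 0; i < max; i += tick) {
    byte[] data = decode(path, i, tick);  // picks up where the previous call stopped
    if (data.length == 0) break;          // nothing left to decode
    track.write(data, 0, data.length);    // blocking write keeps the track fed
}
track.stop();
track.release();
Because the decoder state carries over between calls, each chunk comes back without the leading silence, so the writes butt up against each other and the gaps disappear.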
UDP and Streaming
The validity of your UDP approach depends on a lot of things; you may want to look for other questions that address that in particular. UDP is convenient for broadcasting to multiple devices on a given subnet, but it will not ensure that packets arrive in order, or at all; you may want TCP instead. Also consider whether you want to transmit the encoded .mp3 frames (those returned by bitstream.readFrame()) or blocks of decompressed audio, and think about how you will deal with network latency, dropped connections, and buffering. There are many difficult design decisions to make here, and each choice has pros and cons. Good luck.
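Purely as an illustration of the TCP option (none of this is from the answer above; the port number, the length-prefix framing, and the receiverIp, songDurationMs, and track variables are placeholders), sending the decompressed chunks to the second device might look roughly like this:
// Hypothetical sketch. Needs java.net.Socket/ServerSocket and
// java.io.DataInputStream/DataOutputStream.

// Sender side: length-prefix each decoded PCM chunk and push it over TCP.
try (Socket socket = new Socket(receiverIp, 5555);
     DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
    startNewSong(path);
    for (int ms = 0; ms < songDurationMs; ms += 1000) {
        byte[] chunk = decode(path, ms, 1000);
        out.writeInt(chunk.length);       // length prefix so the receiver knows the chunk size
        out.write(chunk);
    }
    out.writeInt(0);                      // zero length = end of stream
}

// Receiver side: read each chunk and hand it to an AudioTrack configured
// exactly like the one used for local playback (44.1 kHz, 16-bit stereo).
try (ServerSocket server = new ServerSocket(5555);
     Socket client = server.accept();
     DataInputStream in = new DataInputStream(client.getInputStream())) {
    track.play();
    int len;
    while ((len = in.readInt()) > 0) {
        byte[] chunk = new byte[len];
        in.readFully(chunk);
        track.write(chunk, 0, chunk.length);  // blocking write paces playback
    }
}
This keeps TCP's in-order, reliable delivery and avoids the packet-loss handling UDP would force on you, at the cost of losing easy one-to-many broadcast.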