Tags: android, encoding, 32-bit, pcm

AudioRecord & AudioTrack do not seem to work with 32-bit encoding


I am currently working on a project where I stream audio from one device to another in real time over WebSockets. I am aiming for a cross-platform implementation that works in browsers, on Android, and on iOS, and my goal is to record and play PCM audio in various formats. The PCM produced by the browser (Chrome & Firefox) is 32-bit encoded, which I am trying to play on an Android phone. Just for reference, here is the project.

On Android I record with AudioRecord and stream the raw PCM over a WebSocket to another device, and similarly I play incoming audio with AudioTrack. Everything works fine with 16-bit encoding at a 44100 Hz sample rate with 2 channels. However, it does not work with 32-bit encoding: the 32-bit recording from the browser does not play, even though I do interleave the channels etc., and likewise when I record on Android with 32-bit encoding, the browser plays nothing.
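For reference, interleaving two planar float channels into the single stereo buffer that PCM playback expects can be sketched in plain Java. This is a minimal sketch; the class and method names are mine, not code from the project:

```java
// Interleave two planar float channels (left, right) into one stereo buffer:
// [L0, R0, L1, R1, ...] — the frame layout stereo PCM playback expects.
public final class Interleave {
    public static float[] interleave(float[] left, float[] right) {
        float[] out = new float[left.length + right.length];
        for (int i = 0; i < left.length; i++) {
            out[2 * i] = left[i];      // even slots: left channel
            out[2 * i + 1] = right[i]; // odd slots: right channel
        }
        return out;
    }

    public static void main(String[] args) {
        float[] l = {0.1f, 0.2f};
        float[] r = {0.3f, 0.4f};
        System.out.println(java.util.Arrays.toString(interleave(l, r)));
        // → [0.1, 0.3, 0.2, 0.4]
    }
}
```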

I tried playing a 32-bit WAV file on Android, and it plays fine. However, I don't know whether the system converts the format down in the background.

My goal is to avoid sample-format conversion and resampling as much as possible, since I am aiming for low latency.

I could not find a solution online. Is this a common issue? Am I missing something here?

With 32-bit encoding, the write method returns AudioTrack.ERROR_INVALID_OPERATION:

int result = audioTrack.write(buffer, 0, buffer.length, AudioTrack.WRITE_BLOCKING);

if (result == AudioTrack.ERROR_BAD_VALUE) {
    System.out.println("ERROR: bad value");
} else if (result == AudioTrack.ERROR_DEAD_OBJECT) {
    System.out.println("ERROR: dead object");
} else if (result == AudioTrack.ERROR_INVALID_OPERATION) {
    System.out.println("ERROR: invalid operation");
} else if (result == AudioTrack.ERROR) {
    System.out.println("ERROR: ??");
} else {
    System.out.println("Successfully written to buffer!");
}

Implementation for recording audio:

public class AudioStream {
    private AudioStreamMetadata metadata = AudioStreamMetadata.getDefault();
    ...

    public void start() {
        ...
        new Thread(() -> {
            socket.send("started");
            socket.send(metadata.toString());

            while (!hasStopped) {
                float[] data = new float[metadata.getBufferSize()];
                recorder.read(data, 0, data.length, AudioRecord.READ_BLOCKING);
                byte[] output = new byte[data.length * metadata.getBytesPerSample()];
                ByteBuffer.wrap(output).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().put(data);
                socket.send(ByteString.of(output));
            }
        }).start();
    }

    private void initRecorder() {
        int min = AudioRecord.getMinBufferSize(metadata.getSampleRate(), metadata.getChannels(true), metadata.getEncoding());
        recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, metadata.getSampleRate(),
                metadata.getChannels(true), metadata.getEncoding(), min);
    }
}
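The float-to-byte packing in the recording loop above can be exercised outside Android, since it only uses java.nio. This standalone sketch (the class name and constant are mine) round-trips a buffer through the same ByteBuffer calls to confirm the packing is lossless:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Pack a float[] into a little-endian byte[] the same way the recording
// loop does, then decode it back to verify the round trip is exact.
public final class PcmPack {
    static final int BYTES_PER_SAMPLE = 4; // 32-bit float PCM

    public static byte[] toBytes(float[] data) {
        byte[] output = new byte[data.length * BYTES_PER_SAMPLE];
        ByteBuffer.wrap(output).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().put(data);
        return output;
    }

    public static float[] toFloats(byte[] bytes) {
        float[] out = new float[bytes.length / BYTES_PER_SAMPLE];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        float[] samples = {0.0f, 0.5f, -0.5f, 1.0f};
        float[] back = toFloats(toBytes(samples));
        // IEEE 754 bits survive the byte round trip unchanged.
        System.out.println(java.util.Arrays.equals(samples, back)); // → true
    }
}
```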

AudioStreamMetadata Class:

public class AudioStreamMetadata {

    public static final int DEFAULT_SAMPLE_RATE = 44100;
    public static final int DEFAULT_CHANNELS = 2;
    public static final int DEFAULT_ENCODING = 32;
    public static final int DEFAULT_BUFFER_SIZE = 6144*4;
    ...

    public AudioStreamMetadata(int sampleRate, int bufferSize, int channels, int encoding) {
        this.sampleRate = sampleRate;
        this.bufferSize = bufferSize;
        this.channels = channels;
        this.encoding = encoding;
        this.bytesPerSample = encoding / 8;
        this.bufferSizeInBytes = bufferSize * bytesPerSample;
    }

    //getters

    public int getChannels(boolean in) {
        if(channels == 1){
            return in? AudioFormat.CHANNEL_IN_MONO : AudioFormat.CHANNEL_OUT_MONO;
        }else if(channels == 2){
            return in? AudioFormat.CHANNEL_IN_STEREO : AudioFormat.CHANNEL_OUT_STEREO;
        }else{
            return 0;
        }
    }

    public int getEncoding() {
        if(encoding == 8){
            return AudioFormat.ENCODING_PCM_8BIT;
        }else if(encoding == 16){
            return AudioFormat.ENCODING_PCM_16BIT;
        }else if(encoding == 32){
            return AudioFormat.ENCODING_PCM_FLOAT;
        }else{
            return 0;
        }
    }

    public static AudioStreamMetadata getDefault(){
        return new AudioStreamMetadata(DEFAULT_SAMPLE_RATE, DEFAULT_BUFFER_SIZE, DEFAULT_CHANNELS, DEFAULT_ENCODING);
    }
}

Solution

  • I assumed that AudioTrack could handle different data types in write() because I initialize it with the matching configuration. However, the array type that write() accepts depends on the encoding the AudioTrack was created with: an AudioTrack initialized with 8-bit encoding accepts only byte[], one with 16-bit encoding accepts both byte[] and short[], but one with 32-bit float encoding accepts only float[]. Since I receive the data from the socket as a byte[], I needed to convert it to a float[] first.

    @Override
    public void onMessage(WebSocket webSocket, ByteString bytes) {
        super.onMessage(webSocket, bytes);

        byte[] buffer = bytes.toByteArray();
        // Decode with the same byte order the sender used (little-endian here);
        // ByteBuffer defaults to big-endian otherwise.
        FloatBuffer fb = ByteBuffer.wrap(buffer).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer();
        float[] out = new float[fb.capacity()];
        fb.get(out);

        int result = audioTrack.write(out, 0, out.length, AudioTrack.WRITE_BLOCKING);
    }
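One pitfall worth calling out: decoding with the wrong byte order does not fail, it silently corrupts every sample. This small plain-Java sketch (class name mine) shows the same four bytes decoding to different floats under little-endian versus the big-endian default:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Demonstrates why the receiver's byte order must match the sender's:
// the same four bytes decode to different floats under LE vs. BE.
public final class ByteOrderPitfall {
    public static void main(String[] args) {
        byte[] bytes = new byte[4];
        // Encode 1.0f (bits 0x3F800000) as little-endian: 00 00 80 3F.
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().put(1.0f);

        // Correct: decode with the sender's byte order.
        float le = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getFloat();
        // Wrong: ByteBuffer defaults to big-endian, so 0x0000803F is read
        // instead — a tiny subnormal value, not 1.0.
        float be = ByteBuffer.wrap(bytes).getFloat();

        System.out.println(le);          // → 1.0
        System.out.println(be != 1.0f);  // → true (misdecoded)
    }
}
```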