My goal is to reduce the audio data size generated by react-native-recording.
The expected result: the audio file size should be between 5,200 and 5,400 bytes per second of audio, matching the Opus voice bit rate of 5.17 KiB/s at an 8 kHz sampling rate.
The actual result:
The audio file size is roughly 3 times the expected result. For example, recording the phrase "A guest for Mr. Jerry" (about 1.6 seconds of audio) produces a data size of roughly 28,000 bytes.
I plan to write a custom native module to achieve this goal. If you like, feel free to leave a link for me to read.
My end goal is to send the audio data through a WebSocket; I have deliberately removed the WebSocket code from the snippet below.
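For reference, a minimal sketch of the kind of sending code I removed (the endpoint URL here is a placeholder, and batching the samples into an Int16Array is just one way to frame the data):

const sendAudio = (samples: number[]) => {
  const ws = new WebSocket('wss://example.com/audio'); // hypothetical endpoint
  ws.onopen = () => {
    // Int16Array matches the 16-bit PCM produced by the recorder.
    const frame = new Int16Array(samples);
    ws.send(frame.buffer); // React Native's WebSocket can send an ArrayBuffer
    ws.close();
  };
};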
Steps to reproduce:
import React from 'react';
import { PermissionsAndroid } from 'react-native';
import Recording from 'react-native-recording';
// RNSaveAudio (a native WAV writer), `player`, and `path` are defined elsewhere.

let audioInt16: number[] = [];
let listener;

React.useEffect(() => {
  // Accumulate raw PCM samples as they arrive from the recorder.
  // eslint-disable-next-line react-hooks/exhaustive-deps
  listener = Recording.addRecordingEventListener((data: number[]) => {
    console.log('record', data.length);
    // eslint-disable-next-line react-hooks/exhaustive-deps
    audioInt16 = audioInt16.concat(data);
  });
  return () => {
    listener.remove();
  };
});

const startAsync = async () => {
  await PermissionsAndroid.requestMultiple([
    PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
  ]);
  Recording.init({
    bufferSize: 4096,
    sampleRate: 8000,
    bitsPerChannel: 16,
    channelsPerFrame: 1,
  });
  Recording.start();
};

const saveAudio = async () => {
  const result = await RNSaveAudio.saveWav(path, audioInt16);
  console.log('save audio', result, path);
};

const playAudio = () => {
  if (player.canPrepare || player.canPlay) {
    player.prepare((err) => {
      if (err) {
        console.log(err);
      }
      console.log('play audio', player.duration);
      player.play();
    });
  }
};
Update:
I just did the calculations: with a sampleRate of 8000 and a mono channel at 16 bits (2 bytes) per sample, that's 16,000 bytes per second, so 1.6 seconds of audio is 25,600 bytes, which is what you're getting.
To get 5,200 to 5,400 bytes per second, you have to work backwards: even at 8 bits (1 byte) per sample, that only leaves you a sample rate of 5,200 to 5,400 Hz to work with.
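To make the arithmetic concrete, here is the byte-rate formula as a tiny helper (nothing here beyond the numbers above):

const pcmBytesPerSecond = (sampleRate: number, channels: number, bitsPerSample: number): number =>
  sampleRate * channels * (bitsPerSample / 8);

pcmBytesPerSecond(8000, 1, 16);        // 16,000 bytes/s — the rate you are seeing
pcmBytesPerSecond(8000, 1, 16) * 1.6;  // 25,600 bytes for 1.6 s of audio
pcmBytesPerSecond(5400, 1, 8);         // 5,400 bytes/s — the target, working backwards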
PCM data is raw audio samples. The only way to squeeze higher-quality audio into fewer bytes is audio compression, into MP3, AAC, Opus, or some other audio codec.
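Since you already plan to write a custom native module, the JavaScript side could stay thin. Everything in this sketch is hypothetical — the module name OpusEncoderModule and its encode method don't exist anywhere; the actual encoding (libopus, MediaCodec, etc.) would live in your native code:

import { NativeModules } from 'react-native';

// Hypothetical native module: encode() is assumed to take raw 16-bit PCM
// and resolve with base64-encoded compressed audio. Adjust to whatever
// interface your module actually exposes.
const { OpusEncoderModule } = NativeModules;

const encodeAndSend = async (samples: number[]): Promise<string> => {
  const opusBase64: string = await OpusEncoderModule.encode(samples, {
    sampleRate: 8000,
    channels: 1,
    bitrate: 42000, // ~5.2 KiB/s, the target from the question
  });
  return opusBase64;
};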
Original reply:
Looks like RNSaveAudio.saveWav has a fixed sampleRate of 44100:
output = new DataOutputStream(new FileOutputStream(path));
// WAVE header
// see http://ccrma.stanford.edu/courses/422/projects/WaveFormat/
writeString(output, "RIFF"); // chunk id
writeInt(output, 36 + data.length); // chunk size
writeString(output, "WAVE"); // format
writeString(output, "fmt "); // subchunk 1 id
writeInt(output, 16); // subchunk 1 size
writeShort(output, (short) 1); // audio format (1 = PCM)
writeShort(output, (short) 1); // number of channels
writeInt(output, 44100); // sample rate
writeInt(output, 44100 * 2); // byte rate
writeShort(output, (short) 2); // block align
writeShort(output, (short) 16); // bits per sample
writeString(output, "data"); // subchunk 2 id
writeInt(output, data.length); // subchunk 2 size
You can simply write your own WAVE file writer based on this example. More on the WAVE format: http://soundfile.sapp.org/doc/WaveFormat/
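If you'd rather build the bytes in JavaScript, here is a minimal sketch of a WAV writer that parametrizes the sample rate instead of hardcoding 44100 (it assumes 16-bit PCM; how you persist the resulting bytes is up to your file-system module of choice):

// Build a complete WAV file (44-byte header + PCM data) as a Uint8Array.
const buildWav = (samples: number[], sampleRate: number, channels = 1): Uint8Array => {
  const bitsPerSample = 16;                        // this sketch assumes 16-bit PCM
  const blockAlign = channels * (bitsPerSample / 8);
  const byteRate = sampleRate * blockAlign;
  const dataSize = samples.length * 2;             // 2 bytes per 16-bit sample
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeString = (offset: number, s: string) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeString(0, 'RIFF');                  // chunk id
  view.setUint32(4, 36 + dataSize, true);  // chunk size (little-endian)
  writeString(8, 'WAVE');                  // format
  writeString(12, 'fmt ');                 // subchunk 1 id
  view.setUint32(16, 16, true);            // subchunk 1 size
  view.setUint16(20, 1, true);             // audio format (1 = PCM)
  view.setUint16(22, channels, true);      // number of channels
  view.setUint32(24, sampleRate, true);    // sample rate — no longer hardcoded
  view.setUint32(28, byteRate, true);      // byte rate
  view.setUint16(32, blockAlign, true);    // block align
  view.setUint16(34, bitsPerSample, true); // bits per sample
  writeString(36, 'data');                 // subchunk 2 id
  view.setUint32(40, dataSize, true);      // subchunk 2 size

  // PCM payload: 16-bit little-endian samples.
  for (let i = 0; i < samples.length; i++) {
    view.setInt16(44 + i * 2, samples[i], true);
  }
  return new Uint8Array(buffer);
};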