signal-processing speech-recognition telephony audio-processing

How to emulate 8k telephone-channel speech given a 16k microphone speech recording

I need to emulate 8k landline/cellular/VoIP speech audio given a 16k microphone recording of that speech. What are the main stages for emulating it? I've found this torchaudio tutorial on such augmentation, and it has the most detailed instructions I have seen on how to do it.

Finally, I see the following 16k mic -> 8k tel conversion pipeline:

16k -> 8k resampling
Applying RIR (room impulse response to simulate reverberations) [OPTIONAL]
Applying noise [OPTIONAL]
Applying sox compand filter (is it needed? what other parameters might be used?)
Applying codecs (GSM, g72*, SILK, Opus, etc.)

What should be added? Equalization, some special filters, packet loss concealment emulation? Maybe there are existing Matlab scripts or libraries for such augmentation?

Solution

Assuming you have a WAV file, a band-pass filter gives the basic telephone effect:


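import numpy as np
from scipy.io.wavfile import read, write
from scipy.signal import butter, sosfilt

def butter_params(low_freq, high_freq, fs, order=5):
    # Normalize the band edges by the Nyquist frequency.
    nyq = 0.5 * fs
    low = low_freq / nyq
    high = high_freq / nyq
    # Second-order sections are numerically more robust than (b, a) at higher orders.
    sos = butter(order, [low, high], btype='band', output='sos')
    return sos

def butter_bandpass_filter(data, low_freq, high_freq, fs, order=5):
    sos = butter_params(low_freq, high_freq, fs, order=order)
    # axis=0 filters along time for mono (n,) and multi-channel (n, channels) data.
    y = sosfilt(sos, data, axis=0)
    return y

def apply_telephony_effect(f1, f2):
    fs, audio = read(f1)  # assumes 16-bit PCM input
    audio = audio.astype(np.float64)
    low_freq = 300.0
    high_freq = 3000.0
    filtered_signal = butter_bandpass_filter(audio, low_freq, high_freq, fs, order=6)
    # Clip before casting so filter overshoot does not wrap around in int16.
    filtered_signal = np.clip(filtered_signal, -32768, 32767)
    write(f2, fs, filtered_signal.astype(np.int16))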

Then call it on your files:

apply_telephony_effect('input.wav', 'output.wav')

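The output will sound like a telephone call. Note that this only band-limits the signal: the sample rate is unchanged, so the 16k -> 8k resampling and the codec stage from the question still have to be applied separately. The 300-3000 Hz pass band used here is also a little narrower than the nominal 300-3400 Hz telephone voice band, so you may want to raise high_freq.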
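
The resampling, codec and packet-loss stages listed in the question could be sketched roughly as below. This is only a minimal illustration under some assumptions: the helper names (resample_to_8k, mu_law_roundtrip, drop_frames) are made up for this example, the mu-law round trip is a continuous approximation of G.711 companding and not a bit-exact codec, and the frame dropping just zeroes random 20 ms frames with no concealment. Real codecs such as GSM or Opus need an actual encoder/decoder and are not covered here.

```python
import numpy as np
from scipy.signal import resample_poly

def resample_to_8k(audio, fs):
    """Polyphase resampling to 8 kHz (resample_poly applies its own anti-aliasing FIR)."""
    target = 8000
    g = int(np.gcd(int(fs), target))
    up, down = target // g, int(fs) // g
    resampled = resample_poly(np.asarray(audio, dtype=np.float64), up, down, axis=0)
    return resampled, target

def mu_law_roundtrip(audio, mu=255.0, peak=32768.0):
    """Compress with mu-law, quantize to roughly 8 bits, then expand again.

    Approximates the quantization noise of G.711 mu-law; not bit-exact.
    """
    x = np.clip(np.asarray(audio, dtype=np.float64) / peak, -1.0, 1.0)
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # 255 uniform levels in the compressed domain, symmetric around zero.
    quantized = np.round(compressed * 127.0) / 127.0
    expanded = np.sign(quantized) * np.expm1(np.abs(quantized) * np.log1p(mu)) / mu
    return expanded * peak

def drop_frames(audio, fs, loss_rate=0.05, frame_ms=20, seed=0):
    """Zero out random frames to mimic packet loss without concealment."""
    out = np.array(audio, dtype=np.float64, copy=True)
    frame_len = int(fs * frame_ms / 1000)
    n_frames = len(out) // frame_len
    rng = np.random.default_rng(seed)
    lost = rng.random(n_frames) < loss_rate
    for i in np.flatnonzero(lost):
        out[i * frame_len:(i + 1) * frame_len] = 0.0
    return out
```

A possible order is: band-pass filter, resample_to_8k, optional noise, mu_law_roundtrip (or a real codec), then drop_frames. The results are float arrays on the int16 scale, so clip and cast to int16 before writing them to a WAV file.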