Search code examples
octavefftgnusmoothing

GNU Octave: 1/N Octave Smoothing of actual FFT Data (not the representation of it)


I would like to smooth an Impulse Response audio file. The FFT of the file shows that it is very spikey. I would like to smooth out the audio file, not just its plot, so that I have a smoother IR file. I have found a function that shows the FFT plot smoothed out. How could this smoothing be applied to the actual FFT data and not just to the plot of it?

[y,Fs] = audioread('test\test IR.wav');

function x_oct = smoothSpectrum(X,f,Noct)
%SMOOTHSPECTRUM Apply 1/N-octave smoothing to a frequency spectrum
    %% Input checking
    assert(isvector(X), 'smoothSpectrum:invalidX', 'X must be a vector.');
    assert(isvector(f), 'smoothSpectrum:invalidF', 'F must be a vector.');
    assert(isscalar(Noct), 'smoothSpectrum:invalidNoct', 'NOCT must be a scalar.');
    assert(isreal(X), 'smoothSpectrum:invalidX', 'X must be real.');
    assert(all(f>=0), 'smoothSpectrum:invalidF', 'F must contain positive values.');
    assert(Noct>=0, 'smoothSpectrum:invalidNoct', 'NOCT must be greater than or equal to 0.');
    assert(isequal(size(X),size(f)), 'smoothSpectrum:invalidInput', 'X and F must be the same size.');

    %% Smoothing

    % calculates a Gaussian function for each frequency, deriving a
    % bandwidth for that frequency

    x_oct = X; % initial spectrum
    if Noct > 0 % don't bother if no smoothing
        for i = find(f>0,1,'first'):length(f)
            g = gauss_f(f,f(i),Noct);
            x_oct(i) = sum(g.*X); % calculate smoothed spectral coefficient
        end
        % remove undershoot when X is positive
        if all(X>=0)
            x_oct(x_oct<0) = 0;
        end
    end
endfunction

function g = gauss_f(f_x,F,Noct)
% GAUSS_F calculate frequency-domain Gaussian with unity gain
% 
%   G = GAUSS_F(F_X,F,NOCT) calculates a frequency-domain Gaussian function
%   for frequencies F_X, with centre frequency F and bandwidth F/NOCT.

    sigma = (F/Noct)/pi; % standard deviation
    g = exp(-(((f_x-F).^2)./(2.*(sigma^2)))); % Gaussian
    g = g./sum(g); % normalise magnitude

endfunction

% take fft
Y = fft(y);
% keep only meaningful frequencies
NFFT = length(y);
if mod(NFFT,2)==0
    Nout = (NFFT/2)+1;
else
    Nout = (NFFT+1)/2;
end
Y = Y(1:Nout);
f = ((0:Nout-1)'./NFFT).*Fs;
% put into dB
Y = 20*log10(abs(Y)./NFFT);
% smooth
Noct = 12;
Z = smoothSpectrum(Y,f,Noct);
% plot
semilogx(f,Y,'LineWidth',0.7,f,Z,'LineWidth',2.2);
xlim([20,20000])
grid on

PS. I have Octave GNU, so I don't have the functions that are available with Matlab Toolboxes.

Here is the test IR audio file.

FFT with 1/12 Octave Smoothing super-imposed


Solution

  • I think I found it. Since the FFT of the audio file (which is real numbers) is symmetric, with the same real part on both sides but opposite imaginary part, I thought of doing this:

    • take the FFT, keep the half of it, and apply the smoothing function without converting the magnitudes to dB
    • then make a copy of that smoothed FFT, and invert just the imaginary part
    • combine the two parts so that I have the same symmetric FFT as I had in the beginning, but now it is smoothed
    • apply inverse FFT to this and take the real part and write it to file.

    Here is the code:

    [y,Fs] = audioread('test IR.wav');
    
    function x_oct = smoothSpectrum(X,f,Noct)
        x_oct = X; % initial spectrum
        if Noct > 0 % don't bother if no smoothing
            for i = find(f>0,1,'first'):length(f)
                g = gauss_f(f,f(i),Noct);
                x_oct(i) = sum(g.*X); % calculate smoothed spectral coefficient
            end
            % remove undershoot when X is positive
            if all(X>=0)
                x_oct(x_oct<0) = 0;
            end
        end
    endfunction
    
    function g = gauss_f(f_x,F,Noct)
        sigma = (F/Noct)/pi; % standard deviation
        g = exp(-(((f_x-F).^2)./(2.*(sigma^2)))); % Gaussian
        g = g./sum(g); % normalise magnitude
    endfunction
    
    % take fft
    Y = fft(y);
    
    % keep only meaningful frequencies
    NFFT = length(y);
    if mod(NFFT,2)==0
        Nout = (NFFT/2)+1;
    else
        Nout = (NFFT+1)/2;
    end
    Y = Y(1:Nout);
    f = ((0:Nout-1)'./NFFT).*Fs;
    
    % smooth
    Noct = 12;
    Z = smoothSpectrum(Y,f,Noct);
    
    % plot
    semilogx(f,Y,'LineWidth',0.7,f,Z,'LineWidth',2.2);
    xlim([20,20000])
    grid on
    
    #Apply the smoothing to the actual data
    Zreal = real(Z); # real part
    Zimag_neg = Zreal - Z; # opposite of imaginary part
    Zneg = Zreal + Zimag_neg; # will be used for the symmetric Z
    # Z + its symmetry with same real part but opposite imaginary part
    reconstructed = [Z ; Zneg(end-1:-1:2)];
    # Take the real part of the inverse FFT
    reconstructed = real(ifft(reconstructed));
    
    #Write to file
    audiowrite ('smoothIR.wav', reconstructed, Fs, 'BitsPerSample', 24);
    

    Seems to work! :) It would be nice if someone more knowledgeable could confirm that the thinking and code are good :)