I have two audio files that originate from a single recording, which I divided into a "signal" file and a "noise" (background) file. I need to know the dominant frequencies, or the distribution pattern of the frequencies, in that recording so that I can compare sounds emitted by different animals.
I have performed an FFT on each file and then subtracted the background noise from the signal.
I don't care what happens below 20 kHz or above 100 kHz; to me those ranges are noise to be discarded.
Amplitude is something that I cannot control, so each recording must be normalized.
What is the best way to normalize this data and make comparisons between different recordings statistically valid?
function bindel=binset(raw_data_val,signal,noise)
%in case all the recording is only noise
if isempty(signal)
bindel=nan;
return
end
% sampling frequency
% Fs = 250000;
% extract the signal parts and noise parts
% "signal" is an index array of all the elements of the
% "raw data" array that contain a signal
signal_data=raw_data_val(signal);
noise_data=raw_data_val(noise);
% length of the signal array (assumes a column vector)
L = size(signal_data,1);
NFFT = 2^nextpow2(L);
Y1 = fft(signal_data,NFFT)/L;
del1 = smooth(2*abs(Y1(1:NFFT/2+1)));
Y2 = fft(noise_data,NFFT)/L;
del2 = smooth(2*abs(Y2(1:NFFT/2+1)));
del=del1-del2;
% combine the data into 125 bins
binsize = floor(numel(del)/125);
bindel = zeros(1,125);
for j = 1:125
bindel(j) = sum(del((j-1)*binsize+1 : j*binsize));
end
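% discard low frequencies (test filter, subject to change):
% set everything below 20 kHz to zero
bindel(1:20)=0;
% set everything above 100 kHz to zero
% (assuming Fs = 250 kHz, bin 100 covers 99-100 kHz, so it is kept)
bindel(101:end)=0;
% normalize to the range 0 to 1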
norm_bin=(bindel - min(bindel)) / ( max(bindel) - min(bindel) );
bindel=norm_bin;
end
I don't think there is a single best way to normalize spectral data (it depends on the question you are trying to answer), but given that you care about the distribution of dominant frequencies rather than absolute amplitude, I would treat the spectrum as a density and normalize by its sum:
norm_bin = bindel / sum(bindel)
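As a minimal sketch of the sum normalization: the five-element `bindel` below is made-up illustrative data, not output from the question's function. Clipping negative bins is my own assumption, added because subtracting the noise spectrum can leave negative values and a density should be non-negative.

```matlab
% Hypothetical banded spectrum (5 bins for brevity; values are made up).
bindel = [0 4 -1 6 0];

% Assumption: clip negative bins left over from the noise subtraction.
bindel = max(bindel, 0);

total = sum(bindel);
if total > 0
    norm_bin = bindel / total;      % bins now sum to 1
else
    norm_bin = nan(size(bindel));   % no in-band energy to normalize
end
```

After this step each recording's in-band spectrum sums to 1, so recordings with different overall loudness can be compared bin by bin.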
I am assuming that your NFFT is the same for all recordings you compare. If it is not, normalize in a way that takes NFFT into account:
norm_bin = bindel / mean(bindel)
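A small sketch of why the mean can be preferable when the number of spectral points differs. The flat spectra `coarse` and `fine` are hypothetical and only illustrate the scaling; they are not taken from the question's data.

```matlab
% Hypothetical flat spectra with the same shape but different bin counts.
coarse = ones(1, 4);
fine   = ones(1, 8);

% Sum normalization: per-bin values depend on the number of bins.
coarse_sum = coarse / sum(coarse);   % each bin is 1/4
fine_sum   = fine / sum(fine);       % each bin is 1/8

% Mean normalization: per-bin values are 1 regardless of the bin count.
coarse_mean = coarse / mean(coarse);
fine_mean   = fine / mean(fine);
```

Note that if every recording is reduced to the same 125 bins, as in the question's code, `mean(bindel)` is just `sum(bindel)/125`, so the two normalizations differ only by a constant factor.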