I am only getting 735 frequencies of audio data for each 0.03 seconds.
I am trying to figure out how I can get the frequency spectrum of each frame. The code below only returns 735 values per frame (because samples_in_frame
is 735, the number of samples in each frame), but I want the whole ~20,000 Hz range for 0.03 seconds of samples. How would I go about doing this?
path: path to the wave file
second_start: start point to process, in seconds (here 0)
second_length: how long to process, in seconds (here 10)
frame_rate: how many frames in each second (here 60)
use rustfft::{num_complex::Complex, FftPlanner};
// `Wave64` must also be in scope; its import is not shown in the original code.

// Runs a forward FFT over one frame and returns the magnitude of every bin.
fn fft_this(mut buffer: Vec<Complex<f64>>, samples_in_frame: usize) -> Vec<f64> {
    let mut planner = FftPlanner::new();
    let fft = planner.plan_fft_forward(samples_in_frame);
    fft.process(&mut buffer[..]);
    // One magnitude per bin, so the output has as many entries as the frame has samples.
    buffer[..samples_in_frame].iter().map(|bin| bin.norm()).collect()
}

#[tauri::command]
pub fn analyze(
    path: &str,
    second_start: f64,
    second_length: f64,
    frame_rate: f64,
) -> Vec<Vec<f64>> {
    let wave_file: Wave64 = Wave64::load(path).expect("Could not load wave.");
    let samples_start = (second_start * wave_file.sample_rate()) as usize;
    let samples_in_each_frame = (wave_file.sample_rate() / frame_rate) as usize;
    let frames_in_length = (second_length * frame_rate) as usize;
    let mut wave_buffer: Vec<Vec<f64>> = Vec::with_capacity(frames_in_length);
    for frame_index in 0..frames_in_length {
        let mut frame_buffer: Vec<Complex<f64>> = Vec::with_capacity(samples_in_each_frame);
        for sample_index in 0..samples_in_each_frame {
            // Start offset + all whole frames before this one + position inside the frame.
            // (The original multiplied frame_index by sample_index, which reads the wrong samples.)
            let buffer_index = samples_start + frame_index * samples_in_each_frame + sample_index;
            frame_buffer.push(Complex {
                re: wave_file.at(0, buffer_index),
                im: 0.0,
            });
        }
        wave_buffer.push(fft_this(frame_buffer, samples_in_each_frame));
    }
    wave_buffer
}
I think you need more background information about FFTs in general.
If you have 735 data points, this data only consists of 735 orthogonal frequencies.
Let's assume those 735 points represent 1 second. Then the frequencies are:
0 Hz, i.e. the average of all values.
1 Hz, the slowest contained frequency, meaning one full cycle during the sampling period.
2 Hz, meaning two full cycles during the sampling period.
...
367 Hz, meaning 367 full cycles.
368 Hz, meaning 368 full cycles. IMPORTANT: Due to aliasing (see "Sampling Theorem") this is identical to -367 Hz, meaning 367 Hz rotating in the opposite direction!
-366 Hz
-365 Hz
...
-1 Hz
In total those are 735 frequencies. There isn't more information in the signal.
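To make the positive and negative frequencies concrete, here is a minimal sketch of a naive DFT that uses only the standard library. It is not how rustfft works internally and is far too slow for real audio; the function name `dft_magnitudes` is made up for illustration. It only shows that N samples produce exactly N bins.

```rust
use std::f64::consts::PI;

/// Naive O(n^2) DFT returning the magnitude of each of the n bins.
/// Hypothetical helper for illustration only; use an FFT crate in real code.
fn dft_magnitudes(samples: &[f64]) -> Vec<f64> {
    let n = samples.len();
    (0..n)
        .map(|k| {
            let (mut re, mut im) = (0.0_f64, 0.0_f64);
            for (i, &x) in samples.iter().enumerate() {
                // Correlate the signal with a complex sinusoid of k cycles per frame.
                let angle = -2.0 * PI * (k * i) as f64 / n as f64;
                re += x * angle.cos();
                im += x * angle.sin();
            }
            (re * re + im * im).sqrt()
        })
        .collect()
}
```

If you feed it 8 samples of a real cosine that completes one cycle in the frame, you get two equal peaks: one at bin 1 and one at bin 7, which is the "-1 cycle" bin. For real-valued input such as audio, the negative half always mirrors the positive half, so only the first N/2 + 1 magnitudes (368 for N = 735) carry distinct information.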
In your case, as your time period is not 1 second but 0.03 seconds, you need to multiply your frequencies by 1/0.03 = 33.33. So you get:
0 Hz
33.33 Hz
66.66 Hz
...
12200 Hz
12233.33 Hz
-12233.33 Hz
-12200 Hz
...
-66.66 Hz
-33.33 Hz
There simply isn't more information in your samples.
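The mapping from bin index to frequency can be sketched as follows. This is only an illustration; `bin_frequency` is a hypothetical helper, not part of rustfft. The spacing between bins is the sample rate divided by the number of samples, which is the same as 1 divided by the frame duration. One assumption worth checking: if your file is sampled at 44100 Hz, then 735 samples last 1/60 s rather than 0.03 s, so the bins would be 60 Hz apart and reach about 22 kHz instead of the values listed above.

```rust
/// Signed frequency in Hz of FFT bin `k` for a frame of `n` samples.
/// Hypothetical helper for illustration; `sample_rate` is in samples per second.
fn bin_frequency(k: usize, n: usize, sample_rate: f64) -> f64 {
    // Bins up to n/2 are positive frequencies; the rest wrap around to negative ones.
    let signed_bin = if k <= n / 2 {
        k as f64
    } else {
        k as f64 - n as f64
    };
    // Bin spacing is sample_rate / n, i.e. 1 / (frame duration in seconds).
    signed_bin * sample_rate / n as f64
}
```

For example, assuming 44100 Hz and 735 samples, `bin_frequency(1, 735, 44100.0)` is 60 Hz, bin 367 is the highest positive frequency, and bin 368 is its negative counterpart.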
Additional important info:
The FFT treats the signal you give it as if it repeated endlessly. So if your 735 samples aren't actually repeating (which I guess they aren't), you need to apply a window function to reduce the artifacts you get from frequencies that do not fit a whole number of cycles into the frame. For example, a clean 1.5 Hz signal in the 1 second case will give you some weird overtones, because applying no window function is equivalent to applying a rectangular window function, which has horrible overtones (spectral leakage). More info here.
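As one possible window, a Hann window could be sketched like this. The helper names `hann_window` and `apply_window` are made up here, and Hann is just a common default, not necessarily the best choice for your visualisation. The idea is to multiply each frame by the window before converting the samples to complex numbers and running the FFT.

```rust
use std::f64::consts::PI;

/// Periodic Hann window of length `n` (hypothetical helper for illustration).
/// Values rise from 0 at the start of the frame to 1 in the middle and fall again.
fn hann_window(n: usize) -> Vec<f64> {
    (0..n)
        .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f64 / n as f64).cos())
        .collect()
}

/// Multiplies each sample by the matching window coefficient, in place.
/// If the lengths differ, only the overlapping part is processed.
fn apply_window(samples: &mut [f64], window: &[f64]) {
    for (sample, coefficient) in samples.iter_mut().zip(window.iter()) {
        *sample *= *coefficient;
    }
}
```

Since the window depends only on the frame length, it can be computed once and reused for every frame. Note that windowing lowers the overall magnitudes, so compare spectra only between frames that used the same window.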
For visual learners, I can strongly recommend 3Blue1Brown's videos about the Fourier transform.