As Double Helix said, FFT converts from the time domain to the frequency domain. In other words, a small chunk of waveform is converted into a *linear* spectral graph, so there are some important implications to bear in mind. An example will probably make more sense than dry statements (feel free to look away now):
A decent FFT-based filter offers a graph and some kind of resolution setting. The graph is the easy part: it's what you'd like the filter to do to your audio. How well it will do that, however, depends on the resolution setting *and* the sampling rate - together, they determine the result.
Assuming a 44.1 kHz sampling rate and a fairly typical resolution of 4096 (which is really the number of samples, i.e. the length of an audio chunk converted as a unit - usually a power of two for fastest processing), the following will apply:
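- the frequency resolution will be around 10.8 Hz per band (44100/4096; note that for real audio only the bands up to half the sampling rate, 22.05 kHz, are unique, so you get roughly 2048 usable bands rather than 4096)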
- the time window (audio chunk length) will be around 93 ms (4096/44100)
I hope you can see that there is always going to be a trade-off between frequency and time resolution: increasing frequency resolution will necessarily decrease time resolution, meaning you have to choose between precision and smear/lag/delay.
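To make the trade-off concrete, here is a minimal sketch of the arithmetic above for a few resolution settings. The function name `fft_resolution` is just something I made up for illustration, not part of any particular filter or library.

```python
def fft_resolution(sample_rate, fft_size):
    """Return (band spacing in Hz, window length in ms) for one FFT chunk."""
    band_hz = sample_rate / fft_size             # e.g. 44100/4096
    window_ms = 1000.0 * fft_size / sample_rate  # e.g. 4096/44100, in ms
    return band_hz, window_ms


# Compare a few power-of-two resolutions at 44.1 kHz.
for size in (1024, 4096, 16384):
    band_hz, window_ms = fft_resolution(44100, size)
    print(f"{size:>6} samples: {band_hz:6.2f} Hz per band, {window_ms:6.1f} ms window")
```

Whatever size you pick, band spacing (in Hz) times window length (in seconds) always comes out as 1, which is the trade-off in a nutshell: quadruple one and the other shrinks to a quarter.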
On top of this, since FFT bands are linearly spaced, the lower the frequency you're targeting, the higher the resolution you have to use, leading to a longer delay overall... e.g. using a resolution of 1024, 44100/1024 = ~43 Hz - meaning the very first graph band would cover everything from 0 to 43 Hz, the next 43 to 86 Hz and so on.
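Since we hear in octaves rather than in Hz, it may help to count how many linearly spaced bands actually land in each octave. This is only a rough sketch: `bands_in_octave` is a made-up helper, and it counts band centre frequencies (multiples of the band spacing), which is my own simplification.

```python
def bands_in_octave(low_hz, sample_rate, fft_size):
    """Count FFT band centres that fall in the octave [low_hz, 2 * low_hz)."""
    spacing = sample_rate / fft_size
    top_band = fft_size // 2  # highest unique band for real audio (Nyquist)
    return sum(
        1
        for k in range(top_band + 1)
        if low_hz <= k * spacing < 2 * low_hz
    )


# Resolution of 1024 at 44.1 kHz, octaves picked arbitrarily for illustration.
for low in (40, 160, 640, 2560, 10240):
    count = bands_in_octave(low, 44100, 1024)
    print(f"{low:>5} to {2 * low:<5} Hz: {count} bands")
```

With these settings the 40 to 80 Hz octave gets a single band, while the top octave gets over two hundred, so nearly all the precision is spent where you need it least.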
On the other hand, FFT is great for weird transforms as well as, say, -3 dB/octave filtering (which regular filters can only approximate, as each pole is worth -6 dB/octave).
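As a rough illustration of why this is easy in the frequency domain: a -3 dB/octave slope is just a gain value per band. The sketch below only builds that gain curve. The name `tilt_gains`, the 1 kHz reference point and muting the DC band are all my own assumptions, not how any specific filter does it.

```python
import math


def tilt_gains(sample_rate, fft_size, slope_db_per_octave=-3.0, ref_hz=1000.0):
    """Linear gain per band for a constant dB/octave slope, 0 dB at ref_hz."""
    spacing = sample_rate / fft_size
    gains = []
    for k in range(fft_size // 2 + 1):  # bands from DC up to Nyquist
        if k == 0:
            # DC has no position on an octave scale; muting it is an
            # arbitrary choice made for this sketch.
            gains.append(0.0)
            continue
        octaves_from_ref = math.log2(k * spacing / ref_hz)
        gain_db = slope_db_per_octave * octaves_from_ref
        gains.append(10.0 ** (gain_db / 20.0))  # dB to linear amplitude
    return gains


gains = tilt_gains(44100, 4096)
# Doubling the frequency (band 100 -> band 200) drops the gain by 3 dB.
print(20.0 * math.log10(gains[200] / gains[100]))
```

An actual filter would multiply each band of the chunk's spectrum by its gain and transform back; the windowing and overlap needed to do that cleanly are left out here.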
I know this appears heavy duty at first, but understanding FFT can only help you decide when to use it.