I'm experimenting with some music clustering algorithms, and I thought that a feature vector built from a discretized FFT (i.e. the spectrum binned into a fixed set of frequency bands) would be a good basis for a similarity measure. Would this even be useful? Does anyone know what some good audio similarity measures might be?
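To make the idea concrete, here is a minimal sketch of a binned-spectrum feature vector compared with cosine similarity. Everything in it is an assumption for illustration: the function names, the 8 equal-width bands, the 8 kHz sample rate, and the synthetic sine-wave "tracks" are hypothetical, and the naive DFT is only there to keep the example dependency-free (a real pipeline would use an FFT library and average over many frames).

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (fine for one short frame)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def band_features(samples, n_bands=8):
    """Sum spectral energy into equal-width frequency bands, then L2-normalize."""
    mags = magnitude_spectrum(samples)
    width = len(mags) / n_bands
    bands = [0.0] * n_bands
    for k, m in enumerate(mags):
        idx = min(int(k / width), n_bands - 1)
        bands[idx] += m * m
    norm = math.sqrt(sum(b * b for b in bands)) or 1.0
    return [b / norm for b in bands]


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sine(freq_hz, sample_rate=8000, n=256):
    """Synthetic test signal standing in for a frame of real audio."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


low_a = band_features(sine(440))
low_b = band_features(sine(470))
high = band_features(sine(3000))

print("440 Hz vs 470 Hz:", cosine_similarity(low_a, low_b))
print("440 Hz vs 3000 Hz:", cosine_similarity(low_a, high))
```

The two low tones land in the same band and score as similar, while the high tone does not. That also hints at the limitation: equal-width bands say nothing about rhythm, timbre over time, or how the ear actually groups frequencies.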
First of all, you need to decide whether you want fingerprinting (i.e. recognizing the same recording despite some distortion) or a similarity measure (i.e. finding recordings that are alike but not identical).
Also have a look at MFCCs (mel-frequency cepstral coefficients), the Bark scale, and similar perceptually motivated features. There is plenty of literature out there. Go to Amazon and grab a dedicated book on the topic.
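As a rough illustration of why the mel and Bark scales matter, the sketch below converts between Hz and these perceptual scales and derives band edges that are equally spaced in mel rather than in Hz. The function names and the 100 Hz to 8000 Hz range are assumptions chosen for the example; the formulas are the commonly quoted approximations (the 2595 * log10(1 + f/700) mel formula and Traunmüller's Bark approximation), and other variants exist.

```python
import math


def hz_to_mel(freq_hz):
    """Common mel approximation: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def hz_to_bark(freq_hz):
    """Traunmueller's approximation of the Bark scale."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def mel_band_edges(f_min_hz, f_max_hz, n_bands):
    """Return n_bands + 1 edges in Hz that are equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


# Hypothetical range and band count, chosen only for the demonstration.
edges = mel_band_edges(100.0, 8000.0, 10)
for lower, upper in zip(edges, edges[1:]):
    print(f"{lower:8.1f} Hz to {upper:8.1f} Hz")
```

The bands come out narrow at low frequencies and wide at high frequencies, which is roughly how hearing resolves pitch. Using edges like these in place of equal-width FFT bins is the first step toward MFCC-style features; full MFCCs additionally take the log of the band energies and apply a discrete cosine transform.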