Fuzzy match portions of an image

I need to determine whether a smaller image exists inside a larger image.

The match must be fuzzy, and preferably I should know how well it matched (%), but I can calculate the accuracy myself after the match is done if required.

My goal is to match an FFT (Fast Fourier Transform) spectrogram (a visualization of the frequencies in music) with the original music track. The small image I'm matching with is only a subset (both in time and frequency range) of the original track (like a cutout of the image below).

Where should I start? Are the same algorithms used for object recognition suitable for this task?

I am primarily looking for C#/.Net libraries/samples, but also information on implementations and problems/pitfalls.

I am considering using artificial neural networks for training the recognition. Any thoughts?

Example of what the images I want to match may look like:

Solution

I think that treating this as an image recognition problem ignores the underlying structure of the problem. Specifically, you may want to look at how Shazam addresses it. This question on Quora has a couple of interesting links:

http://www.quora.com/How-does-Shazam-work

First, an academic paper describing the algorithm. You will notice that they also start with a spectrogram, but from there they pick a small number of landmarks using an algorithm tailored to the problem. They then essentially use those landmarks as a fingerprint ID into a database.

Second, an article on Slate that is understandably at a higher level, but may still be helpful.
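
To make the landmark idea a bit more concrete, here is a rough Python sketch (it should translate to C# fairly directly). It is not the code from the paper or anything Shazam ships; every function name and parameter (find_peaks, fingerprints, best_offset, radius, fan_out, max_dt) is made up for illustration. It assumes the spectrogram is a plain 2D list indexed as spec[time][frequency] with non-negative magnitudes. Since your clip is also cropped in frequency, this sketch hashes the frequency difference between two peaks rather than their absolute frequencies, and votes on a combined (time, frequency) offset. That is my adaptation for your case, so treat it as an assumption to test rather than a proven method.

```python
from collections import Counter, defaultdict


def find_peaks(spec, radius=1):
    """Return (t, f) cells that are strictly larger than every neighbour within `radius`."""
    n_t, n_f = len(spec), len(spec[0])
    peaks = []
    for t in range(n_t):
        for f in range(n_f):
            value = spec[t][f]
            if value <= 0:
                continue  # silence can never be a landmark
            is_peak = True
            for dt in range(-radius, radius + 1):
                for df in range(-radius, radius + 1):
                    tt, ff = t + dt, f + df
                    if (dt or df) and 0 <= tt < n_t and 0 <= ff < n_f:
                        if spec[tt][ff] >= value:
                            is_peak = False
            if is_peak:
                peaks.append((t, f))
    return peaks


def fingerprints(peaks, fan_out=3, max_dt=10):
    """Pair each peak with up to `fan_out` later peaks.

    Returns a list of ((freq_difference, time_difference), (t1, f1)).
    The hash holds only differences, so it survives cropping in time and frequency.
    """
    peaks = sorted(peaks)
    result = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # same time frame, skip
            if dt > max_dt:
                break  # peaks are sorted by time, so the rest are too far away
            result.append(((f2 - f1, dt), (t1, f1)))
            paired += 1
            if paired == fan_out:
                break
    return result


def best_offset(track_spec, clip_spec):
    """Return ((time_offset, freq_offset), score) or (None, 0.0) if nothing matches.

    `score` is the fraction of clip fingerprints that voted for the winning offset.
    """
    index = defaultdict(list)
    for key, anchor in fingerprints(find_peaks(track_spec)):
        index[key].append(anchor)

    clip_prints = fingerprints(find_peaks(clip_spec))
    votes = Counter()
    for key, (t_clip, f_clip) in clip_prints:
        for t_track, f_track in index.get(key, ()):
            votes[(t_track - t_clip, f_track - f_clip)] += 1

    if not votes:
        return None, 0.0
    offset, count = votes.most_common(1)[0]
    return offset, count / len(clip_prints)
```

You would call best_offset(track_spec, clip_spec) and read the score as a rough match percentage. A real clip will lose or gain some peaks, so expect the score to sit well below 1.0 even for a correct match; what matters is that one offset collects clearly more votes than the rest. The peak picking here is deliberately naive, and the parameters would need tuning on real spectrograms.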