go cgo

chromaprint fingerprint of FLAC and MP3


I would like to create an acoustic fingerprint of a FLAC or MP3 file using the chromaprint library in Go. I've been playing around with two Go libraries, github.com/go-fingerprint/fingerprint and github.com/go-fingerprint/gochroma.

Using the following code, a fingerprint of a "raw audio data stream" can be created (where reader is of type io.Reader):

fpcalc := gochroma.New(gochroma.AlgorithmDefault)
defer fpcalc.Close()

fprint, err := fpcalc.Fingerprint(
        fingerprint.RawInfo{
                Src:        reader,
                Channels:   2,
                Rate:       44100,
                MaxSeconds: 120,
        },
)

Unfortunately, I haven't been able to figure out what exactly "raw audio data stream" means (my guess: WAVE LPCM streams), but I understand that I can't simply open a FLAC or MP3 file with os.Open and pass the stream to fingerprint.RawInfo.Src. There are some examples, but they only work with files ending in .raw.

How can I convert a FLAC (or, secondarily, MP3) file/stream to a raw audio data stream in Go? My guess is to use a Go FLAC library like go-flac, but I'm not sure where to start. Any hints are welcome!
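On what "raw audio data" means here: Chromaprint's C API (chromaprint_feed) takes signed 16-bit integer samples, interleaved by channel, in native byte order, with no container header. The following is a minimal sketch of building such a stream from samples that have already been decoded. interleavePCM16 is a hypothetical helper, not part of gochroma or fingerprint, and the little-endian byte order is an assumption that matches typical x86 and ARM machines.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// interleavePCM16 packs per-channel samples into a single interleaved,
// little-endian, signed 16-bit stream: L0 R0 L1 R1 ...
// Hypothetical helper for illustration; it truncates to the shortest channel.
func interleavePCM16(channels [][]int16) []byte {
	if len(channels) == 0 {
		return nil
	}
	// Use the shortest channel so indexing can never go out of range.
	n := len(channels[0])
	for _, ch := range channels {
		if len(ch) < n {
			n = len(ch)
		}
	}
	out := make([]byte, 2*n*len(channels))
	pos := 0
	for i := 0; i < n; i++ {
		for _, ch := range channels {
			binary.LittleEndian.PutUint16(out[pos:], uint16(ch[i]))
			pos += 2
		}
	}
	return out
}

func main() {
	left := []int16{1, -1}
	right := []int16{256, 0}
	fmt.Printf("% x\n", interleavePCM16([][]int16{left, right}))
	// prints "01 00 00 01 ff ff 00 00"
}
```

A bytes.NewReader over the returned slice could then serve as fingerprint.RawInfo.Src, assuming the wrapper consumes 16-bit little-endian samples as described above.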

EDIT

Via go-flac's ParseFile and GetStreamInfo it should be possible to access a FLAC file's raw audio data, which can then be passed to fingerprint.RawInfo.Src using a reader. (I really dislike the fact that go-flac doesn't expose the audio data as an io.Reader; instead Frames is a []byte, so the entire stream is loaded into memory before further processing can actually happen.)

Using the following code, a FLAC file's fingerprint can be calculated (basically what fpcalc does):

package main

import (
    "bytes"
    "fmt"
    "os"

    "github.com/go-fingerprint/fingerprint"
    "github.com/go-fingerprint/gochroma"
    "github.com/go-flac/go-flac"
)

func main() {
    f, err := flac.ParseFile(os.Args[1])
    if err != nil {
        panic(err)
    }

    si, err := f.GetStreamInfo()
    if err != nil {
        panic(err)
    }

    fpcalc := gochroma.New(gochroma.AlgorithmDefault)
    defer fpcalc.Close()

    fprint, err := fpcalc.Fingerprint(
        fingerprint.RawInfo{
            Src:        bytes.NewReader(f.Frames),
            Channels:   uint(si.ChannelCount),
            Rate:       uint(si.SampleRate),
            MaxSeconds: 120,
        },
    )
    if err != nil {
        panic(err)
    }

    fmt.Println(fprint)
}

Unfortunately, the code above does not return the same fingerprint as fpcalc does. What am I doing wrong?


Solution

  • I ended up with the following code, which decodes a FLAC file to raw audio data using github.com/eaburns/flac (as Steven Penny pointed out) and then passes the data to fingerprint/gochroma.

    The resulting fingerprint doesn't seem to be the same as the one reported by fpcalc for the same FLAC file, but when querying the AcoustID database using the generated fingerprint, the result is correct.

    package main
    
    import (
        "bytes"
        "fmt"
        "log"
        "os"
    
        "github.com/eaburns/flac"
        "github.com/go-fingerprint/fingerprint"
        "github.com/go-fingerprint/gochroma"
    )
    
    func main() {
        if len(os.Args) != 2 {
            log.Fatal("usage: go run fpcalc.go FILE")
        }
    
        f, err := os.Open(os.Args[1])
        if err != nil {
            log.Fatalf("os.Open(%s): %s", os.Args[1], err)
        }
    
        defer f.Close()
    
        d, metadata, err := flac.Decode(f)
        if err != nil {
            log.Fatalf("flac.Decode: %s", err)
        }
    
        fpcalc := gochroma.New(gochroma.AlgorithmDefault)
        defer fpcalc.Close()
    
        fprint, err := fpcalc.Fingerprint(
            fingerprint.RawInfo{
                Src:        bytes.NewReader(d),
                Channels:   uint(metadata.NChannels),
                Rate:       uint(metadata.SampleRate),
                MaxSeconds: 120,
            },
        )
        if err != nil {
            log.Fatalf("fpcalc.Fingerprint: %s", err)
        }
    
        fmt.Println(fprint)
    }
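On the observation that the fingerprint differs from fpcalc's yet the AcoustID lookup still succeeds: Chromaprint fingerprints are generally compared by similarity rather than byte-for-byte equality, so small differences in decoding or resampling need not break a match. A minimal sketch of one such measure, the fraction of differing bits between two raw fingerprints (sequences of 32-bit integers), is below. bitErrorRate is a hypothetical helper, it compares only the common prefix without searching for the best alignment, and how the raw values are obtained (for example from fpcalc -raw) is left out.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitErrorRate returns the fraction of differing bits between two raw
// fingerprints over their common prefix. Hypothetical helper: it does not
// try different offsets, which a real matcher would normally do.
func bitErrorRate(a, b []uint32) float64 {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	if n == 0 {
		return 1 // nothing to compare; treat as maximally different
	}
	diff := 0
	for i := 0; i < n; i++ {
		// XOR leaves a 1 in every position where the two values disagree.
		diff += bits.OnesCount32(a[i] ^ b[i])
	}
	return float64(diff) / float64(32*n)
}

func main() {
	a := []uint32{0xFFFF0000, 0x0000FFFF}
	b := []uint32{0xFFFF0001, 0x0000FFFF}
	fmt.Printf("%.4f\n", bitErrorRate(a, b)) // prints "0.0156" (1 of 64 bits)
}
```

A value near 0 suggests the two fingerprints describe the same audio; what threshold counts as a match is a judgment call and not something this sketch settles.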