
Extracting RIFF data from both .wav and .flac files


Wave files can contain unofficial metadata, such as the sampler chunk ("smpl"): https://sites.google.com/site/musicgapi/technical-documents/wav-file-format#smpl

Looping audio players and samplers use this chunk to find loop points, so they can loop a single sample instead of loading multiple samples.
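To make the chunk concrete, here is a minimal Python sketch that reads it straight from a .wav file. It assumes the commonly documented smpl layout from the page linked above (nine little-endian 32-bit header fields, then six 32-bit fields per loop). The function names and dictionary keys are my own and do not come from any library.

```python
import struct

SMPL_HEADER = struct.Struct("<9I")  # 9 little-endian uint32 fields = 36 bytes
SMPL_LOOP = struct.Struct("<6I")    # 6 little-endian uint32 fields = 24 bytes


def iter_riff_chunks(data: bytes):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, data[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length


def parse_smpl(payload: bytes) -> dict:
    """Decode a 'smpl' chunk payload (assumed layout, see lead-in)."""
    (manufacturer, product, period_ns, unity_note, pitch_fraction,
     smpte_format, smpte_offset, num_loops, sampler_data_len) = (
        SMPL_HEADER.unpack_from(payload, 0))
    loops = []
    for i in range(num_loops):
        offset = SMPL_HEADER.size + i * SMPL_LOOP.size
        cue_id, loop_type, start, end, fraction, count = (
            SMPL_LOOP.unpack_from(payload, offset))
        loops.append({"cue_id": cue_id, "type": loop_type, "start": start,
                      "end": end, "fraction": fraction, "count": count})
    return {
        "manufacturer": manufacturer,
        "product": product,
        "sample_period_ns": period_ns,
        "midi_unity_note": unity_note,
        "midi_pitch_fraction": pitch_fraction,
        "smpte_format": smpte_format,
        "smpte_offset": smpte_offset,
        "sampler_data_len": sampler_data_len,
        "loops": loops,
    }


def read_wav_smpl(path: str):
    """Return the decoded 'smpl' chunk of a .wav file, or None if absent."""
    with open(path, "rb") as f:
        data = f.read()
    for chunk_id, payload in iter_riff_chunks(data):
        if chunk_id == b"smpl":
            return parse_smpl(payload)
    return None
```

With one loop, the payload is 36 + 24 = 60 bytes, which matches the "smpl : 60" size that sndfile-info reports further down.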

I have one such file here:

https://github.com/studiorack/basic-harmonica/blob/bf42d5bab7470cc201e3c4b6dee7925b19db6bff/samples/harmonica_1.wav

and a flac file converted using the official flac command line tool: flac harmonica_1.wav --keep-foreign-metadata

https://github.com/studiorack/basic-harmonica/blob/main/samples/harmonica_1.flac

Opening both files in a hex editor (https://hexfiend.com) confirms that the metadata exists in each file.

However, I do see a difference in the number of bytes (I believe because flac inserts "riff" markers in multiple places).

I can also convert the .flac file back to .wav; the result is the same size and contains the metadata: flac -d harmonica_1.flac --keep-foreign-metadata

When using other tools I can read the data:

$ sndfile-info har.wav
smpl : 60
  Manufacturer : 0
  Product      : 0
  Period       : 20833 nsec
  Midi Note    : 64
  Pitch Fract. : 0
  SMPTE Format : 0
  SMPTE Offset : 00:00:00 00
  Loop Count   : 1
    Cue ID : 131072  Type :  0  Start : 12707  End : 47221  Fraction :     0  Count :     0
  Sampler Data : 0

https://linux.die.net/man/1/sndfile-info

This only works for .wav files. There is a feature request for libsndfile to support 'smpl' in flac files: https://github.com/libsndfile/libsndfile/issues/59

$ metaflac ./har.flac --list
smpl<aQ@�1u�METADATA block #7
  type: 2 (APPLICATION)
  is last: false
  length: 20
  application ID: 72696666
  data contents:

https://xiph.org/flac

However, as you can see, the two tools return the data in different forms. I would like the RIFF 'smpl' data from both .wav and .flac files to be returned in the same format, so I can verify that the results match.
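The application ID 72696666 in the metaflac output is just the ASCII string "riff" in hex, which suggests the chunks can be pulled out of the FLAC metadata by hand. Below is a minimal Python sketch that walks the FLAC metadata blocks and yields the RIFF chunks found in APPLICATION blocks. It assumes that flac stores each foreign chunk as its 8-byte RIFF chunk header followed by the chunk contents; I have not verified that assumption against the flac source, and the function names are my own.

```python
import struct

# The application ID printed by metaflac; this decodes to b"riff".
RIFF_APP_ID = bytes.fromhex("72696666")
APPLICATION_BLOCK = 2  # metadata block type shown as "type: 2 (APPLICATION)"


def iter_flac_metadata_blocks(data: bytes):
    """Yield (block_type, body) for each metadata block of a FLAC stream."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    pos = 4
    while pos + 4 <= len(data):
        header = data[pos]
        is_last = bool(header & 0x80)   # top bit marks the last metadata block
        block_type = header & 0x7F      # low 7 bits are the block type
        length = int.from_bytes(data[pos + 1 : pos + 4], "big")  # 24-bit size
        yield block_type, data[pos + 4 : pos + 4 + length]
        pos += 4 + length
        if is_last:
            break


def iter_foreign_riff_chunks(data: bytes):
    """Yield (chunk_id, declared_size, payload) from 'riff' APPLICATION blocks.

    Assumption: the block body after the 4-byte application ID starts with a
    RIFF chunk header (4-byte ID + little-endian uint32 size).
    """
    for block_type, body in iter_flac_metadata_blocks(data):
        if block_type != APPLICATION_BLOCK or body[:4] != RIFF_APP_ID:
            continue
        stored = body[4:]
        if len(stored) < 8:
            continue  # too short to hold a chunk header
        chunk_id, size = struct.unpack_from("<4sI", stored, 0)
        yield chunk_id, size, stored[8:]
```

The payload yielded for the b"smpl" chunk should then have the same layout as the smpl chunk in the .wav file, so the same decoder can be applied to both.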

ExifTool (https://exiftool.org) appears to be a tool that can do that, but it also produces inconsistent results between .wav and .flac:

$ exiftool -a -G1 -s ./har.wav
[ExifTool]      ExifToolVersion                 : 12.42
[System]        FileName                        : har.wav
[System]        Directory                       : .
[System]        FileSize                        : 95 kB
[System]        FileModifyDate                  : 2022:10:11 21:16:37-07:00
[System]        FileAccessDate                  : 2022:10:15 14:39:46-07:00
[System]        FileInodeChangeDate             : 2022:10:15 14:39:50-07:00
[System]        FilePermissions                 : -rw-r--r--
[File]          FileType                        : WAV
[File]          FileTypeExtension               : wav
[File]          MIMEType                        : audio/x-wav
[RIFF]          Encoding                        : Microsoft PCM
[RIFF]          NumChannels                     : 1
[RIFF]          SampleRate                      : 48000
[RIFF]          AvgBytesPerSec                  : 96000
[RIFF]          BitsPerSample                   : 16
[RIFF]          Manufacturer                    : 0
[RIFF]          Product                         : 0
[RIFF]          SamplePeriod                    : 20833
[RIFF]          MIDIUnityNote                   : 64
[RIFF]          MIDIPitchFraction               : 0
[RIFF]          SMPTEFormat                     : none
[RIFF]          SMPTEOffset                     : 00:00:00:00
[RIFF]          NumSampleLoops                  : 1
[RIFF]          SamplerDataLen                  : 0
[RIFF]          SamplerData                     : (Binary data 20 bytes, use -b option to extract)
[RIFF]          UnshiftedNote                   : 64
[RIFF]          FineTune                        : 0
[RIFF]          Gain                            : 0
[RIFF]          LowNote                         : 0
[RIFF]          HighNote                        : 127
[RIFF]          LowVelocity                     : 0
[RIFF]          HighVelocity                    : 127
[RIFF]          Comment                         : Recorded on 7/10/2022 in Edison.
[RIFF]          Software                        : FL Studio 20
[Composite]     Duration                        : 0.99 s

and for the .flac file:

$ exiftool -a -G1 -s ./har.flac
[ExifTool]      ExifToolVersion                 : 12.42
[System]        FileName                        : har.flac
[System]        Directory                       : .
[System]        FileSize                        : 83 kB
[System]        FileModifyDate                  : 2022:10:11 20:59:37-07:00
[System]        FileAccessDate                  : 2022:10:15 14:44:12-07:00
[System]        FileInodeChangeDate             : 2022:10:15 14:42:26-07:00
[System]        FilePermissions                 : -rw-r--r--
[File]          FileType                        : FLAC
[File]          FileTypeExtension               : flac
[File]          MIMEType                        : audio/flac
[FLAC]          BlockSizeMin                    : 4096
[FLAC]          BlockSizeMax                    : 4096
[FLAC]          FrameSizeMin                    : 3442
[FLAC]          FrameSizeMax                    : 6514
[FLAC]          SampleRate                      : 48000
[FLAC]          Channels                        : 1
[FLAC]          BitsPerSample                   : 16
[FLAC]          TotalSamples                    : 47222
[FLAC]          MD5Signature                    : f89646c0d3056ec38c3e33ca79299253
[Vorbis]        Vendor                          : reference libFLAC 1.4.1 20220922
[Composite]     Duration                        : 0.98 s

How can I read this data consistently, regardless of whether the file is .flac or .wav?


Solution

  • I was helped by the creator of exiftool here: https://exiftool.org/forum/index.php?topic=14064.0

    In short, the flac tool stores the RIFF chunks in its own 'riff' APPLICATION metadata blocks, which exiftool could parse, but only with a custom .ExifTool_config file.

    The creator added the necessary changes in a commit: https://github.com/exiftool/exiftool/commit/5c2467fa6cdb38233793884e80cee9abf4da48e6#diff-0c24c6846e8207ad8d090e564fdc366dad6386f2ef7c51eea5aa0d72d970ff11

    ExifTool 12.49 now parses loop data from both .wav and .flac files! From the release notes: "Decode 'riff' metadata blocks in FLAC audio files" https://exiftool.org/history.html

    Usage:

    exiftool ./har.wav
    exiftool ./har.flac
    

    Output:

    Encoding                        : Microsoft PCM
    Num Channels                    : 1
    Sample Rate                     : 48000
    Avg Bytes Per Sec               : 96000
    Bits Per Sample                 : 16
    Manufacturer                    : 0
    Product                         : 0
    Sample Period                   : 20833
    MIDI Unity Note                 : 64
    MIDI Pitch Fraction             : 0
    SMPTE Format                    : none
    SMPTE Offset                    : 00:00:00:00
    Num Sample Loops                : 1
    Sampler Data Len                : 0
    Sampler Data                    : (Binary data 20 bytes, use -b option to extract)
    Unshifted Note                  : 64
    Fine Tune                       : 0
    Gain                            : 0
    Low Note                        : 0
    High Note                       : 127
    Low Velocity                    : 0
    High Velocity                   : 127
    Acidizer Flags                  : One shot
    Root Note                       : High C
    Beats                           : 2
    Meter                           : 4/4
    Tempo                           : 0
    Comment                         : Recorded on 7/10/2022 in Edison.
    Software                        : FL Studio 20
    Duration                        : 0.87 s
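To verify that the two files really match, the comparison can be scripted. This is a minimal Python sketch that calls exiftool with its JSON output option (-j) and family-1 group names (-G1), then diffs the RIFF tags. It assumes exiftool 12.49 or later is on the PATH and that the decoded FLAC 'riff' tags are reported under the same RIFF group as the .wav tags, which I have not confirmed. The helper names are hypothetical.

```python
import json
import subprocess


def filter_group(tags: dict, group: str) -> dict:
    """Keep only 'Group:Tag' entries of the given group, dropping the prefix."""
    prefix = group + ":"
    return {key[len(prefix):]: value
            for key, value in tags.items() if key.startswith(prefix)}


def diff_tags(a: dict, b: dict) -> dict:
    """Return {tag: (value_in_a, value_in_b)} for every tag that differs."""
    return {name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)}


def read_riff_tags(path: str) -> dict:
    """Run exiftool on one file and return its RIFF-group tags.

    Assumes the exiftool executable is installed and on the PATH.
    """
    result = subprocess.run(
        ["exiftool", "-j", "-G1", path],
        capture_output=True, text=True, check=True,
    )
    tags = json.loads(result.stdout)[0]  # exiftool -j prints a list, one dict per file
    return filter_group(tags, "RIFF")
```

Calling diff_tags(read_riff_tags("./har.wav"), read_riff_tags("./har.flac")) should return an empty dict if the loop data is identical in both files. Any tags that differ are listed with both values.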