[Note that I use Windows API WAVEFORMATEX structs for legacy reasons. If you don't use WAVEFORMATEX with the Media Foundation lib, you won't have the problem described in this post.]
In order to enumerate available MP3 formats for the builtin codec, I used to call MFTranscodeGetAudioOutputAvailableTypes() and then for each type IMFMediaType::GetRepresentation(AM_MEDIA_TYPE_REPRESENTATION, AM_MEDIA_TYPE ** type_rep). As expected, the type_rep contained the MPEGLAYER3WAVEFORMAT extension of WAVEFORMATEX with the proper size.
    void add_mp3_format(std::vector<audio::WaveFormat> & formats, IMFMediaType * const output_type) {
        AM_MEDIA_TYPE * type_rep = nullptr;
        // Ask Media Foundation for the DirectShow-style representation of the type.
        HRESULT hr = output_type->GetRepresentation(AM_MEDIA_TYPE_REPRESENTATION, (void **)&type_rep);
        if (SUCCEEDED(hr)) {
            if (type_rep && type_rep->formattype == FORMAT_WaveFormatEx && type_rep->pbFormat && type_rep->cbFormat >= sizeof(WAVEFORMATEX)) {
                WAVEFORMATEX const * const wfx = (WAVEFORMATEX *)type_rep->pbFormat;
                // Only accept MP3 entries that carry the full MPEGLAYER3WAVEFORMAT extension.
                if (wfx->wFormatTag == WAVE_FORMAT_MPEGLAYER3 && wfx->cbSize == MPEGLAYER3_WFX_EXTRA_BYTES) {
                    MPEGLAYER3WAVEFORMAT const * const mp3 = (LPMPEGLAYER3WAVEFORMAT)wfx;
                    formats.emplace_back(*mp3, L"MP3");
                //} else if (wfx->wFormatTag == WAVE_FORMAT_MPEG_HEAAC && wfx->cbSize >= sizeof(HEAACWAVEINFO) - sizeof(WAVEFORMATEX)) {
                //    HEAACWAVEINFO const * const aac = (LPHEAACWAVEINFO)wfx;
                //    formats.emplace_back(*aac, L"AAC");
                }
            }
            // The representation was allocated by GetRepresentation() and must be released.
            output_type->FreeRepresentation(AM_MEDIA_TYPE_REPRESENTATION, (void *)type_rep);
        }
    }
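As a side note, add_mp3_format() above checks that cbFormat covers a plain WAVEFORMATEX and that cbSize has the expected value, but it does not check that the buffer is actually large enough for the extension before copying it. A minimal sketch of a stricter check follows. The helper name format_block_covers_extension and the constant kWaveFormatExBytes are made up for this illustration and are not part of any Windows API; the value 18 assumes the 1-byte packing that the Windows SDK applies to WAVEFORMATEX.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: sizeof(WAVEFORMATEX) is 18 bytes with the SDK's 1-byte packing.
constexpr std::size_t kWaveFormatExBytes = 18;

// Hypothetical helper, not a Media Foundation function.
// Returns true if a format block of cb_format bytes, whose header announces
// cb_size extra bytes, really contains at least needed_extra extension bytes.
constexpr bool format_block_covers_extension(std::uint32_t cb_format,
                                             std::uint16_t cb_size,
                                             std::uint16_t needed_extra) {
    return cb_size >= needed_extra &&
           cb_format >= kWaveFormatExBytes + static_cast<std::size_t>(cb_size);
}
```

In the function above, this could be called as format_block_covers_extension(type_rep->cbFormat, wfx->cbSize, MPEGLAYER3_WFX_EXTRA_BYTES) before the cast to MPEGLAYER3WAVEFORMAT.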
Later, I used these format structs to set the output types for the encoder, using MFInitMediaTypeFromWaveFormatEx() and IMFSinkWriter::AddStream(). This worked fine.
Now I wanted to add AAC formats available with the builtin codec, using the same functions. Unfortunately, that failed because of a bug either in the Media Foundation lib or in the codec itself.
Similar to MPEGLAYER3WAVEFORMAT above, I expected HEAACWAVEINFO or even HEAACWAVEFORMAT structs. But IMFMediaType::GetRepresentation() omits all the special AAC info and returns only the plain old WAVEFORMATEX part. Its cbSize member should be at least 12 (that is, sizeof(HEAACWAVEINFO) - sizeof(WAVEFORMATEX)), but it is 0. Because of that, MFInitMediaTypeFromWaveFormatEx() fails and encoding is not possible. Hoping for the best and stretching the returned structs into HEAACWAVEINFO doesn't help either; the required info is indeed missing.
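To see where the 12 comes from, here is a small sketch that mirrors the two structs with fixed-width types so it compiles without the Windows SDK. The mirror struct names and kAacExtraBytes are made up for this illustration; the field lists and the 1-byte packing are my reading of the SDK headers, so treat them as an assumption and compare against your own mmreg.h.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)  // assumption: the SDK packs these structs to 1 byte
// Stand-in for WAVEFORMATEX (illustration only).
struct WaveFormatExMirror {
    std::uint16_t wFormatTag;
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;
};

// Stand-in for HEAACWAVEINFO (illustration only).
struct HeAacWaveInfoMirror {
    WaveFormatExMirror wfx;
    std::uint16_t wPayloadType;
    std::uint16_t wAudioProfileLevelIndication;
    std::uint16_t wStructType;
    std::uint16_t wReserved1;
    std::uint32_t dwReserved2;
};
#pragma pack(pop)

// Number of bytes that follow the WAVEFORMATEX header, i.e. the minimum cbSize.
constexpr std::size_t kAacExtraBytes =
    sizeof(HeAacWaveInfoMirror) - sizeof(WaveFormatExMirror);

static_assert(sizeof(WaveFormatExMirror) == 18, "packed header should be 18 bytes");
static_assert(kAacExtraBytes == 12, "AAC extension should add 12 bytes");
```

With cbSize reported as 0, none of these 12 bytes are announced, which fits the observation that the AAC-specific fields are simply absent.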
How to get the missing AAC format info and make the encoder work?
Instead of relying on IMFMediaType::GetRepresentation(AM_MEDIA_TYPE_REPRESENTATION, AM_MEDIA_TYPE ** type_rep), you have to use the IMFAttributes interface of the enumerated IMFMediaType to extract the necessary info manually and then fill a HEAACWAVEINFO with it.
The MF_MT_AAC_* attributes are the parts missing from GetRepresentation(). There should also be a blob field MF_MT_USER_DATA, representing HEAACWAVEFORMAT::pbAudioSpecificConfig or even the whole extra info in HEAACWAVEINFO, but it's not there. Perhaps the blob field MF_MT_MPEG4_SAMPLE_DESCRIPTION should represent HEAACWAVEFORMAT::pbAudioSpecificConfig, but that's unclear. Encoding works even without that part, though. If you need it, there's more info about that last part here.
    void add_aac_format(std::vector<audio::WaveFormat> & formats, IMFMediaType * const output_type) {
        UINT32 pt = 0, pli = 0, nc = 0, sps = 0, apbs = 0, ba = 0, bps = 0;
        // Read every field directly from the media type's attributes; skip the type if any is missing.
        if (SUCCEEDED(output_type->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &nc)) &&
            SUCCEEDED(output_type->GetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, &sps)) &&
            SUCCEEDED(output_type->GetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, &apbs)) &&
            SUCCEEDED(output_type->GetUINT32(MF_MT_AUDIO_BLOCK_ALIGNMENT, &ba)) &&
            SUCCEEDED(output_type->GetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, &bps)) &&
            SUCCEEDED(output_type->GetUINT32(MF_MT_AAC_PAYLOAD_TYPE, &pt)) &&
            SUCCEEDED(output_type->GetUINT32(MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION, &pli)))
        {
            // Zero-initialize so that wStructType and the reserved fields stay 0.
            HEAACWAVEINFO aac_wi {0};
            aac_wi.wfx.wFormatTag = WAVE_FORMAT_MPEG_HEAAC;
            aac_wi.wfx.nChannels = nc;
            aac_wi.wfx.nSamplesPerSec = sps;
            aac_wi.wfx.nAvgBytesPerSec = apbs;
            aac_wi.wfx.nBlockAlign = ba;
            aac_wi.wfx.wBitsPerSample = bps;
            // This is the value that GetRepresentation() reports as 0.
            aac_wi.wfx.cbSize = sizeof(HEAACWAVEINFO) - sizeof(WAVEFORMATEX);
            aac_wi.wPayloadType = pt;
            aac_wi.wAudioProfileLevelIndication = pli;
            formats.emplace_back(aac_wi, L"AAC");
        }
    }
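One detail in add_aac_format() above: the attributes are read as 32-bit values, while several of the target fields (nChannels, nBlockAlign, wBitsPerSample, wPayloadType, wAudioProfileLevelIndication) are 16-bit, so the assignments narrow silently. The values the encoder reports should fit in practice, but if you want to be defensive, a range-checked conversion could look like the sketch below. The helper name to_word is made up for this illustration and is not part of any Windows API.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: converts a 32-bit attribute value to a 16-bit field.
// Returns false and leaves `out` untouched if the value does not fit.
inline bool to_word(std::uint32_t value, std::uint16_t & out) {
    if (value > std::numeric_limits<std::uint16_t>::max()) {
        return false;
    }
    out = static_cast<std::uint16_t>(value);
    return true;
}
```

For example, `to_word(nc, aac_wi.wfx.nChannels)` could replace the plain assignment, and the format could be skipped whenever the call returns false.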