The specification of IAudioClient2::SetClientProperties contains only one parameter, but it is not clear to me what to expect from the API given the existing documentation. The parameter is given by:
typedef struct AudioClientProperties {
    UINT32                cbSize;
    BOOL                  bIsOffload;
    AUDIO_STREAM_CATEGORY eCategory;
    AUDCLNT_STREAMOPTIONS Options;
} AudioClientProperties;
I have a capture client and am trying to understand the exact consequences of using different combinations of eCategory and Options.
First of all: if I don't call SetClientProperties on my stream, what are the default settings? Assuming a corresponding GetClientProperties existed, is it possible to say what it would return?
If I set the stream category to AudioCategory_Speech and the stream option to AUDCLNT_STREAMOPTIONS_RAW, the manual states that:
The audio stream is a 'raw' stream that bypasses
all signal processing except for endpoint specific,
always-on processing in the Audio Processing Object (APO), driver, and hardware.
Does that mean that any processing done by the Signal Enhancements is bypassed, or is it some other type of built-in signal processing that is bypassed? I guess I don't really understand the "endpoint specific, always-on" part above.
Also, if I instead use AudioCategory_Communications and AUDCLNT_STREAMOPTIONS_RAW, are these two contradictory in any way? To me it feels as if AudioCategory_Communications should enable components useful for VoIP (e.g. AGC, NS, etc.) while the AUDCLNT_STREAMOPTIONS_RAW flag means "keep the audio path as clean as possible".
Perhaps I can rephrase the last question. What is the difference in final behavior between using AudioCategory_Communications + AUDCLNT_STREAMOPTIONS_RAW and using AudioCategory_Speech + AUDCLNT_STREAMOPTIONS_RAW?
The eCategory has behavioral implications that go beyond audio effects. For example, if you have a VOIP app and you start an AudioCategory_Communications stream, that will cause movie apps to pause or be ducked, whether or not you use AUDCLNT_STREAMOPTIONS_RAW.
If your capture client is for VOIP, you want AudioCategory_Communications. If your capture client is for voice command or dictation, you want AudioCategory_Speech.
AUDCLNT_STREAMOPTIONS_RAW is only for very narrow circumstances. Usually you would welcome whatever audio processing was the default for your chosen eCategory.
On the other hand, if the ins and outs of audio processing are SUPER important to you, to the point where you are individually evaluating audio drivers on specific hardware, you may determine that certain specific models of computer have audio processing that doesn't work for your app.
In such a case (which should be rare), you should do two things:
Your app can query for what audio effects would be applied to its chosen stream category, both in normal mode and raw mode, using the audio effects discovery API. There's a sample here: https://github.com/microsoftarchive/msdn-code-gallery-microsoft/tree/master/Official%20Windows%20Platform%20Sample/Audio%20effects%20discovery%20sample
The default, if you do not call IAudioClient2::SetClientProperties, is eCategory = AudioCategory_Other, which is usually not what you want.
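To make the category choice explicit rather than relying on the default, here is a minimal sketch of filling in AudioClientProperties for a capture stream. MakeCaptureProperties and ApplyCaptureProperties are hypothetical helper names, not part of WASAPI. The non-Windows branch contains stand-in declarations (listing only the enum values used here) so that the sketch compiles anywhere; on Windows the real definitions come from <audioclient.h>. As far as I understand it, SetClientProperties has to be called before IAudioClient::Initialize.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#include <audioclient.h>
#else
// Stand-in declarations (assumption: they mirror the Windows SDK layout) so the
// sketch compiles off Windows. Only the enumerators used below are listed.
typedef std::uint32_t UINT32;
typedef int BOOL;
#ifndef FALSE
#define FALSE 0
#endif
enum AUDIO_STREAM_CATEGORY {
    AudioCategory_Other = 0,
    AudioCategory_Communications = 3,
    AudioCategory_Speech = 9
};
enum AUDCLNT_STREAMOPTIONS {
    AUDCLNT_STREAMOPTIONS_NONE = 0,
    AUDCLNT_STREAMOPTIONS_RAW = 0x1
};
struct AudioClientProperties {
    UINT32 cbSize;
    BOOL bIsOffload;
    AUDIO_STREAM_CATEGORY eCategory;
    AUDCLNT_STREAMOPTIONS Options;
};
#endif

// Hypothetical helper: builds the properties for a non-offloaded capture stream.
// Pass raw = true only if you have decided the default processing for the
// chosen category does not work for your app.
inline AudioClientProperties MakeCaptureProperties(AUDIO_STREAM_CATEGORY category,
                                                   bool raw) {
    AudioClientProperties props = {};
    props.cbSize = sizeof(props);   // the API expects the struct size here
    props.bIsOffload = FALSE;       // no hardware offload in this sketch
    props.eCategory = category;
    props.Options = raw ? AUDCLNT_STREAMOPTIONS_RAW : AUDCLNT_STREAMOPTIONS_NONE;
    return props;
}

#ifdef _WIN32
// Hypothetical helper: applies the properties to an activated IAudioClient2.
// Intended to run before IAudioClient::Initialize is called on the same client.
inline HRESULT ApplyCaptureProperties(IAudioClient2* client,
                                      AUDIO_STREAM_CATEGORY category,
                                      bool raw) {
    assert(client != nullptr);
    const AudioClientProperties props = MakeCaptureProperties(category, raw);
    return client->SetClientProperties(&props);
}
#endif
```

For a VOIP capture client with default processing, that would be ApplyCaptureProperties(client, AudioCategory_Communications, false); for dictation, pass AudioCategory_Speech instead. Check the returned HRESULT before continuing with Initialize.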