Scenario
Background audio is being transmitted when the local user is not speaking.
Resolution
In most cases, customers do not need to adjust the voice activity detection (VAD) settings of the Vivox SDK. The VAD analyzes the locally captured audio to determine when the local user has started and stopped speaking. Audio is transmitted only during these periods, which limits bandwidth and gives users a better overall experience. If the default VAD settings do not work well for your application, you can adjust them.
Auto VAD
To begin the VAD tuning process, try enabling the Automatically Adjusted VAD before trying your own values for the VAD settings. Do this by enabling vad_auto in the vx_req_aux_set_vad_properties call, as detailed in the examples in the following sections.
With Auto VAD enabled, the VAD logic continually learns from the captured audio and adjusts the hangover, sensitivity, and noise floor values automatically. This learning process takes several seconds before adjustments begin.
When vad_auto is enabled, any values you provide for the other parameters are ignored, and those values are not updated to reflect what the Auto VAD has calculated.
Manual VAD
If the Automatically Adjusted VAD does not help, disable the vad_auto setting and provide your own values for vad_hangover, vad_sensitivity, and vad_noise_floor.
vad_hangover
The hangover time is the time (in milliseconds) that it takes for the VAD to switch back to silence from speech mode after the last speech frame has been detected. The default value is 2000.
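The effect of the hangover time can be pictured with a small gate model. This is a conceptual sketch only, not the Vivox implementation: the hangover_gate_t type, its functions, and the idea of feeding it pre-classified frames of a fixed length are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Conceptual model only; hypothetical names, not part of the Vivox SDK. */
typedef struct {
    int hangover_ms;      /* plays the role of vad_hangover */
    int ms_since_speech;  /* silence accumulated since the last speech frame */
    bool transmitting;    /* true while the gate is open */
} hangover_gate_t;

static void hangover_gate_init(hangover_gate_t *g, int hangover_ms) {
    g->hangover_ms = hangover_ms;
    g->ms_since_speech = 0;
    g->transmitting = false;
}

/* Feed one frame of frame_ms milliseconds that has already been classified
 * as speech or silence. Returns whether the gate is open after this frame. */
static bool hangover_gate_step(hangover_gate_t *g, bool frame_is_speech, int frame_ms) {
    if (frame_is_speech) {
        /* Speech opens the gate and restarts the hangover timer. */
        g->ms_since_speech = 0;
        g->transmitting = true;
    } else if (g->transmitting) {
        /* Silence keeps the gate open until the hangover time has elapsed. */
        g->ms_since_speech += frame_ms;
        if (g->ms_since_speech >= g->hangover_ms) {
            g->transmitting = false;
        }
    }
    return g->transmitting;
}
```

In this model, a shorter hangover closes the gate sooner after speech ends, which reduces trailing background audio but risks clipping the pauses between words.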
vad_sensitivity
The sensitivity is a dimensionless value between 0 and 100 and indicates the sensitivity of the VAD. Increasing this value corresponds to decreasing the sensitivity of the VAD (0 is most sensitive, and 100 is least sensitive). Higher values of sensitivity require louder audio to trigger the VAD. The default value is 43.
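Because the scale is inverted, a settings screen that exposes a "more sensitive" slider needs to flip the value before it is written to vad_sensitivity. A minimal sketch, assuming a hypothetical slider where 0 means least sensitive and 100 means most sensitive (the helper name is not part of the SDK):

```c
/* Hypothetical helper, not part of the Vivox SDK.
 * Maps a UI slider (0 = least sensitive, 100 = most sensitive) onto
 * vad_sensitivity (0 = most sensitive, 100 = least sensitive). */
static int vad_sensitivity_from_slider(int slider_percent) {
    /* Clamp out-of-range input to the documented 0..100 range. */
    if (slider_percent < 0) {
        slider_percent = 0;
    }
    if (slider_percent > 100) {
        slider_percent = 100;
    }
    return 100 - slider_percent;
}
```

With this mapping, a slider position of 57 corresponds to the default vad_sensitivity of 43.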
vad_noise_floor
The noise floor is a dimensionless value between 0 and 20000 that controls how the VAD distinguishes voice activity from background noise. It does not reduce or eliminate background noise when capturing; it is meant to help prevent capture that is triggered by a noisy background environment. The VAD estimates the background noise and then sets the voice activity threshold above that estimate. Lower values assume that the user is in a quieter environment where the audio is only speech. Higher values assume a noisy background environment. The danger of higher noise floor values is that consistent speech can be treated as noise, so the voice activity threshold can rise above actual speech and voice activity is not captured. The default value is 576.
Important: Changes to the VAD noise floor setting do not take effect for currently joined channels. If you expose VAD settings to the end user, either indicate that noise floor changes only take effect in the next voice session, or only allow the noise floor to be changed when the client is not in a channel.
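One way an application might track this on its side is to keep the requested value separate from the value the current session is using. The sketch below is an assumption about application-level bookkeeping, not SDK behavior: the noise_floor_setting_t type and its functions are hypothetical, and the application is expected to call them from its own channel join and leave handling.

```c
#include <stdbool.h>

/* Hypothetical app-side bookkeeping; none of these names exist in the SDK. */
typedef struct {
    int effective;    /* noise floor the current (or next) session uses */
    int requested;    /* most recent value chosen by the user */
    bool in_channel;  /* maintained by the application */
} noise_floor_setting_t;

/* Records a new value. Returns true if it applies right away (not in a
 * channel) and false if the UI should tell the user that the change only
 * takes effect in the next voice session. */
static bool noise_floor_set(noise_floor_setting_t *s, int value) {
    s->requested = value;
    if (!s->in_channel) {
        s->effective = value;
        return true;
    }
    return false;
}

/* Call when the client has left its channels: the deferred value becomes
 * the one the next session starts with. */
static void noise_floor_on_channel_left(noise_floor_setting_t *s) {
    s->in_channel = false;
    s->effective = s->requested;
}

/* Call when the client joins a channel. */
static void noise_floor_on_channel_joined(noise_floor_setting_t *s) {
    s->in_channel = true;
}
```

The return value of noise_floor_set gives the UI a simple signal for when to show a "takes effect in the next session" notice.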
Changing VAD Settings
Core
To make adjustments to the VAD settings, you work with the vx_req_aux_set_vad_properties and vx_req_aux_get_vad_properties structs in the Vivox Core SDK. You can read more about the VAD requests and responses in the Vivox Core API documentation by searching for the following terms:
- vx_req_aux_set_vad_properties
- vx_req_aux_get_vad_properties
Users who are familiar with the Vivox Core SDK can use the same requests and response handling that are used in the rest of the Core SDK. For more information, see Requests and responses in the Vivox Developer Documentation.
Enabling VAD Automatic Parameter Selection
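// Create the set-VAD-properties request.
vx_req_aux_set_vad_properties_t *req;
vx_req_aux_set_vad_properties_create(&req);
// Ignored while vad_auto is enabled, but still required to be valid values.
req->vad_hangover = 2000;
req->vad_sensitivity = 43;
req->vad_noise_floor = 576;
// Enable automatic parameter selection.
req->vad_auto = 1;
vx_issue_request(&req->base);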
Note: Although the VAD ignores the other settings when vad_auto is enabled, valid parameters must be provided or the call will return an error. In this example, the default values for vad_hangover, vad_sensitivity, and vad_noise_floor have been provided.
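Because invalid parameters cause the call to fail even when vad_auto is enabled, it can help to check values before filling in the request. The following is a minimal sketch of such a check, using only the ranges and defaults stated in this article. The app_vad_settings_t struct and both functions are hypothetical application-side helpers, and the lower bound on the hangover time is an assumption, because this article does not give a range for it.

```c
#include <stdbool.h>

/* Hypothetical app-side copy of the four VAD fields; not an SDK type. */
typedef struct {
    int vad_hangover;     /* milliseconds */
    int vad_sensitivity;  /* 0..100 */
    int vad_noise_floor;  /* 0..20000 */
    int vad_auto;         /* 0 = manual, 1 = automatic */
} app_vad_settings_t;

/* Returns the default values listed in this article, with vad_auto off. */
static app_vad_settings_t app_vad_defaults(void) {
    app_vad_settings_t s;
    s.vad_hangover = 2000;
    s.vad_sensitivity = 43;
    s.vad_noise_floor = 576;
    s.vad_auto = 0;
    return s;
}

/* Checks the values against the documented ranges. Runs even when vad_auto
 * is enabled, because the other fields must still be valid. */
static bool app_vad_settings_valid(const app_vad_settings_t *s) {
    if (s->vad_hangover < 0) {
        return false;  /* assumption: a negative duration is never valid */
    }
    if (s->vad_sensitivity < 0 || s->vad_sensitivity > 100) {
        return false;
    }
    if (s->vad_noise_floor < 0 || s->vad_noise_floor > 20000) {
        return false;
    }
    return s->vad_auto == 0 || s->vad_auto == 1;
}
```

An application could run this check on user input and fall back to app_vad_defaults() when it fails, before copying the fields into the request.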
Setting the VAD Hangover, Sensitivity, and Noise Floor
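// Create the set-VAD-properties request.
vx_req_aux_set_vad_properties_t *req;
vx_req_aux_set_vad_properties_create(&req);
// Manual values; replace these defaults with your own tuning.
req->vad_hangover = 2000;
req->vad_sensitivity = 43;
req->vad_noise_floor = 576;
// Disable automatic parameter selection so the values above are used.
req->vad_auto = 0;
vx_issue_request(&req->base);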
Note: In this example, the default values for vad_hangover, vad_sensitivity, and vad_noise_floor have been provided, and vad_auto has been disabled.
Unity
To set the VAD properties, you need to add support for the vx_req_aux_set_vad_properties request from the Core SDK. You can read more about the requests and responses in the Vivox Core API documentation by searching for the following terms:
- vx_req_aux_set_vad_properties
- vx_req_aux_get_vad_properties
By using these requests and handling their responses, you should be able to manipulate the VAD mic gating. For more information, see Requests and responses in the Vivox Developer Documentation. Note that although this information refers to the Core SDK, the Vivox Core Unity source package is designed to follow a very similar flow.
You can find examples of requests and responses in the Unity source package under the VivoxUnity project, which is located in the source release at vivox_unity\vivox_unity.sln.
To use vx_req_aux_set_vad_properties and vx_req_aux_get_vad_properties, you need to add these requests to an existing endpoint or an exposed function of the Unity API or create a new endpoint. To confirm whether your requests are successful, ensure that you handle the responses. You can find numerous examples of request and response handling throughout the source code by searching for how VxClient.Instance.BeginIssueRequest is used. One possible example is the ILoginSession.BeginAccountSetLoginProperties call, in which you can see a simple request and response being handled.
When these calls are added, using the calls is similar to what is shown in the Core example.
Unreal
Currently, the Vivox Unreal SDK does not provide an API to adjust the VAD settings. If you want to modify the code for the Unreal wrapper, you can add support for at least the vx_req_aux_set_vad_properties request from the Core SDK. You can read more about the VAD requests and responses in the Vivox Core API documentation by searching for the following terms:
- vx_req_aux_set_vad_properties
- vx_req_aux_get_vad_properties
By using these requests and handling their responses, you should be able to manipulate the VAD mic gating. For more information, see Requests and responses in the Vivox Developer Documentation.
To use vx_req_aux_set_vad_properties and vx_req_aux_get_vad_properties, you need to add these requests to an existing endpoint or an exposed function of the Unreal API, or create a new endpoint. To confirm whether your requests are successful, ensure that you handle the responses. You can find examples of request and response handling throughout the source code by searching for how VivoxNativeSdk::IssueRequest is used. One possible example is the VivoxNativeSdk::SetActiveAudioInputDevice call, in which you can see a simple request being handled.
When these calls are added, using the calls is similar to what is shown in the Core example.