Enabling voice search in Cobalt

Cobalt enables voice search through either:

  1. A subset of the MediaRecorder Web API.
  2. A subset of the Speech Recognition Web API.

Only one of the two can be used at a time. We recommend the MediaRecorder API, because we are considering deprecating the Speech Recognition API.

In both approaches, web apps decide whether to enable voice control by calling the MediaDevices.enumerateDevices() Web API function. Cobalt implements that function by calling a subset of the Starboard SbMicrophone API.
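From the web app's side, this check might look like the following TypeScript sketch. The names hasAudioInput and shouldEnableVoiceSearch, and the minimal DeviceInfoLike and MediaDevicesLike interfaces, are hypothetical and introduced only for illustration; a real app would pass in navigator.mediaDevices. The sketch assumes a microphone shows up as an entry whose kind is "audioinput", as in the standard MediaDevices API.

```typescript
// Minimal shapes of the Web API objects used here (hypothetical names),
// declared locally so the sketch does not depend on DOM typings.
interface DeviceInfoLike {
  kind: string; // "audioinput", "audiooutput", or "videoinput" in the Web API
}

interface MediaDevicesLike {
  enumerateDevices(): Promise<DeviceInfoLike[]>;
}

// Returns true if at least one enumerated device is an audio input.
function hasAudioInput(devices: DeviceInfoLike[]): boolean {
  return devices.some((device) => device.kind === "audioinput");
}

// Asks the platform for its devices and decides whether to show the mic UI.
// A failed enumeration is treated as "no microphone" (an assumption of this sketch).
async function shouldEnableVoiceSearch(
  mediaDevices: MediaDevicesLike
): Promise<boolean> {
  try {
    return hasAudioInput(await mediaDevices.enumerateDevices());
  } catch (_error) {
    return false;
  }
}
```

Keeping the filtering step in a separate synchronous function makes the decision easy to unit test without a real device list.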

MediaRecorder API

To enable the MediaRecorder API in Cobalt, the complete SbMicrophone API must be implemented, and SbSpeechRecognizerIsSupported() must return false.
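Once the platform side is in place, a web app drives recording through the MediaRecorder interface. The TypeScript sketch below collects audio chunks using only start(), stop(), and ondataavailable from the standard MediaRecorder API; whether these are exactly the members covered by Cobalt's subset is an assumption, not something this document states. RecorderLike and collectChunks are hypothetical names, and the recorder is passed in so the sketch does not depend on DOM typings.

```typescript
// Minimal shape of a MediaRecorder-style object (hypothetical interface).
interface RecorderLike<Chunk> {
  ondataavailable: ((event: { data: Chunk }) => void) | null;
  start(timesliceMs?: number): void;
  stop(): void;
}

// Starts recording and returns a function that stops the recorder and
// hands back the array of chunks. Any chunk delivered after stop()
// returns is still appended to that same array.
function collectChunks<Chunk>(
  recorder: RecorderLike<Chunk>,
  timesliceMs: number
): () => Chunk[] {
  const chunks: Chunk[] = [];
  recorder.ondataavailable = (event) => {
    chunks.push(event.data);
  };
  recorder.start(timesliceMs);
  return () => {
    recorder.stop();
    return chunks;
  };
}
```

In the standard Web API, the recorder would typically come from new MediaRecorder(stream), with the stream obtained from navigator.mediaDevices.getUserMedia({ audio: true }).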

Speech Recognition API

To support this API, platforms must implement the Starboard SbSpeechRecognizer API as well as a subset of the SbMicrophone API.

Specific instructions to enable voice search

  1. Implement SbSpeechRecognizerIsSupported() to return true, and implement the SbSpeechRecognizer API.

  2. Implement the following subset of the SbMicrophone API:

    • SbMicrophoneGetAvailable()
    • SbMicrophoneCreate()
    • SbMicrophoneDestroy()

    In particular, SbMicrophoneCreate() must return a valid microphone. It is okay to stub out the other functions, e.g. have SbMicrophoneOpen() return false.

  3. The YouTube app will display the mic icon on the search page when it detects valid microphone input devices using MediaDevices.enumerateDevices().

  4. With SbSpeechRecognizerIsSupported() implemented to return true, Cobalt will use the platform's Starboard SbSpeechRecognizer API implementation and will not read from the microphone directly via the Starboard SbMicrophone API.

Differences from versions of Cobalt <= 11

In previous versions of Cobalt, there was no way to dynamically disable speech support other than modifying common Cobalt code to stub out the Speech Recognition API when the platform did not support microphone input. This is no longer necessary: web apps should now rely on MediaDevices.enumerateDevices() to determine whether voice support is enabled.