
Migrate from Real-Time STT 5.x

Version 5.x of Real-Time STT focused on flexibility, reserving many parameters and capabilities, such as cross-channel subtitle push, for future expansion. This flexibility made integration more complex. In contrast, v6.x prioritizes simplicity: it removes redundant fields, including the destination fields, and adds new features such as support for UID language functions.

info

Real-Time STT v5.x has entered a frozen development state. Agora strongly recommends migrating to v6.x.

This page explains how to migrate from v5.x to v6.x.

What has changed

Real-Time STT v6.x takes a more streamlined approach by removing certain functions that were available in v5.x. Here’s what has changed:

  1. Cross-channel audio subscription and subtitle push removed

    • v5.x: You could subscribe to audio from Channel A and push the transcribed or translated subtitles to Channel B.
    • v6.x: This is no longer supported. Real-Time STT v6.x only subscribes to audio in a single channel and pushes subtitles to that same channel.
  2. Multi-OSS/S3 storage for subtitle files removed

    • v5.x: You could store subtitle files in multiple OSS/S3 storage locations.
    • v6.x: You can only upload subtitles to a single OSS/S3 storage location.
  3. Redundant fields removed

    • Some unnecessary fields, such as config.recognizeConfig.model and audio.subscribeSource, have been removed to simplify the Real-Time STT REST API.

These changes aim to reduce complexity and make Real-Time STT more efficient.

Migrate to v6.x manually

The following figure highlights the v5.x configuration changes.

STT version comparison

  1. Channel and user authentication changes

    • v5.x: channelName, uid, and token are specified under agoraRtcConfig.
    • v6.x: Replaced by channelName, subBotUid, and subBotToken inside the rtcConfig structure.
  2. Idle time setting

    • v5.x: maxIdleTime is specified under agoraRtcConfig.
    • v6.x: Replaced by the maxIdleTime field at the top level of the structure.
  3. Language configuration

    • v5.x: language, under recognizeConfig, is a single comma-separated string that lists all languages.
    • v6.x: Replaced by the languages array at the top level. Each language is a separate item in the array.
  4. Publisher bot authentication

    • v5.x: uid and token are specified under agoraRTCDataStream.
    • v6.x: Replaced by pubBotUid and pubBotToken in the rtcConfig structure.
  5. Storage configuration changes

    • v5.x: storageConfig is under config.recognizeConfig.output.cloudStorage.
    • v6.x: Replaced by captionConfig.storage, with all fields mapped one-to-one.
  6. Translation configuration

    • v5.x: The translateConfig structure is under config.
    • v6.x: Directly replaced by the translateConfig structure at the top level.

The equivalent configuration for v6.x is as follows:


{
  "languages": [
    "en-US"
  ],
  "maxIdleTime": 21600,
  "rtcConfig": {
    "channelName": "agora-test",
    "subBotToken": "1111-token",
    "subBotUid": "1111",
    "pubBotUid": "2222",
    "pubBotToken": "2222-token"
  },
  "translateConfig": {
    "languages": [
      {
        "source": "en-US",
        "target": [
          "zh-CN"
        ]
      }
    ]
  },
  "captionConfig": {
    "storage": {
      "accessKey": "your_storage_access_key",
      "secretKey": "your_storage_secret_key",
      "bucket": "your_storage_bucket_name",
      "vendor": 1,
      "region": 0,
      "fileNamePrefix": [
        "test-file"
      ]
    }
  }
}

Migrate using a Go script

To convert a v5.x REST API request body to v6.x, Agora provides a Go script. To use this script:

  1. Replace the highlighted area in the rectangular box with your v5.x REST API request body.
  2. Click Run to generate the v6.x request body.
  3. Use the v6.x REST API to start an STT task and compare it with the v5.x version to check for differences.

Go script
