
Display live subtitles

When interacting with conversational AI in real time, you can enable real-time subtitles to display the conversation content. This page explains how to implement real-time subtitles in your app.

Understand the tech

To simplify subtitle integration, Agora provides an open-source subtitle processing module. By integrating this module into your project and calling its APIs, you can quickly enable real-time subtitles. The following figure illustrates how the subtitle module interacts with your app and Agora SD-RTN™.

Subtitles module workflow

Prerequisites

Before you begin, make sure you have implemented the Conversational AI Engine REST quickstart.

Implementation

This section describes how to receive subtitle content from the subtitle processing module and display it on your app UI.

Integrate the subtitle processing module

Copy the ConversationSubtitleController.kt and MessageParser.kt files into your project, and import the module before calling its API.

Implement subtitle UI rendering logic

Have your subtitle UI component implement the IConversationSubtitleCallback interface, and override the onSubtitleUpdated method to handle message rendering.


    import android.content.Context
    import android.util.AttributeSet
    import android.widget.LinearLayout

    class CovMessageListView @JvmOverloads constructor(
        context: Context,
        attrs: AttributeSet? = null,
        defStyleAttr: Int = 0
    ) : LinearLayout(context, attrs, defStyleAttr), IConversationSubtitleCallback {
        override fun onSubtitleUpdated(subtitle: SubtitleMessage) {
            // Implement your UI rendering logic here
        }
    }
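One way to fill in the rendering logic is to keep a single entry per speaker and turn, and overwrite it whenever a newer callback arrives. The following is a minimal sketch, assuming the same subtitle is delivered more than once while its status is Progress. SubtitleStore and its methods are hypothetical names, not part of the module, and the SubtitleMessage and SubtitleStatus declarations are local stand-ins that mirror the API reference on this page so the snippet runs without Android.

```kotlin
// Local stand-ins mirroring the types in the API reference on this page.
enum class SubtitleStatus { Progress, End, Interrupted }

data class SubtitleMessage(
    val turnId: Long,
    val userId: Int,
    val text: String,
    var status: SubtitleStatus
)

// Hypothetical helper: holds the latest message for each (turnId, userId) pair.
class SubtitleStore {
    // LinkedHashMap keeps keys in the order they first arrived.
    private val latest = LinkedHashMap<Pair<Long, Int>, SubtitleMessage>()

    // Call this from onSubtitleUpdated. Returns true if a new row was added,
    // false if an existing row was overwritten.
    fun upsert(message: SubtitleMessage): Boolean {
        val key = message.turnId to message.userId
        val isNew = key !in latest
        latest[key] = message
        return isNew
    }

    // Snapshot of the rows to hand to the list UI.
    fun rows(): List<SubtitleMessage> = latest.values.toList()
}

fun main() {
    val store = SubtitleStore()
    store.upsert(SubtitleMessage(1L, 0, "Hello", SubtitleStatus.Progress))
    store.upsert(SubtitleMessage(1L, 0, "Hello there", SubtitleStatus.End))
    store.rows().forEach { println("${it.turnId}/${it.userId}: ${it.text} [${it.status}]") }
}
```

The return value of upsert lets the view decide between appending a new row and refreshing an existing one.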

Create a subtitle processing module instance

When entering the call page, create a ConversationSubtitleController instance. It listens for subtitle messages internally and passes the subtitle information to your UI through the onSubtitleUpdated callback of IConversationSubtitleCallback.


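    private lateinit var subRenderController: ConversationSubtitleController

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Keep the controller in a property so it can be reset and released later
        subRenderController = ConversationSubtitleController(
            SubtitleRenderConfig(
                rtcEngine = rtcEngine,
                renderMode = SubtitleRenderMode.Word,
                callback = mBinding?.messageListView
            )
        )
    }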

Release resources

Call the reset method at the end of each call to clean up the cache. When leaving the call page, call release to release resources.


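    // At the end of each call: clear the cached subtitles
    subRenderController.reset()
    // When leaving the call page: release the module's resources
    subRenderController.release()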
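The two cleanup calls belong to different moments in the page lifecycle: reset after each call ends, and release once when the page is left. The sketch below illustrates that split in plain Kotlin. SubtitleController and CallPage are hypothetical stand-ins introduced only for this example; in an Android app you would call the real controller from your own call-end handler and from a teardown callback such as onDestroy.

```kotlin
// Hypothetical stand-in exposing only the two cleanup methods named on this page.
interface SubtitleController {
    fun reset()
    fun release()
}

// Hypothetical model of the call page.
class CallPage(private val controller: SubtitleController) {
    // A call has ended: clear cached subtitles, the controller stays usable.
    fun onCallEnded() = controller.reset()

    // The page is being left: free the controller's resources.
    fun onLeavePage() = controller.release()
}

fun main() {
    val log = mutableListOf<String>()
    val page = CallPage(object : SubtitleController {
        override fun reset() { log.add("reset") }
        override fun release() { log.add("release") }
    })
    page.onCallEnded()   // first call ends
    page.onCallEnded()   // second call ends
    page.onLeavePage()   // user leaves the call page
    println(log)
}
```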

Reference

This section contains content that completes the information on this page, or points you to documentation that explains other aspects of this product.

API Reference

This section provides API reference documentation for the subtitles module.

ConversationSubtitleController


    class ConversationSubtitleController (
        private val config: SubtitleRenderConfig
    )

SubtitleRenderConfig


    data class SubtitleRenderConfig (
        val rtcEngine: RtcEngine,
        val renderMode: SubtitleRenderMode?,
        val callback: IConversationSubtitleCallback?
    )

SubtitleRenderMode


    enum class SubtitleRenderMode {
        Text,
        Word
    }

  • Text: Sentence-by-sentence rendering mode. The subtitle content received by the callback is fully rendered on the UI.
  • Word: Word-by-word rendering mode. The subtitle content received by the callback is rendered word by word on the UI.
caution

Word-by-word rendering mode (Word) requires a TTS vendor that supports word-by-word output. Otherwise, the module automatically falls back to sentence-by-sentence rendering mode (Text).

IConversationSubtitleCallback

The callback interface for subtitle content update events.


    interface IConversationSubtitleCallback {
        fun onSubtitleUpdated(subtitle: SubtitleMessage)
    }

  • onSubtitleUpdated: Subtitle update callback.

SubtitleMessage


    data class SubtitleMessage(
        val turnId: Long,
        val userId: Int,
        val text: String,
        var status: SubtitleStatus
    )

  • turnId: The identifier of the conversation turn. Each turn between the user and the agent has one turnId, assigned according to these rules:

    • turnId = 0: The agent's welcome message. There is no user subtitle for this turn.
    • turnId ≥ 1: The user's or agent's subtitles for that turn. Use userId to display the user's subtitles before the agent's subtitles, then repeat for the next turn (turnId + 1).
    caution

    Callbacks are not guaranteed to arrive in strictly increasing turnId order. If messages arrive out of order, implement the sorting logic yourself.

  • userId: The user ID associated with this subtitle message. In the current version, 0 represents the user and any non-zero value is the agent's ID.

  • text: Subtitle text content.

  • status: The current status of the subtitles. See SubtitleStatus for details.
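Because callbacks may arrive out of turnId order, the display list needs its own ordering. The sketch below shows one possible comparator built from the rules above: sort by turnId, then place the user (userId 0) ahead of the agent within the same turn. displayOrder and sortForDisplay are hypothetical names, and the type declarations are local stand-ins that mirror this API reference so the snippet runs on its own.

```kotlin
// Local stand-ins mirroring the types in this API reference.
enum class SubtitleStatus { Progress, End, Interrupted }

data class SubtitleMessage(
    val turnId: Long,
    val userId: Int,
    val text: String,
    var status: SubtitleStatus
)

// Orders by turnId first, then puts the user (userId == 0) before the agent.
val displayOrder: Comparator<SubtitleMessage> =
    compareBy<SubtitleMessage> { it.turnId }
        .thenBy { if (it.userId == 0) 0 else 1 }

// Returns a new list in display order; the input list is left unchanged.
fun sortForDisplay(messages: List<SubtitleMessage>): List<SubtitleMessage> =
    messages.sortedWith(displayOrder)

fun main() {
    // Messages in the (out-of-order) sequence they happened to arrive in.
    val received = listOf(
        SubtitleMessage(2L, 0, "And tomorrow?", SubtitleStatus.Progress),
        SubtitleMessage(1L, 7, "It is sunny today.", SubtitleStatus.End),
        SubtitleMessage(0L, 7, "Hi, how can I help?", SubtitleStatus.End),
        SubtitleMessage(1L, 0, "What's the weather?", SubtitleStatus.End)
    )
    sortForDisplay(received).forEach {
        println("turn ${it.turnId}, user ${it.userId}: ${it.text}")
    }
}
```

The agent ID 7 is an arbitrary example value; only the distinction between zero and non-zero matters to the comparator.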

SubtitleStatus

Use SubtitleStatus for special UI processing based on the status, such as displaying an interruption mark at the end of the subtitle.


    enum class SubtitleStatus {
        Progress,
        End,
        Interrupted
    }

  • Progress: The subtitles are still being generated; the user or agent has not finished speaking.
  • End: The subtitle generation is complete; the user or agent has finished speaking.
  • Interrupted: The subtitles were interrupted before completion; the user actively stopped the agent’s response.
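As an illustration of status-based UI handling, the sketch below decorates the subtitle text differently for each status. displayText is a hypothetical helper, the trailing "..." and "[interrupted]" markers are arbitrary choices for this example, and the enum is a local stand-in mirroring the declaration above.

```kotlin
// Local stand-in mirroring the enum in this API reference.
enum class SubtitleStatus { Progress, End, Interrupted }

// Hypothetical formatter: returns the text to show for a subtitle in a given status.
fun displayText(text: String, status: SubtitleStatus): String = when (status) {
    SubtitleStatus.Progress -> "${text}..."              // still being generated
    SubtitleStatus.End -> text                           // complete, shown as-is
    SubtitleStatus.Interrupted -> "$text [interrupted]"  // cut off before completion
}

fun main() {
    println(displayText("Let me explain", SubtitleStatus.Progress))
    println(displayText("Let me explain", SubtitleStatus.Interrupted))
}
```

Because the when expression covers every enum value without an else branch, the compiler flags this function if a new status is added later.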