Display live subtitles

When interacting with conversational AI in real time, you can enable real-time subtitles to display the conversation content. This page explains how to implement real-time subtitles in your app.

Understand the tech

To simplify subtitle integration, Agora provides an open-source subtitle processing module. By integrating this module into your project and calling its APIs, you can quickly enable real-time subtitles. The following figure illustrates how the subtitle module interacts with your app and Agora SD-RTN™.

Subtitles module workflow


Before you begin, make sure you have implemented the Conversational AI Engine REST quickstart.


This section describes how to receive subtitle content from the subtitle processing module and display it on your app UI.

Integrate the subtitle processing module

Copy the ConversationSubtitleController.kt and MessageParser.kt files to your project, and import the module before calling its API.

Implement subtitle UI rendering logic

Have your subtitle UI component implement the IConversationSubtitleCallback interface, and override the onSubtitleUpdated method to handle the message rendering logic.

import android.content.Context
import android.util.AttributeSet
import android.widget.LinearLayout

class CovMessageListView @JvmOverloads constructor(
    context: Context,
    attrs: AttributeSet? = null,
    defStyleAttr: Int = 0
) : LinearLayout(context, attrs, defStyleAttr), IConversationSubtitleCallback {
    override fun onSubtitleUpdated(subtitle: SubtitleMessage) {
        // Implement your UI rendering logic here
    }
}

Create a subtitle processing module instance

When entering the call page, create a ConversationSubtitleController instance. The instance listens for subtitle messages internally and passes the subtitle information to your UI through the onSubtitleUpdated callback of IConversationSubtitleCallback.

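override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    // messageListView is a placeholder for your IConversationSubtitleCallback implementation
    val config = SubtitleRenderConfig(rtcEngine = rtcEngine, renderMode = SubtitleRenderMode.Word, callback = messageListView)
    val subRenderController = ConversationSubtitleController(config)
}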

Release resources

Call the reset method at the end of each call to clean up the cache. When leaving the call page, call release to release resources.
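
The two cleanup points described above can be sketched as follows. This is a minimal illustration, not the module's code: SubtitleLifecycle and CallPage are hypothetical stand-ins so the example compiles on its own, and only the reset and release method names come from this page. In a real app, the object passed in would be your ConversationSubtitleController instance.

```kotlin
// Hypothetical stand-in for the two cleanup methods named on this page.
interface SubtitleLifecycle {
    fun reset()   // clear cached subtitles at the end of a call
    fun release() // free resources when leaving the call page
}

// Hypothetical call page that owns the subtitle controller.
class CallPage(private val subtitles: SubtitleLifecycle) {
    private var released = false

    // Call when a conversation ends but the page stays open.
    fun onCallEnded() {
        if (!released) subtitles.reset()
    }

    // Call from the page's teardown path, for example Activity.onDestroy().
    fun onLeavePage() {
        if (released) return
        released = true
        subtitles.release()
    }
}
```

The released flag is a defensive choice in this sketch: it keeps a late end-of-call event from touching a controller that has already been released.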



This section contains content that completes the information on this page, or points you to documentation that explains other aspects of this product.

API Reference

This section provides API reference documentation for the subtitles module.


class ConversationSubtitleController(
    private val config: SubtitleRenderConfig
)


data class SubtitleRenderConfig(
    val rtcEngine: RtcEngine,
    val renderMode: SubtitleRenderMode?,
    val callback: IConversationSubtitleCallback?
)


enum class SubtitleRenderMode { Text, Word }

  • Text: Sentence-by-sentence rendering mode. The subtitle content received by the callback is fully rendered on the UI.
  • Word: Word-by-word rendering mode. The subtitle content received by the callback is rendered word by word on the UI.

The word-by-word rendering mode (Word) requires that your chosen TTS vendor support word-by-word output. Otherwise, the module automatically falls back to the sentence-by-sentence rendering mode (Text).


The callback interface for subtitle content update events.

interface IConversationSubtitleCallback {
    fun onSubtitleUpdated(subtitle: SubtitleMessage)
}

  • onSubtitleUpdated: Subtitle update callback. Receives the updated subtitle as a SubtitleMessage.


data class SubtitleMessage(
    val turnId: Long,
    val userId: Int,
    val text: String,
    var status: SubtitleStatus
)

  • turnId: The identifier of the conversation turn. One conversation turn between the user and the agent corresponds to one turnId, according to these rules:

    • turnId = 0: The agent's welcome message. There is no user subtitle for this turn.
    • turnId ≥ 1: The user's or agent's subtitles for that turn. Within a turn, use userId to display the user's subtitles before the agent's subtitles, then repeat the process for the next turn (turnId + 1).

    Callbacks are not guaranteed to arrive in strictly increasing turnId order. If subtitles arrive out of order, implement the sorting logic yourself.

  • userId: The user ID associated with this subtitle message. In the current version, 0 represents the user and any non-zero value is the agent ID.

  • text: Subtitle text content.

  • status: The current status of the subtitles. See SubtitleStatus for details.
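
The turnId ordering rules above can be handled with a small ordered buffer. The following is a minimal sketch under stated assumptions, not part of the module: SubtitleTimeline is a hypothetical helper, the two types are local copies of the ones documented on this page so the example compiles on its own, and it assumes each callback carries the latest full text for its turn and speaker, so a newer message replaces the older one.

```kotlin
// Local mirrors of the types documented on this page, so the sketch compiles on its own.
enum class SubtitleStatus { Progress, End, Interrupted }

data class SubtitleMessage(
    val turnId: Long,
    val userId: Int,
    val text: String,
    var status: SubtitleStatus
)

// Hypothetical helper: keeps one entry per (turnId, userId) in display order.
class SubtitleTimeline {
    private val items = mutableListOf<SubtitleMessage>()

    // userId == 0 is the user, who is shown before the agent within a turn.
    private fun speakerRank(m: SubtitleMessage) = if (m.userId == 0) 0 else 1

    // Order by turn first, then user before agent.
    private val displayOrder =
        compareBy<SubtitleMessage>({ it.turnId }, { speakerRank(it) })

    // Replace the entry for this turn and speaker, or insert a new one in sorted position.
    fun upsert(message: SubtitleMessage) {
        val existing = items.indexOfFirst {
            it.turnId == message.turnId && it.userId == message.userId
        }
        if (existing >= 0) {
            items[existing] = message
            return
        }
        val insertAt = items.indexOfFirst { displayOrder.compare(it, message) > 0 }
        if (insertAt >= 0) items.add(insertAt, message) else items.add(message)
    }

    // Read-only copy for the UI to render.
    fun snapshot(): List<SubtitleMessage> = items.toList()
}
```

Calling upsert from onSubtitleUpdated and rendering snapshot() keeps the list correct even when a message for an earlier turn arrives late. The linear scan is adequate for the short lists typical of a single call.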


Use SubtitleStatus for special UI processing based on the status, such as displaying an interruption mark at the end of the subtitle.

enum class SubtitleStatus { Progress, End, Interrupted }

  • Progress: The subtitles are still being generated; the user or agent has not finished speaking.
  • End: The subtitle generation is complete; the user or agent has finished speaking.
  • Interrupted: The subtitles were interrupted before completion; the user actively stopped the agent’s response.
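
As an illustration of status-based UI processing, the sketch below maps each status to the text shown on screen. It is only an example: displayText is a hypothetical function, the enum is a local copy of the one documented above, and the trailing "..." and " [interrupted]" markers are arbitrary choices that you would replace with your own UI treatment.

```kotlin
// Local mirror of the enum documented on this page, so the sketch compiles on its own.
enum class SubtitleStatus { Progress, End, Interrupted }

// Hypothetical formatter: decorates the subtitle text according to its status.
fun displayText(text: String, status: SubtitleStatus): String = when (status) {
    SubtitleStatus.Progress -> "${text}..."            // still being generated
    SubtitleStatus.End -> text                         // complete, shown as-is
    SubtitleStatus.Interrupted -> "$text [interrupted]" // cut off by the user
}
```

Because the when expression covers every enum value without an else branch, the compiler flags this function if a new status is added later.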