Interrupt agent

When interacting with an agent, you may need to interrupt the agent to begin a new round of conversation. The Agora Conversational AI Engine supports agent interruption in the following ways:

  • Voice interruption: The engine detects user voice input and automatically stops the agent’s response.
  • Manual interruption: Your app can explicitly stop the agent by calling a REST API or client SDK method—typically triggered by a button tap or custom command.

This page describes how to implement agent interruption in your app.

Voice interruption

Conversational AI Engine supports an intelligent interruption feature that allows a user's voice input to automatically interrupt the speaking agent. This enables quicker response times and more natural, fluid interactions.

To customize interruption behavior, configure the turn_detection parameters when calling Start a conversational AI agent.

To enable AIVAD-based interruption, set turn_detection.config.end_of_speech.mode to "semantic".

The following example shows how to configure turn_detection:


curl --request POST \
  --url https://api.agora.io/api/conversational-ai-agent/v2/projects/:appid/join \
  --header 'Authorization: Basic <your_base64_encoded_credentials>' \
  --data '
{
  "name": "unique_name",
  "properties": {
    "channel": "channel_name",
    "token": "token",
    "agent_rtc_uid": "1001",
    "remote_rtc_uids": ["1002"],
    "turn_detection": {
      "mode": "default",
      "config": {
        "speech_threshold": 0.5,
        "start_of_speech": {
          "mode": "vad",
          "vad_config": {
            "interrupt_duration_ms": 160,
            "speaking_interrupt_duration_ms": 320,
            "prefix_padding_ms": 800
          }
        },
        "end_of_speech": {
          "mode": "semantic",
          "semantic_config": {
            "silence_duration_ms": 320,
            "max_wait_ms": 3000
          }
        }
      }
    },
    "llm": {
      "url": "https://api.openai.com/v1/chat/completions",
      "api_key": "",
      "system_messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        }
      ],
      "params": {
        "model": "gpt-4o-mini"
      }
    },
    "tts": {
      "vendor": "microsoft",
      "params": {
        "key": "",
        "region": "eastus",
        "voice_name": "en-US-AndrewMultilingualNeural"
      }
    },
    "asr": {
      "language": "en-US"
    }
  }
}'

If the request succeeds, the API returns a 200 status code and a response body like the following:


{
  "agent_id": "1NT29X10YHxxxxxWJOXLYHNYB",
  "create_ts": 1737111452,
  "status": "RUNNING"
}
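
The Authorization header in the request above carries Base64-encoded credentials. As a minimal sketch, the header value can be built as follows. It assumes the standard HTTP Basic scheme (an ID and a secret joined by a colon, then Base64-encoded); `customerId` and `customerSecret` are placeholder names for your REST credentials, not identifiers taken from this page.

```kotlin
import java.util.Base64

// Builds the value of an HTTP Basic Authorization header:
// "Basic " + base64("<id>:<secret>").
// customerId and customerSecret are placeholder names (assumption).
fun basicAuthHeader(customerId: String, customerSecret: String): String {
    val raw = "$customerId:$customerSecret".toByteArray(Charsets.UTF_8)
    return "Basic " + Base64.getEncoder().encodeToString(raw)
}

fun main() {
    // Prints the full header value to pass after "Authorization: ".
    println(basicAuthHeader("my_customer_id", "my_customer_secret"))
}
```

Because this value is derived from a secret, it is usually safer to build it on your own server than inside a client app.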

Manual interruption

You can also trigger an interruption explicitly by calling the RESTful API or the client toolkit API. This lets users interrupt the agent with a button click or a specific command.

Call the RESTful API

Use the Interrupt agent API to manually initiate an interruption request.


curl --request POST \
  --url https://api.agora.io/api/conversational-ai-agent/v2/projects/:appid/agents/:agentId/interrupt \
  --header 'Authorization: Basic <credentials>' \
  --data '{}'

If the request is successful, the API returns a 200 status code and the following response body:


{
  "agent_id": "1NT29XxxxxxxxxELWEHC8OS",
  "channel": "test_channel",
  "start_ts": 1744877089
}
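
For a backend that triggers the interruption programmatically, the same request can be sketched in Kotlin with the JDK's built-in `java.net.http` client (JDK 11 or later). The function names `buildInterruptRequest` and `interruptAgent` are illustrative, and the `Content-Type: application/json` header is an assumption that does not appear in the curl example.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Builds the interrupt request from the curl example.
// appId, agentId, and authHeader are supplied by your app.
fun buildInterruptRequest(appId: String, agentId: String, authHeader: String): HttpRequest =
    HttpRequest.newBuilder()
        .uri(
            URI.create(
                "https://api.agora.io/api/conversational-ai-agent/v2/projects/$appId/agents/$agentId/interrupt"
            )
        )
        .header("Authorization", authHeader)
        // Assumption: the endpoint accepts a JSON content type.
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{}"))
        .build()

// Sends the request synchronously and reports whether the API returned 200.
fun interruptAgent(client: HttpClient, request: HttpRequest): Boolean {
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    return response.statusCode() == 200
}
```

A typical call would be `interruptAgent(HttpClient.newHttpClient(), buildInterruptRequest(appId, agentId, authHeader))`. Building the request separately from sending it keeps the URL and header logic easy to check without a network call.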

Call the client toolkit API

Agora provides a set of flexible, scalable, and standardized client components for Conversational AI Engine. These components support iOS, Android, and Web, and encapsulate scenario-based APIs. You can use them to integrate Agora Real-Time Communication (RTC) and Signaling capabilities, enabling the following features:

  • Interrupt the agent
  • Display real-time transcript
  • Receive event notifications
  • Optimize audio (Android and iOS only)

Before you begin, make sure you:

  • Integrate Video SDK v4.5.1 or later and follow the Quickstart guide to implement basic real-time audio and video features.
  • Enable the Signaling service for your project in the Agora Console and follow the Signaling Quickstart to implement real-time messaging.
  • Implement the basic logic to communicate with a Conversational AI agent.
  • Ensure that the RTC engine instance is initialized and Signaling is logged in. The toolkit does not handle initialization, lifecycle management, authentication, or login for Video SDK or Signaling.

Integrate the toolkit

Copy the convoaiApi folder to your project and import it before calling API methods. Refer to the component structure to understand the role of each file.

Initialize the component

Create a configuration object for the RTC engine and Signaling client instances, then use it to initialize the component instance.

// Create a configuration object for the RTC and RTM instances
val config = ConversationalAIAPIConfig(
    rtcEngine = rtcEngineInstance,
    rtmClient = rtmClientInstance,
    enableLog = true
)

// Create the component instance
val api = ConversationalAIAPIImpl(config)

Configure the conversational AI agent

Call Start a conversational AI agent using the following parameter settings:

  • advanced_features.enable_rtm: true: Start the Signaling service (Required)
  • parameters.data_channel: "rtm": Enable the RTM data transmission channel (Required)
  • parameters.enable_metrics: true: Receive agent performance data (Optional)
  • parameters.enable_error_message: true: Receive agent error events (Optional)
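
As a rough illustration, the settings above might appear in the request body like this. The sketch assumes that `advanced_features` and `parameters` both nest under `properties`, alongside fields such as `channel` and `turn_detection`; confirm the exact placement in the API reference. The constant name `toolkitProperties` is illustrative.

```kotlin
// Fragment of the request body for "Start a conversational AI agent".
// Assumption: both objects nest under "properties"; verify against the
// API reference before relying on this layout.
val toolkitProperties = """
    {
      "properties": {
        "advanced_features": {
          "enable_rtm": true
        },
        "parameters": {
          "data_channel": "rtm",
          "enable_metrics": true,
          "enable_error_message": true
        }
      }
    }
""".trimIndent()

fun main() {
    // Print the fragment so it can be merged into the full request body.
    println(toolkitProperties)
}
```

The last two fields are the optional ones; omit them if your app does not consume metrics or error events.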

After the call is successful, the agent joins the specified RTC channel and the user can start interacting with the agent.

Interrupt the agent

Call the interrupt method to interrupt the agent.

api.interrupt("agentId") { error -> /* ... */ }

Destroy the component

When the agent interaction ends, destroy the component instance to release all resources.

api.destroy()

Reference

Sample project

Agora provides a sample project for your reference. Download or view the source code for a complete example.

Component structure

The structure of the client component folder and the functions of each file are as follows:

Copy only the following files and folders to integrate the client component. You do not need to copy other files.

  • IConversationalAIAPI.kt: API interface and related data structures and enumerations
  • ConversationalAIAPIImpl.kt: ConversationalAI API main implementation logic
  • ConversationalAIUtils.kt: Utility functions and event callback management
  • subRender/
    • v3/: Transcript module
      • TranscriptionController.kt: Transcript controller
      • MessageParser.kt: Message parser

API reference

RESTful API

Toolkit