Start a conversational AI agent

POST

https://api.agora.io/api/conversational-ai-agent/v2/projects/{appid}/join

Use this endpoint to create and start a Conversational AI agent instance.

Request

Path parameters

appid string required

The App ID of the project.

Request body

APPLICATION/JSON

BODY required

name string required
The unique identifier of the agent. The same identifier cannot be used repeatedly.
properties object required
Configuration details of the agent.
tts object required
Text-to-speech (TTS) module configuration.
llm object required
Large language model (LLM) configuration.
vad object nullable
Voice Activity Detection (VAD) configuration.
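
To illustrate how these fields fit together, the following Python sketch assembles a request body. The helper name build_join_body is hypothetical and not part of any SDK, and the placement of tts, llm, and vad inside properties is inferred from the request example on this page rather than stated in the field list.

```python
from typing import Any, Dict, Optional


def build_join_body(
    name: str,
    tts: Dict[str, Any],
    llm: Dict[str, Any],
    vad: Optional[Dict[str, Any]] = None,
    **other_properties: Any,
) -> Dict[str, Any]:
    """Assemble a request body (hypothetical helper, not an official API)."""
    if not name:
        raise ValueError("name is required and must be non-empty")
    if not tts or not llm:
        raise ValueError("properties.tts and properties.llm are required")

    # Extra keyword arguments (channel, token, ...) become entries in properties.
    properties: Dict[str, Any] = dict(other_properties)
    properties["tts"] = tts
    properties["llm"] = llm
    if vad is not None:
        # vad is nullable, so it is left out entirely when not provided.
        properties["vad"] = vad
    return {"name": name, "properties": properties}
```

Failing early on a missing required field keeps an obviously invalid request from reaching the service.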

Response

If the returned status code is 200, the request was successful. The response body contains the result of the request.
OK
- agent_id string
  Unique ID of the agent instance.
- create_ts integer
  Timestamp of when the agent was created.
- status string
  Current status of the agent.
  Possible values: IDLE, STARTING, RUNNING, STOPPING, STOPPED, RECOVERING, FAILED
  IDLE (0): The agent is idle.
  STARTING (1): The agent is starting.
  RUNNING (2): The agent is running.
  STOPPING (3): The agent is stopping.
  STOPPED (4): The agent has exited.
  RECOVERING (5): The agent is recovering.
  FAILED (6): The agent failed to execute.
If the returned status code is not 200, the request failed. The response body includes the detail and reason for failure. Refer to status codes to understand the possible reasons for failure.
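
A client might handle the two outcomes described above along the following lines. This is a minimal sketch: AgentStatus, JoinError, and parse_join_response are hypothetical names, and the numeric codes are copied from the status list in this section.

```python
from enum import IntEnum
from typing import Any, Dict


class AgentStatus(IntEnum):
    """Status names and numeric codes as listed in the response reference."""
    IDLE = 0
    STARTING = 1
    RUNNING = 2
    STOPPING = 3
    STOPPED = 4
    RECOVERING = 5
    FAILED = 6


class JoinError(Exception):
    """Raised for any non-200 response (hypothetical error type)."""


def parse_join_response(http_status: int, body: Dict[str, Any]) -> AgentStatus:
    """Return the agent status on success, or raise with the failure body."""
    if http_status != 200:
        # The body is expected to carry the detail and reason for the failure.
        raise JoinError(f"join failed with HTTP {http_status}: {body}")
    # Look the status up by name, for example "RUNNING" -> AgentStatus.RUNNING.
    return AgentStatus[body["status"]]
```

Looking the status up by name means an unexpected value raises KeyError instead of passing through silently.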

Reference

TTS vendor configuration

Conversational AI Engine supports the following TTS vendors:

Microsoft

params required

key string required
The API key used for authentication.
region string required
The Azure region where the speech service is hosted.
voice_name string
The identifier for the selected voice for speech synthesis.
rate number
Indicates the speaking rate of the text. The rate can be applied at the word or sentence level and should be between 0.5 and 2.0 times the original audio speed.
volume number
Default: 100
Specifies the audio volume as a number between 0.0 and 100.0, where 0.0 is the quietest and 100.0 is the loudest. For example, a value of 75 sets the volume to 75% of the maximum.
sample_rate integer
Default: 24000
Specifies the audio sampling rate in Hz.

For further details, refer to Microsoft TTS.

Sample configuration

{
    "vendor": "microsoft",
    "params": {
        "key": "<your_microsoft_key>",
        "region": "eastus",
        "voice_name": "en-US-AndrewMultilingualNeural",
        "rate": 1.0,
        "volume": 70
    }
}
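
The documented ranges for rate and volume can be checked on the client before a request is sent. The sketch below is one way to do that. The function name microsoft_tts_config is hypothetical, and whether the service itself rejects out-of-range values is not stated here.

```python
from typing import Any, Dict, Optional


def microsoft_tts_config(
    key: str,
    region: str,
    voice_name: Optional[str] = None,
    rate: Optional[float] = None,
    volume: Optional[float] = None,
    sample_rate: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a Microsoft TTS block, checking the documented value ranges."""
    if rate is not None and not 0.5 <= rate <= 2.0:
        raise ValueError("rate must be between 0.5 and 2.0")
    if volume is not None and not 0.0 <= volume <= 100.0:
        raise ValueError("volume must be between 0.0 and 100.0")

    params: Dict[str, Any] = {"key": key, "region": region}
    optional = {
        "voice_name": voice_name,
        "rate": rate,
        "volume": volume,
        "sample_rate": sample_rate,
    }
    # Unset optional fields are omitted rather than sent as null.
    params.update({k: v for k, v in optional.items() if v is not None})
    return {"vendor": "microsoft", "params": params}
```

Calling microsoft_tts_config("<your_microsoft_key>", "eastus", voice_name="en-US-AndrewMultilingualNeural", rate=1.0, volume=70) yields the same structure as the sample configuration above.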

ElevenLabs

params required

key string required
The API key used for authentication.
model_id string required
Identifier of the model to be used.
voice_id string required
The identifier for the selected voice for speech synthesis.
sample_rate integer
Default: 24000
Specifies the audio sampling rate in Hz.
stability number
The stability for voice settings.
similarity_boost number
style number
use_speaker_boost boolean

For further details, refer to ElevenLabs TTS.

Sample configuration

{
  "vendor": "elevenlabs",
  "params": {
    "key": "<your_elevenlabs_key>",
    "model_id": "eleven_flash_v2_5",
    "voice_id": "pNInz6obpgDQGcFmaJgB"
  }
}

Authorization

This endpoint requires Basic Auth.
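
With HTTP Basic authentication, the Authorization header carries the Base64 encoding of a username and password joined by a colon. The Python sketch below shows that encoding. Which credentials to use as the username and password is not specified in this section, so both arguments are placeholders, and basic_auth_header is a hypothetical helper name.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value for an HTTP Basic Authorization header."""
    # Join the two parts with a colon, then Base64-encode the UTF-8 bytes.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

The part after "Basic " is what the request example refers to as <your_base64_encoded_credentials>.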

Request example

curl

curl --request POST \
--url https://api.agora.io/api/conversational-ai-agent/v2/projects/:appid/join \
--header 'Authorization: Basic <your_base64_encoded_credentials>' \
--header 'Content-Type: application/json' \
--data '
{
    "name": "unique_name",
    "properties": {
        "channel": "channel_name",
        "token": "token",
        "agent_rtc_uid": "friday",
        "remote_rtc_uids": [
            "*"
        ],
        "enable_string_uid": true,
        "idle_timeout": 120,
        "advanced_features": {
            "enable_aivad": true
        },
        "llm": {
            "url": "https://api.openai.com/v1/chat/completions",
            "api_key": "<your_llm_key>",
            "system_messages": [
                {
                    "role": "system",
                    "content": "You are a helpful chatbot."
                }
            ],
            "max_history": 32,
            "greeting_message": "Hello, how can I assist you today?",
            "failure_message": "Please hold on a second.",
            "params": {
                "model": "gpt-4o-mini"
            }
        },
        "tts": {
            "vendor": "microsoft",
            "params": {
                "key": "<your_tts_api_key>",
                "region": "eastus",
                "voice_name": "en-US-AndrewMultilingualNeural"
            }
        },
        "asr": {
            "language": "en-US"
        }
    }
}'
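
For readers working in Python, a roughly equivalent request can be prepared with the standard library alone. This is a sketch, not an official client: build_join_request and BASE_URL are hypothetical names, and the function only constructs the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request
from typing import Any, Dict

# Base path taken from the endpoint URL at the top of this page.
BASE_URL = "https://api.agora.io/api/conversational-ai-agent/v2/projects"


def build_join_request(
    app_id: str, credentials_b64: str, body: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare, but do not send, the POST request for the join endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{app_id}/join",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credentials_b64}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
# with urllib.request.urlopen(request, timeout=10) as response:
#     result = json.load(response)
```

Keeping construction separate from sending makes the URL, headers, and payload easy to check in tests without any network access.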

Response example

{
  "agent_id": "1NT29X10YHxxxxxWJOXLYHNYB",
  "create_ts": 1737111452,
  "status": "RUNNING"
}
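
The response above can be decoded as ordinary JSON. The short Python sketch below reads the three fields and converts create_ts to a datetime. It assumes create_ts is a Unix timestamp in seconds, which the magnitude of the sample value suggests but this section does not state.

```python
import json
from datetime import datetime, timezone

# The response example from this page, as a raw JSON string.
raw = (
    '{"agent_id": "1NT29X10YHxxxxxWJOXLYHNYB", '
    '"create_ts": 1737111452, "status": "RUNNING"}'
)

response = json.loads(raw)

# Assumption: create_ts counts seconds since the Unix epoch (UTC).
created_at = datetime.fromtimestamp(response["create_ts"], tz=timezone.utc)

print(response["agent_id"], response["status"], created_at.isoformat())
```

Keep the agent_id after a successful call, since it is the only field in the response that identifies the agent instance.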