API Documentation

FishSpeech API documentation and debugging tools

Text to Speech (HTTP)

Convert text to speech using the HTTP API.

Text to Speech API

Endpoint

POST /api/open/tts

Request Headers

// JSON Format
Content-Type: application/json
Authorization: Bearer YOUR_API_TOKEN  // API Key

// MessagePack Format
Content-Type: application/msgpack
Authorization: Bearer YOUR_API_TOKEN  // API Key

Request Parameters

{
  "reference_id": string,  // Required. Voice model ID
  "text": string,          // Required. Text to convert
  "speed": number,         // Optional. Speech speed. Range: 0.5 to 2.0. Default: 1
  "volume": number,        // Optional. Volume. Range: -20 to 20. Default: 0
  "version": string,       // Optional. TTS version. Available values: "v1", "v2", "s1" (legacy versions); "v3-turbo", "v3-hd" (V3 versions). Default: "v1"
  "format": string,        // Optional. Audio format. Available values: "mp3", "wav", "pcm". Default: "mp3"
  "emotion": string,       // Optional. Emotion control (V3 versions only). Available values: "happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm", "fluent", "auto". Default: "auto"
  "language": string,      // Optional. Language enhancement (V3 versions only). Available values: "auto", "zh", "en", etc. Default: "auto"
  "cache": boolean         // Optional. false returns the audio as a binary stream; true caches the audio and returns its file URL. Default: false
}
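
As an illustration of the parameter table above, the following Python sketch validates the basic fields against the documented ranges before a request is sent. The function name `build_tts_payload` and the choice to raise `ValueError` are assumptions made for this example; they are not part of the FishSpeech API, and the server may apply its own validation.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter table above.
LEGACY_VERSIONS = {"v1", "v2", "s1"}
V3_VERSIONS = {"v3-turbo", "v3-hd"}
FORMATS = {"mp3", "wav", "pcm"}


def build_tts_payload(
    reference_id: str,
    text: str,
    speed: float = 1.0,
    volume: float = 0,
    version: Optional[str] = None,
    audio_format: str = "mp3",
    cache: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: check arguments client-side and return the request body."""
    if not reference_id or not text:
        raise ValueError("reference_id and text are required")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -20 <= volume <= 20:
        raise ValueError("volume must be between -20 and 20")
    if audio_format not in FORMATS:
        raise ValueError("format must be one of: mp3, wav, pcm")
    if version is not None and version not in LEGACY_VERSIONS | V3_VERSIONS:
        raise ValueError("unknown version: " + version)

    payload: Dict[str, Any] = {
        "reference_id": reference_id,
        "text": text,
        "speed": speed,
        "volume": volume,
        "format": audio_format,
        "cache": cache,
    }
    # Only send "version" when the caller chose one explicitly.
    if version is not None:
        payload["version"] = version
    return payload
```

Calling `build_tts_payload("your_model_id", "Text content to convert", version="s1")` produces a body equivalent to the first curl example in this section.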

Version Notes:

  • Legacy Versions: v1, v2, s1 (basic text-to-speech functionality)
  • V3 Versions: v3-turbo, v3-hd (advanced features including emotion control and language enhancement)
  • The system automatically selects the corresponding version based on the model configuration; no manual specification is needed
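
Because `emotion` and `language` apply to V3 versions only, a client may want to drop them when it explicitly requests a legacy version. The sketch below shows one way to do that. The helper name `drop_v3_only_fields` is hypothetical, and the document does not say whether the server ignores or rejects these fields on legacy versions, so treat this as a defensive assumption. When no version is given, the fields are left in place, since the server may select a V3 version itself.

```python
from typing import Any, Dict

LEGACY_VERSIONS = {"v1", "v2", "s1"}
V3_ONLY_FIELDS = ("emotion", "language")


def drop_v3_only_fields(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload without V3-only fields if a legacy version is requested."""
    if payload.get("version") in LEGACY_VERSIONS:
        return {k: v for k, v in payload.items() if k not in V3_ONLY_FIELDS}
    # V3 version, or no version specified: keep everything as given.
    return dict(payload)
```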

Response Data

// Success Response (cache=false) - 200
Content-Type: audio/mpeg
<Binary audio data>

// Success Response (cache=true) - 200
Content-Type: application/json
{
  "success": boolean,        // Whether successful
  "audio_url": string,       // Audio file URL
  "format": string,          // Audio format
  "characters_used": number, // Characters used
  "quota_remaining": number  // Remaining API credits
}

// Error Response
{
  "error": string     // Error message
}
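
The two success shapes above can be told apart by the `Content-Type` header. The following sketch assumes the caller already has the status code, the `Content-Type` value, and the raw body; the function name `parse_tts_response` and the use of `RuntimeError` are choices made for this example only.

```python
import json
from typing import Union


def parse_tts_response(status: int, content_type: str, body: bytes) -> Union[bytes, dict]:
    """Return raw audio bytes (cache=false) or the decoded JSON object (cache=true)."""
    if status != 200:
        # Error bodies are documented as {"error": "..."}; fall back if parsing fails.
        try:
            message = json.loads(body).get("error", "unknown error")
        except (ValueError, AttributeError):
            message = "unknown error"
        raise RuntimeError("TTS request failed (%d): %s" % (status, message))

    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)  # contains audio_url, format, characters_used, ...
    return body  # binary audio data
```

With `cache` set to true, the caller reads `audio_url` from the returned dictionary; otherwise the returned bytes can be written directly to a file such as `output.mp3`.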

CURL Example

# JSON Format - Legacy version (using s1, recommended)
curl -X POST https://fishspeech.net/api/open/tts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "reference_id": "your_model_id",
    "text": "Text content to convert",
    "speed": 1.0,
    "volume": 0,
    "version": "s1",
    "format": "mp3",
    "cache": false
  }' \
  --output output.mp3

# JSON Format - V3 version (using v3-hd, supports emotion control and language enhancement)
curl -X POST https://fishspeech.net/api/open/tts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "reference_id": "your_model_id",
    "text": "Text content to convert",
    "speed": 1.0,
    "volume": 0,
    "version": "v3-hd",
    "emotion": "calm",
    "language": "zh",
    "format": "mp3",
    "cache": false
  }' \
  --output output.mp3

# MessagePack Format (requires a client library that supports MessagePack)
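
The page gives no MessagePack example, so here is a rough Python sketch that builds the same POST request in either encoding. It assumes the MessagePack body carries the same fields as the JSON body, which the document implies but does not state. The function name `build_tts_request` is hypothetical, and the MessagePack branch relies on the third-party `msgpack` package (`pip install msgpack`).

```python
import json
import urllib.request

TTS_URL = "https://fishspeech.net/api/open/tts"  # endpoint used in the curl examples


def build_tts_request(payload: dict, token: str, use_msgpack: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON or MessagePack body."""
    if use_msgpack:
        import msgpack  # third-party; imported lazily so the JSON path has no extra dependency

        body = msgpack.packb(payload, use_bin_type=True)
        content_type = "application/msgpack"
    else:
        body = json.dumps(payload).encode("utf-8")
        content_type = "application/json"

    headers = {
        "Content-Type": content_type,
        "Authorization": "Bearer " + token,
    }
    return urllib.request.Request(TTS_URL, data=body, headers=headers, method="POST")
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the response body; with `cache` set to false the bytes are the audio itself.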

Status Code Description

200 OK                 - Request successful
400 Bad Request        - Invalid request parameters
401 Unauthorized       - Invalid API Token
403 Forbidden          - Access forbidden
404 Not Found          - Resource not found
413 Payload Too Large  - Upload file too large
429 Too Many Requests  - Rate limit exceeded or insufficient credits
500 Server Error       - Internal server error

Error Response Format:
{
  "error": string,      // Error message
  "details": string,    // Detailed error message (optional)
  "code": string       // Error code (optional)
}
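
A client can turn the error format above into a typed exception. The sketch below is one possible approach: the class `TTSError` and the function `error_from_response` are names invented for this example, and the fallback messages are taken from the status code list in this section.

```python
import json
from typing import Optional

# Fallback descriptions copied from the status code list above.
STATUS_MEANINGS = {
    400: "Invalid request parameters",
    401: "Invalid API Token",
    403: "Access forbidden",
    404: "Resource not found",
    413: "Upload file too large",
    429: "Rate limit exceeded or insufficient credits",
    500: "Internal server error",
}


class TTSError(Exception):
    """Hypothetical exception carrying the fields of the error response."""

    def __init__(self, status: int, message: str,
                 details: Optional[str] = None, code: Optional[str] = None):
        super().__init__("%d: %s" % (status, message))
        self.status = status
        self.message = message
        self.details = details  # optional in the response
        self.code = code        # optional in the response


def error_from_response(status: int, body: bytes) -> TTSError:
    """Build a TTSError from a non-200 response, tolerating non-JSON bodies."""
    try:
        data = json.loads(body)
    except ValueError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    message = data.get("error") or STATUS_MEANINGS.get(status, "Unknown error")
    return TTSError(status, message, data.get("details"), data.get("code"))
```

Since 429 covers both rate limiting and insufficient credits, a caller would likely need to inspect `message` or `code` before deciding whether a retry makes sense.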