Text to Speech

The Text to Speech (TTS) API is powered by our አሌፍ-Audio engine. It is designed to produce human-quality speech for Ethiopian languages, correctly handling Ge'ez punctuation (like ።, ፣) for natural pausing and intonation.

The API accepts text and returns a Base64-encoded audio string (WAV format). This lets you embed the audio in web apps or save it to disk.

Basic Synthesis

Generate audio from a simple text string.

curl -X POST https://api.addisassistant.com/api/v1/audio \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk_YOUR_KEY" \
  -d '{
    "text": "ሰላም፣ እንኳን ወደ አዲስ ኤአይ በደህና መጡ።",
    "language": "am"
  }'

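const fs = require('fs');

// Top-level await is not available in CommonJS, so wrap the call in an async function
async function main() {
  const response = await fetch("https://api.addisassistant.com/api/v1/audio", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": "sk_YOUR_KEY",
    },
    body: JSON.stringify({
      text: "ሰላም፣ እንኳን ወደ አዲስ ኤአይ በደህና መጡ።", // "Hello, welcome to Addis AI."
      language: "am",
      voice_id: "male_1" // Optional
    }),
  });

  // Stop early on HTTP errors instead of trying to decode an error body
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();

  // Decode Base64 and save to file
  const audioBuffer = Buffer.from(data.audio, 'base64');
  fs.writeFileSync('output.wav', audioBuffer);
  console.log("Audio saved to output.wav");
}

main().catch(console.error);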

import requests
import base64

url = "https://api.addisassistant.com/api/v1/audio"
headers = {"X-API-Key": "sk_YOUR_KEY"}

payload = {
    "text": "ሰላም፣ እንኳን ወደ አዲስ ኤአይ በደህና መጡ።",
    "language": "am",
    "voice_id": "male_1"
}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()  # Fail fast on HTTP errors
data = response.json()

# Decode and save
audio_bytes = base64.b64decode(data['audio'])
with open('output.wav', 'wb') as f:
    f.write(audio_bytes)

print("Audio saved to output.wav")

Streaming (Long Text)

Generating audio for long paragraphs takes time. To minimize the wait (latency), use Streaming.

The API will send chunks of audio immediately as they are generated, allowing your app to start playing audio within milliseconds, even if the text is very long.

Usage: Set "stream": true in your request body.

Best for: Testing in Postman or from the command line. The -N flag turns off curl's output buffering so that chunks print as they arrive.

curl -N -X POST https://api.addisassistant.com/api/v1/audio \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk_YOUR_KEY" \
  -d '{
    "text": "ሰላም፣ እንኳን ወደ አዲስ ኤአይ በደህና መጡ።",
    "language": "am",
    "stream":true
  }'

Best for: Chatbots and Web Apps. This script creates a queue to play chunks smoothly in order.

async function streamAudio() {
  const audioQueue = [];
  let isPlaying = false;

  try {
    const response = await fetch("https://api.addisassistant.com/api/v1/audio", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-API-Key": "sk_YOUR_KEY",
      },
      body: JSON.stringify({
        text: "ይህ ረጅም ጽሑፍ ነው። የአዲስ ኤአይ የድምጽ ቴክኖሎጂ ትልልቅ ጽሑፎችን በቀላሉ አንብቦ ድምጽ ሊያወጣ ይችላል።",
        language: "am",
        stream: true,
      }),
    });

    const reader = response.body.getReader();
    const decoder = new TextDecoder();

    // Play audio chunks sequentially
    function playNext() {
      if (audioQueue.length === 0) {
        isPlaying = false;
        return;
      }
      isPlaying = true;
      const nextChunk = audioQueue.shift();
      const audio = new Audio("data:audio/wav;base64," + nextChunk);
      audio.onended = playNext; 
      audio.play().catch(e => console.error("Playback failed:", e));
    }

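    // Parse one newline-delimited JSON line and queue its audio
    function handleLine(line) {
      if (!line.trim()) return;
      try {
        const data = JSON.parse(line);
        if (data.audio_chunk) {
          audioQueue.push(data.audio_chunk);
          if (!isPlaying) playNext();
        }
      } catch (e) {
        console.error("JSON Parse Error:", e);
      }
    }

    // Read the stream, buffering partial lines that span network chunks
    let buffer = "";
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop(); // Keep the trailing incomplete line for the next read
      lines.forEach(handleLine);
    }
    handleLine(buffer); // Flush a final line that had no trailing newline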
  } catch (error) {
    console.error("Stream connection failed:", error);
  }
}

streamAudio();

Best for: Saving long files efficiently or piping to other services.

const fs = require("fs");

async function streamAndSave() {
  const response = await fetch("https://api.addisassistant.com/api/v1/audio", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": "sk_YOUR_KEY",
    },
    body: JSON.stringify({
      text: "ይህ ረጅም ጽሑፍ ነው።",
      language: "am",
      stream: true,
    }),
  });

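  const fileStream = fs.createWriteStream("output_stream.wav");
  const decoder = new TextDecoder();
  let buffer = "";

  // Decode one newline-delimited JSON line and append its audio bytes to the file
  function handleLine(line) {
    if (!line.trim()) return;
    try {
      const json = JSON.parse(line);
      if (json.audio_chunk) {
        // Convert Base64 chunk to binary and write
        fileStream.write(Buffer.from(json.audio_chunk, "base64"));
      }
    } catch (e) {
      console.error("Chunk Error:", e.message);
    }
  }

  for await (const chunk of response.body) {
    // A JSON line can span network chunks, so keep the incomplete tail
    buffer += decoder.decode(chunk, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop();
    lines.forEach(handleLine);
  }
  handleLine(buffer); // Flush a final line that had no trailing newline

  fileStream.end();
  console.log("Stream complete.");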
}

streamAndSave();

Best for: AI Pipelines.

import requests
import json
import base64

url = "https://api.addisassistant.com/api/v1/audio"
headers = {"X-API-Key": "sk_YOUR_KEY"}
payload = {
    "text": "ይህ ረጅም ጽሑፍ ነው።",
    "language": "am",
    "stream": True
}

with open("stream_output.wav", "wb") as f:
    # stream=True keeps the connection open and yields data as it arrives
    with requests.post(url, json=payload, headers=headers, stream=True) as r:
        r.raise_for_status()  # Fail fast on HTTP errors
        for line in r.iter_lines():
            if line:
                try:
                    data = json.loads(line)
                    if "audio_chunk" in data:
                        chunk = base64.b64decode(data['audio_chunk'])
                        f.write(chunk)
                except ValueError:
                    # Skip lines that are not valid JSON
                    continue

print("Stream saved.")

API Reference

Request Parameters

These parameters go in the root of your JSON body.

Prop       Type
text       string
language   string
voice_id   string (optional)
stream     boolean (optional)

Response Schema (Basic Mode)

If stream is false or omitted, you receive a single JSON object.

{
  "audio": "//NIxAAAAANIAAAAAExBTUVVVV..."
}

Prop    Type
audio   string (Base64-encoded audio)

Response Schema (Streaming Mode)

If stream: true, you receive newline-delimited JSON objects.

{"audio_chunk": "//NIxAAAAANIA...", "index": 0}
{"audio_chunk": "AAAEkSRJ...", "index": 1}
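
A network read can end in the middle of a JSON line, so a client needs to buffer incomplete lines before parsing them. The following is a minimal Python sketch of that idea, not an official client: iter_audio_chunks and _store are hypothetical helper names, and the code assumes that every audio line carries an index that reflects playback order, which the sample above suggests but does not state.

```python
import base64
import json
from typing import Dict, Iterable, Iterator


def _store(line: bytes, pending: Dict[int, bytes]) -> None:
    """Parse one NDJSON line and keep its decoded audio under its index."""
    line = line.strip()
    if not line:
        return
    message = json.loads(line)
    if "audio_chunk" in message:
        # Assumption: each audio line includes an integer "index"
        pending[message["index"]] = base64.b64decode(message["audio_chunk"])


def iter_audio_chunks(byte_chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield decoded audio in index order from raw network chunks."""
    buffer = b""
    pending: Dict[int, bytes] = {}
    next_index = 0
    for raw in byte_chunks:
        buffer += raw
        # Everything before the last newline is complete; keep the tail
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            _store(line, pending)
        # Release chunks as soon as the next expected index is available
        while next_index in pending:
            yield pending.pop(next_index)
            next_index += 1
    _store(buffer, pending)  # Flush a final line that had no trailing newline
    for index in sorted(pending):
        yield pending[index]
```

With requests, the raw chunks could come from r.iter_content(chunk_size=None) on a response opened with stream=True, and each yielded value could be written to a file or passed to a player.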

Handling the Audio Output

The API returns audio as a Base64-encoded string. This keeps the response valid JSON, but you must decode the string (or wrap it in a data URI) before you can play or save the audio.

Code Implementation

Here is how to decode and save the audio file programmatically in your application.

Choose your environment:

Universal Command (Windows/Mac/Linux)

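You can decode and save the file in a single command by piping the response into Python 3, which must be available on your PATH. The command uses POSIX shell quoting; on Windows, run it from WSL or Git Bash, or adapt the quotes for your shell.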

curl -X POST https://api.addisassistant.com/api/v1/audio \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk_YOUR_KEY" \
  -d '{"text": "ሰላም", "language": "am"}' \
  | python3 -c "import sys, json, base64; open('output.wav', 'wb').write(base64.b64decode(json.load(sys.stdin)['audio']))"

To save the file to disk in a backend environment:

import fs from 'fs';

// Assuming 'data' is the JSON response from fetch()
const base64String = data.audio;
const buffer = Buffer.from(base64String, 'base64');

fs.writeFileSync('speech.wav', buffer);
console.log('Saved to speech.wav');

To decode and save the file in a Python script:

import base64

# Assuming 'response_json' is the dict from requests.post().json()
base64_string = response_json['audio']
audio_data = base64.b64decode(base64_string)

with open("speech.wav", "wb") as f:
    f.write(audio_data)
    
print("Saved to speech.wav")

To play the audio immediately in a React or Vue app:

// Assuming 'data' is the JSON response from the API
const base64String = data.audio;

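// Create a playable Audio object directly
const audio = new Audio("data:audio/wav;base64," + base64String);
// play() returns a Promise; browsers may reject it if playback was not triggered by a user action
audio.play().catch((e) => console.error("Playback failed:", e));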

Best Practices

Ensure high-quality voice output with these guidelines.

Script & Punctuation

Use Punctuation: The model relies on commas (፣) and periods (።) to determine pauses. Text without punctuation will sound rushed.

Avoid Mixed Scripts: Mixing English words inside an Amharic sentence may result in unnatural pronunciation. Transliterate English terms into Fidel if possible.

Latency Optimization

Long Documents: Use stream: true

There is no hard character limit, but generating a large file takes time. For texts longer than 2 sentences, always use Streaming to ensure immediate playback.
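
One way to apply the two-sentence guideline automatically is to count sentence terminators before building the request. The sketch below is an illustration, not part of the API: count_sentences and should_stream are hypothetical helpers, and the terminator set (Ethiopic full stop ። and question mark ፧, plus Latin ? and !) is an assumption. The Latin period is left out on purpose so that decimal numbers are not counted as sentence ends.

```python
import re

# Sentence terminators: Ethiopic full stop and question mark, plus Latin ? and !
_SENTENCE_END = re.compile(r"[።፧?!]+")


def count_sentences(text: str) -> int:
    """Count non-empty segments separated by sentence-ending punctuation."""
    return sum(1 for part in _SENTENCE_END.split(text) if part.strip())


def should_stream(text: str, max_sentences: int = 2) -> bool:
    """Return True when the text is longer than max_sentences sentences."""
    return count_sentences(text) > max_sentences


def build_payload(text: str, language: str = "am") -> dict:
    """Build a request body, enabling streaming only for longer texts."""
    return {"text": text, "language": language, "stream": should_stream(text)}
```

Text with no terminator at all counts as a single sentence, so a short unpunctuated phrase stays in basic mode.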

Efficiency

Cache Everything: TTS is deterministic. If the input text hasn't changed, serve the saved audio file instead of calling the API again to save money and bandwidth.
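
A simple way to cache is to key each audio file on a hash of every input that affects the output. The following is a minimal sketch under that assumption; cache_key and get_or_synthesize are hypothetical names, and synthesize stands for whatever function in your application calls the API and returns the decoded audio bytes.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional


def cache_key(text: str, language: str, voice_id: Optional[str] = None) -> str:
    """Build a stable key from every input that affects the audio."""
    payload = json.dumps(
        {"text": text, "language": language, "voice_id": voice_id},
        ensure_ascii=False,
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_synthesize(
    text: str,
    language: str,
    synthesize: Callable[[str, str, Optional[str]], bytes],
    cache_dir: str,
    voice_id: Optional[str] = None,
) -> bytes:
    """Return cached audio if present; otherwise call synthesize and store it."""
    path = Path(cache_dir) / f"{cache_key(text, language, voice_id)}.wav"
    if path.exists():
        return path.read_bytes()
    audio = synthesize(text, language, voice_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

Including voice_id in the key matters because the same text rendered with a different voice is a different file.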

Output Format

Base64 WAV: The API returns a Base64 string inside JSON. You must decode this string to get the playable WAV file.
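
After decoding, it can be worth checking the leading bytes before choosing a file extension. The sketch below relies only on well-known file signatures (a WAV file starts with RIFF, a four-byte size, then WAVE; an MP3 file starts with an ID3 tag or a frame-sync pattern); sniff_audio_format and decode_audio are hypothetical helper names, not part of the API.

```python
import base64
from typing import Tuple


def sniff_audio_format(audio: bytes) -> str:
    """Guess the format from the leading bytes: 'wav', 'mp3', or 'unknown'."""
    # WAV: "RIFF", 4-byte chunk size, then "WAVE"
    if len(audio) >= 12 and audio[:4] == b"RIFF" and audio[8:12] == b"WAVE":
        return "wav"
    # MP3 with an ID3v2 tag at the start
    if audio[:3] == b"ID3":
        return "mp3"
    # MP3 frame sync: the first 11 bits are all set
    if len(audio) >= 2 and audio[0] == 0xFF and (audio[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def decode_audio(base64_string: str) -> Tuple[bytes, str]:
    """Decode a Base64 audio string and report the detected format."""
    audio = base64.b64decode(base64_string)
    return audio, sniff_audio_format(audio)
```

A caller could use the returned format to pick the extension, for example writing speech.wav only when the result is 'wav'.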

Compression: Since WAV is large and uncompressed, consider converting the decoded audio to MP3 on your backend if your users are on mobile data.
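
One common way to do the conversion is to call the ffmpeg command-line tool from your backend. The sketch below assumes ffmpeg is installed with the libmp3lame encoder; mp3_command and wav_to_mp3 are hypothetical helpers, and the 64k bitrate is only an illustrative default for speech.

```python
import shutil
import subprocess
from typing import List


def mp3_command(src: str, dst: str, bitrate: str = "64k") -> List[str]:
    """Build the ffmpeg argument list for a WAV to MP3 conversion."""
    return [
        "ffmpeg",
        "-y",                      # Overwrite the output file if it exists
        "-i", src,                 # Input file
        "-codec:a", "libmp3lame",  # MP3 encoder
        "-b:a", bitrate,           # Target audio bitrate
        dst,
    ]


def wav_to_mp3(src: str, dst: str, bitrate: str = "64k") -> None:
    """Convert src to MP3 at dst; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(mp3_command(src, dst, bitrate), check=True)
```

For example, wav_to_mp3("speech.wav", "speech.mp3") would write the compressed file next to the original.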