Text to Speech in Python

Generate spoken audio from Python with the Narakeet REST API. There is no dedicated Python text to speech module to install; the widely used third-party requests package is enough to call the API and save the result. This page includes a ready-to-run example, explains common options, and covers the library and module questions that come up most often.

The Text to Speech API reference covers authentication, endpoints, and advanced features shared across all languages.

Python Text to Speech

Here is the fastest way to add Python text to speech to a script. The code below sends a sentence to the API and writes the resulting audio to disk. Unlike local Python text to speech tools that need system-level voice engines or GPU resources, this runs anywhere Python runs — laptops, servers, containers, or serverless functions.

import os

import requests

# Read the API key from the environment rather than hard-coding it
api_key = os.environ['NARAKEET_API_KEY']
voice = 'mickey'
text = 'Hi there from Python'
# The last path segment (m4a) selects the audio format
url = f'https://api.narakeet.com/text-to-speech/m4a?voice={voice}'

options = {
    'headers': {
        'Accept': 'application/octet-stream',
        'Content-Type': 'text/plain',
        'x-api-key': api_key,
    },
    'data': text.encode('utf8')
}

# Post the text, stop on HTTP errors, then save the audio bytes
response = requests.post(url, **options)
response.raise_for_status()
with open('output.m4a', 'wb') as f:
    f.write(response.content)

Export your API key as NARAKEET_API_KEY before running. Keys are managed from the API Keys dashboard. The full project is on GitHub: text-to-speech-api-python-example.

Text to Speech Using Python

The example relies on the requests library (pip install requests). That single dependency is all you need — no specialised text to speech module, no compiled extension. Build the URL, set three headers, post the text, and write the bytes. Text to speech using Python really is that short.

Where text to speech using Python proves especially useful:

  • Data pipelines that produce daily audio summaries from analytics reports
  • Batch scripts that walk a directory of Markdown files and output one MP3 per file
  • Django or Flask endpoints that return audio on the fly for accessibility features
  • Jupyter notebooks that let researchers listen to generated text during experiments
  • AWS Lambda or Google Cloud Functions that respond to events with spoken alerts
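The batch-script case from the list above could be sketched roughly as follows. This is a minimal illustration, not an official Narakeet tool: the helper names `plan_outputs` and `convert_directory` are hypothetical, and the `synthesize` argument is assumed to be any function that turns a string into audio bytes, for example a small wrapper around the requests call shown earlier on this page.

```python
from pathlib import Path


def plan_outputs(source_dir, pattern='*.md'):
    """Pair each matching file with an .mp3 path next to it (hypothetical helper)."""
    return [(path, path.with_suffix('.mp3'))
            for path in sorted(Path(source_dir).glob(pattern))]


def convert_directory(source_dir, synthesize, pattern='*.md'):
    """Write one MP3 per input file and return the list of files written.

    synthesize(text) is assumed to return audio bytes, e.g. by posting
    the text to the API and returning response.content.
    """
    written = []
    for source, target in plan_outputs(source_dir, pattern):
        audio = synthesize(source.read_text(encoding='utf8'))
        target.write_bytes(audio)
        written.append(target)
    return written
```

Keeping the HTTP call behind the `synthesize` argument means the directory walk can be tried out with a stub function before any API credit is spent.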

For text longer than 1 KB, or for uncompressed WAV output, use the Long Content (Polling) API instead. A full Python polling example lives at https://github.com/narakeet/text-to-speech-polling-api-python-example.
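The general shape of a polling client might look like the sketch below. It only illustrates the wait loop and is not taken from the linked project: the function name `wait_for_audio_url` is hypothetical, and the status fields `finished`, `succeeded`, `result` and `message` are assumptions that should be checked against the API reference before use.

```python
import time


def wait_for_audio_url(fetch_status, status_url, interval=5.0, timeout=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll a status URL until the task is done, then return the result URL.

    fetch_status(status_url) must return a dict. The 'finished', 'succeeded',
    'result' and 'message' keys are ASSUMED field names, not confirmed here.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(status_url)
        if status.get('finished'):
            if status.get('succeeded'):
                return status['result']
            raise RuntimeError(status.get('message', 'audio task failed'))
        if clock() >= deadline:
            raise TimeoutError(f'no result after {timeout} seconds')
        # Wait before asking again so the API is not flooded with requests
        sleep(interval)
```

With requests, `fetch_status` could be as small as `lambda url: requests.get(url).json()`. Passing it in as an argument keeps the loop easy to test without network access.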

Python Text to Speech Library

People often search for a Python text to speech library expecting something to pip install. With Narakeet, the requests package already in most projects is the only dependency. The API accepts plain text over HTTPS and returns audio bytes — no wrapper library needed.

If you prefer zero external dependencies, Python’s built-in urllib.request works too. Any HTTP client that can set custom headers and read a binary response is sufficient.
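As a rough sketch of the standard-library route, the same request can be prepared with urllib.request. The endpoint and headers mirror the requests example on this page; the function names `build_request` and `save_speech` are made up for illustration.

```python
import urllib.parse
import urllib.request


def build_request(text, api_key, voice='mickey', audio_format='m4a'):
    """Prepare the same POST as the requests example, using only the standard library."""
    query = urllib.parse.urlencode({'voice': voice})
    url = f'https://api.narakeet.com/text-to-speech/{audio_format}?{query}'
    return urllib.request.Request(
        url,
        data=text.encode('utf8'),
        method='POST',
        headers={
            'Accept': 'application/octet-stream',
            'Content-Type': 'text/plain',
            'x-api-key': api_key,
        },
    )


def save_speech(text, api_key, path='output.m4a'):
    """Send the request and write the returned audio bytes to disk."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses
    with urllib.request.urlopen(build_request(text, api_key)) as response:
        audio = response.read()
    with open(path, 'wb') as f:
        f.write(audio)
```

Calling `save_speech('Hi there from Python', os.environ['NARAKEET_API_KEY'])` after `import os` would then behave much like the requests version, with errors surfacing as `urllib.error.HTTPError` instead of `requests.HTTPError`.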

Python Text to Speech Module

There is no separate Python text to speech module to configure. Traditional Python text to speech modules take a different route: pyttsx3 wraps the speech engines installed on the local machine, while gTTS calls the Google Translate text to speech service. Both offer limited voice quality and language coverage. The Narakeet API replaces them with 900 AI voices in 100 languages, accessed through a single HTTP call rather than a platform-specific module.

Python Text to Speech API Options

Control the output by adding query string parameters to the endpoint URL:

  • voice — Pick from 900 options. Example: ?voice=mickey. Browse all choices at Text to Speech Voices.
  • voice-speed — A multiplier for reading pace. 1.2 is 20% faster; 0.8 is 20% slower.
  • voice-volume — Accepts soft, medium, or loud.
  • endpoint — /m4a (used in the example), /mp3, or /wav. WAV requires the polling API.
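One way to combine the options listed above is to let urllib.parse.urlencode build the query string. The `build_url` helper below is a hypothetical convenience, not part of any Narakeet package; it assumes the parameter names given in the list and converts Python-style underscores to the hyphens the API expects.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.narakeet.com/text-to-speech'


def build_url(audio_format='m4a', **options):
    """Build an endpoint URL; underscores in option names become hyphens."""
    query = urlencode({name.replace('_', '-'): value
                       for name, value in options.items()})
    url = f'{BASE_URL}/{audio_format}'
    return f'{url}?{query}' if query else url


print(build_url('mp3', voice='mickey', voice_speed=1.2, voice_volume='soft'))
# prints https://api.narakeet.com/text-to-speech/mp3?voice=mickey&voice-speed=1.2&voice-volume=soft
```

The resulting string can be passed straight to `requests.post` in place of the hand-written URL from the first example.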

For pitch control, multi-voice scripts, and other advanced settings, use the script header format inside the request body. Full details at Configuring Audio Tasks.