Text to Voice with Ruby

Generate spoken audio from Ruby using the Narakeet REST API. The example below uses only net/http from the standard library — no gems to install. It runs with any Ruby version that ships net/http, which is every version in practical use today.

For endpoint details, authentication, and features shared across languages, see the main Text to Speech API reference.

Ruby Text to Speech Example
Ruby Text to Voice
Rails Text to Speech
Ruby TTS Without External Gems
Ruby Text to Speech API Options

Ruby Text to Speech Example

The following script sends text to the Narakeet API and saves the audio response as an MP3 file. The text_to_speech method accepts a Net::HTTP instance, keeping connection management separate from the API call itself.

require 'net/http'
require 'uri'

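# Sends text to the streaming endpoint and writes the returned audio to output_path.
def text_to_speech(http, api_key, voice, text, output_path)
  # Escape the voice name so unexpected characters cannot break the query string
  uri = URI("https://api.narakeet.com/text-to-speech/mp3?#{URI.encode_www_form(voice: voice)}")

  request = Net::HTTP::Post.new(uri)
  request['Accept'] = 'application/octet-stream'
  request['Content-Type'] = 'text/plain'
  request['x-api-key'] = api_key
  request.body = text

  response = http.request(request)

  # Net::HTTP reports the status code as a string
  if response.code != '200'
    abort "API error #{response.code}: #{response.body}"
  end

  # binwrite stores the audio bytes without newline or encoding translation
  File.binwrite(output_path, response.body)
end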

api_key = ENV['NARAKEET_API_KEY'].to_s
# abort prints the message to stderr and exits with status 1
abort 'Please set the NARAKEET_API_KEY environment variable' if api_key.empty?

uri = URI('https://api.narakeet.com')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.open_timeout = 30
http.read_timeout = 30

text_to_speech(http, api_key, 'hannah', 'Hi there from Ruby', 'output.mp3')

puts 'File saved at: output.mp3'

Save this as tts.rb and run with ruby tts.rb. Export your API key as NARAKEET_API_KEY before running. Keys are managed from the API Keys dashboard.

For the complete project, see the Ruby streaming API example on GitHub.

Ruby Text to Voice

Ruby’s expressive syntax and rich standard library make text to voice integration concise. The net/http library handles HTTPS natively, so the entire integration fits in a single method with no external dependencies. Ruby text to voice works well in practice for:

Rake tasks that batch-convert text files into audio during a build step
Background jobs in Sidekiq or Resque that generate audio asynchronously
CLI scripts that produce spoken versions of reports or notifications
DevOps automation that narrates deployment summaries or alert messages
Sinatra or Hanami endpoints that return audio on demand

For input longer than 1 KB or uncompressed WAV output, switch to the Long Content (Polling) API.
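
The batch use case and the size limit can be combined in one small helper. The following is a minimal sketch, not part of the official example: the names `fits_streaming_api?` and `batch_convert` are hypothetical, and reading "1 KB" as 1024 bytes of UTF-8 text is an assumption. The `converter` argument is any callable, so the helper itself never touches the network.

```ruby
# Assumption: "1 KB" is taken to mean 1024 bytes; check the API reference for the exact limit.
STREAMING_LIMIT_BYTES = 1024

# True when the text is small enough for the streaming endpoint.
# bytesize is used rather than length because multi-byte characters count more than once.
def fits_streaming_api?(text)
  text.bytesize <= STREAMING_LIMIT_BYTES
end

# Converts every .txt file in input_dir by calling converter.call(text, output_path).
# Returns the paths that were skipped because they exceed the streaming limit.
def batch_convert(input_dir, output_dir, converter)
  skipped = []
  Dir.glob(File.join(input_dir, '*.txt')).sort.each do |path|
    text = File.read(path, encoding: 'UTF-8')
    unless fits_streaming_api?(text)
      skipped << path
      next
    end
    output_path = File.join(output_dir, File.basename(path, '.txt') + '.mp3')
    converter.call(text, output_path)
  end
  skipped
end

# Possible wiring with the script above (requires http and api_key to be set up):
#   converter = ->(text, out) { text_to_speech(http, api_key, 'hannah', text, out) }
#   too_long = batch_convert('texts', 'audio', converter)
```

Files returned in the skipped list are candidates for the Long Content (Polling) API.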

Rails Text to Speech

If you are building a Rails application, the code above works inside controllers, service objects, or Active Job workers. Using a service object for Rails text to speech keeps the HTTP logic out of your controllers and makes it easy to test with dependency injection. Wrap the text_to_speech method in a plain Ruby class, inject the Net::HTTP instance through the constructor, and call it from a background job to avoid blocking request threads.
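
One way the service object described above might look is sketched here. The class name `SpeechSynthesizer`, its keyword arguments, and the `ApiError` exception are illustrative choices rather than anything defined by Narakeet or Rails. The class raises instead of calling `abort`, on the assumption that a background job should fail and be retried rather than terminate the process.

```ruby
require 'net/http'
require 'uri'

# Hypothetical service object; the name and interface are illustrative only.
class SpeechSynthesizer
  class ApiError < StandardError; end

  ENDPOINT_PATH = '/text-to-speech/mp3'

  # http can be a configured Net::HTTP instance or any test double
  # that responds to request(req).
  def initialize(http:, api_key:, voice: 'hannah')
    @http = http
    @api_key = api_key
    @voice = voice
  end

  # Returns the audio bytes, or raises ApiError so the caller can decide how to retry.
  def call(text)
    response = @http.request(build_request(text))
    unless response.code == '200'
      raise ApiError, "API error #{response.code}: #{response.body}"
    end
    response.body
  end

  private

  # Builds the same POST request as the text_to_speech method in the script.
  def build_request(text)
    path = "#{ENDPOINT_PATH}?#{URI.encode_www_form(voice: @voice)}"
    request = Net::HTTP::Post.new(path)
    request['Accept'] = 'application/octet-stream'
    request['Content-Type'] = 'text/plain'
    request['x-api-key'] = @api_key
    request.body = text
    request
  end
end
```

Because the HTTP object arrives through the constructor, a test can pass in a stub that returns a canned response and never open a network connection.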

For production Rails deployments, generate audio in an Active Job worker and store the result in Active Storage or upload it to S3. This keeps response times fast and lets you retry failed API calls automatically through your job backend.

Ruby TTS Without External Gems

A common question is whether a dedicated Ruby TTS gem exists. With Narakeet, the standard library covers everything — net/http for HTTPS requests, uri for URL construction, and File.binwrite for saving binary data. There is no gem to install, no native extension to compile, and no C library to link. The API accepts plain text and returns audio bytes over HTTPS, which net/http handles out of the box. This makes it straightforward to add text to speech to any Ruby project without expanding the dependency tree.

Ruby Text to Speech API Options

Control the output with query string parameters on the endpoint URL. The audio format is the exception: it is selected by the URL path rather than a parameter.

voice — Pick from 900 options across 100 languages. Example: ?voice=hannah. Browse all choices at Text to Speech Voices.
voice-speed — A multiplier for reading pace. 1.2 is 20% faster; 0.8 is 20% slower. Example: &voice-speed=1.2.
voice-volume — Accepts soft, medium, or loud. Example: &voice-volume=soft.
format — Change the path segment in the URL: /text-to-speech/mp3 for compressed audio, /text-to-speech/m4a for a good balance of size and quality, or /text-to-speech/wav for uncompressed PCM (requires the polling API).
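
The options listed above can be assembled with `URI.encode_www_form`, which escapes values and joins them with `&`. This is a small sketch; the helper name `narakeet_uri` and its keyword arguments are hypothetical, and only the parameter names and path segments come from the list above.

```ruby
require 'uri'

# Hypothetical helper: builds the endpoint URI from the documented options.
# audio_format becomes the path segment; the rest become query parameters.
def narakeet_uri(voice:, audio_format: 'mp3', speed: nil, volume: nil)
  params = { 'voice' => voice }
  params['voice-speed'] = speed if speed      # multiplier, e.g. 1.2
  params['voice-volume'] = volume if volume   # soft, medium, or loud
  query = URI.encode_www_form(params)
  URI("https://api.narakeet.com/text-to-speech/#{audio_format}?#{query}")
end

uri = narakeet_uri(voice: 'hannah', audio_format: 'm4a', speed: 1.2, volume: 'soft')
puts uri
```

The resulting URI can be passed directly to `Net::HTTP::Post.new`, as in the script at the top of this page.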

For pitch control, sentence pauses, and multi-voice scripts, use the script header format inside the request body. Full details at Configuring Audio Tasks.