Text to Voice with Ruby
Generate spoken audio from Ruby using the Narakeet REST API. The example below uses only net/http from the standard library — no gems to install. It runs with any Ruby version that ships net/http, which is every version in practical use today.
For endpoint details, authentication, and features shared across languages, see the main Text to Speech API reference.
- Ruby Text to Speech Example
- Ruby Text to Voice
- Rails Text to Speech
- Ruby TTS Without External Gems
- Ruby Text to Speech API Options
Ruby Text to Speech Example
The following script sends text to the Narakeet API and saves the audio response as an MP3 file. The text_to_speech method accepts a Net::HTTP instance, keeping connection management separate from the API call itself.
require 'net/http'
require 'uri'

# Sends text to the Narakeet API and writes the returned MP3 bytes to output_path.
def text_to_speech(http, api_key, voice, text, output_path)
  uri = URI('https://api.narakeet.com/text-to-speech/mp3')
  # URL-encode the voice name so unusual characters cannot break the query string
  uri.query = URI.encode_www_form(voice: voice)

  request = Net::HTTP::Post.new(uri)
  request['Accept'] = 'application/octet-stream'
  request['Content-Type'] = 'text/plain'
  request['x-api-key'] = api_key
  request.body = text

  response = http.request(request)
  # Net::HTTP reports the status code as a String
  if response.code != '200'
    abort "API error #{response.code}: #{response.body}"
  end

  # binwrite avoids any newline or encoding conversion of the audio bytes
  File.binwrite(output_path, response.body)
end

api_key = ENV['NARAKEET_API_KEY'] || ''
if api_key.empty?
  puts 'Please set NARAKEET_API_KEY environment variable'
  exit 1
end

# Connection setup lives outside the method so callers control TLS and timeouts
uri = URI('https://api.narakeet.com')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.open_timeout = 30
http.read_timeout = 30

text_to_speech(http, api_key, 'hannah', 'Hi there from Ruby', 'output.mp3')
puts 'File saved at: output.mp3'
Save this as tts.rb and run with ruby tts.rb. Export your API key as NARAKEET_API_KEY before running. Keys are managed from the API Keys dashboard.
For the complete project, see the Ruby streaming API example on GitHub.
Ruby Text to Voice
Ruby’s expressive syntax and rich standard library make text to voice integration concise. The net/http library handles HTTPS natively, so the entire integration fits in a single method with no external dependencies. Typical places where Ruby text to voice fits well:
- Rake tasks that batch-convert text files into audio during a build step
- Background jobs in Sidekiq or Resque that generate audio asynchronously
- CLI scripts that produce spoken versions of reports or notifications
- DevOps automation that narrates deployment summaries or alert messages
- Sinatra or Hanami endpoints that return audio on demand
For input longer than 1 KB or uncompressed WAV output, switch to the Long Content (Polling) API.
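As a rough illustration of the batch-conversion case from the list above, the sketch below walks a directory of .txt files and writes one MP3 per file. The helper name batch_convert and the MAX_STREAMING_BYTES constant are hypothetical, and reading "1 KB" as 1024 bytes is an assumption. The conversion itself is passed in as a block, so the same loop can be called from a Rake task, a CLI script, or a background job.

```ruby
# Assumption: the 1 KB limit mentioned above is taken to mean 1024 bytes.
MAX_STREAMING_BYTES = 1024

# Hypothetical helper: converts every .txt file in input_dir to an .mp3 in output_dir.
# The block receives the text and must return the audio bytes as a String.
def batch_convert(input_dir, output_dir)
  converted = []
  skipped = []

  Dir.glob(File.join(input_dir, '*.txt')).sort.each do |path|
    text = File.read(path)

    # Files over the limit are left for the Long Content (Polling) API
    if text.bytesize > MAX_STREAMING_BYTES
      skipped << path
      next
    end

    target = File.join(output_dir, File.basename(path, '.txt') + '.mp3')
    File.binwrite(target, yield(text))
    converted << target
  end

  { converted: converted, skipped: skipped }
end
```

A caller would pass a block that wraps the API request, for example a variant of the text_to_speech method that returns response.body instead of writing it to disk. The returned hash lists which files were written and which were skipped for being too long.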
Rails Text to Speech
If you are building a Rails application, the code above works inside controllers, service objects, or Active Job workers. Using a service object for Rails text to speech keeps the HTTP logic out of your controllers and makes it easy to test with dependency injection. Wrap the text_to_speech method in a plain Ruby class, inject the Net::HTTP instance through the constructor, and call it from a background job to avoid blocking request threads.
For production Rails deployments, generate audio in an Active Job worker and store the result in Active Storage or upload it to S3. This keeps response times fast and lets you retry failed API calls automatically through your job backend.
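The service object described above might look something like the following minimal sketch. The class name SpeechSynthesizer and its ApiError are hypothetical, not part of any Narakeet library. It differs from the standalone script in one respect: it raises an exception instead of calling abort, on the assumption that a job backend needs an exception to trigger a retry.

```ruby
require 'net/http'
require 'uri'

# Hypothetical service object; names are illustrative only.
class SpeechSynthesizer
  class ApiError < StandardError; end

  ENDPOINT = 'https://api.narakeet.com/text-to-speech/mp3'

  # http is any object that responds to #request, such as a configured Net::HTTP.
  # Injecting it lets tests substitute a fake without touching the network.
  def initialize(http:, api_key:)
    @http = http
    @api_key = api_key
  end

  # Returns the audio bytes as a String, or raises ApiError on a non-200 response.
  def call(text:, voice:)
    uri = URI(ENDPOINT)
    uri.query = URI.encode_www_form(voice: voice)

    request = Net::HTTP::Post.new(uri)
    request['Accept'] = 'application/octet-stream'
    request['Content-Type'] = 'text/plain'
    request['x-api-key'] = @api_key
    request.body = text

    response = @http.request(request)
    unless response.code == '200'
      raise ApiError, "API error #{response.code}: #{response.body}"
    end

    response.body
  end
end
```

Inside a job, the returned bytes can then be attached to Active Storage or uploaded to S3. In a test, any small object with a request method that returns a canned response can stand in for Net::HTTP.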
Ruby TTS Without External Gems
A common question is whether a dedicated Ruby TTS gem exists. With Narakeet, the standard library covers everything — net/http for HTTPS requests, uri for URL construction, and File.binwrite for saving binary data. There is no gem to install, no native extension to compile, and no C library to link. The API accepts plain text and returns audio bytes over HTTPS, which net/http handles out of the box. This makes it straightforward to add text to speech to any Ruby project without expanding the dependency tree.
Ruby Text to Speech API Options
Control the output by adding query string parameters to the endpoint URL:
- voice: Pick from 900 options across 100 languages. Example: ?voice=hannah. Browse all choices at Text to Speech Voices.
- voice-speed: A multiplier for reading pace. 1.2 is 20% faster; 0.8 is 20% slower. Example: &voice-speed=1.2.
- voice-volume: Accepts soft, medium, or loud. Example: &voice-volume=soft.
- format: Change the path segment in the URL: /text-to-speech/mp3 for compressed audio, /text-to-speech/m4a for a good balance of size and quality, or /text-to-speech/wav for uncompressed PCM (requires the polling API).
For pitch control, sentence pauses, and multi-voice scripts, use the script header format inside the request body. Full details at Configuring Audio Tasks.
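The query string options listed above can be assembled with the standard library rather than by hand. The helper below is a sketch under stated assumptions: the name narakeet_tts_uri and its keyword arguments are hypothetical, and it covers only the voice, voice-speed, voice-volume, and format options described in this section.

```ruby
require 'uri'

# Hypothetical helper: builds the endpoint URL from the options described above.
# Only the parameters that are actually supplied end up in the query string.
def narakeet_tts_uri(voice:, format: 'mp3', speed: nil, volume: nil)
  params = { 'voice' => voice }
  params['voice-speed'] = speed unless speed.nil?
  params['voice-volume'] = volume unless volume.nil?

  URI::HTTPS.build(
    host: 'api.narakeet.com',
    path: "/text-to-speech/#{format}",
    # encode_www_form escapes each key and value
    query: URI.encode_www_form(params)
  )
end
```

The result can be passed straight to Net::HTTP::Post.new. For example, narakeet_tts_uri(voice: 'hannah', speed: 1.2) produces a URL ending in /text-to-speech/mp3?voice=hannah&voice-speed=1.2.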