Modify pronunciation using IPA and X-SAMPA alphabets (experimental)

You can now control how words are pronounced using the IPA and X-SAMPA phonetic alphabets. This works with many voices, and helps to resolve ambiguity, set the stress on a specific syllable, or ensure that invented and rare words are read correctly.


Using IPA and X-SAMPA

You can use the ipa and x-sampa properties on a narration span to specify that the enclosed word or phrase should not be interpreted as normal text, but as phonemes written in the IPA or X-SAMPA alphabet. See below for some interesting examples.
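
As a minimal sketch of the span syntax, the two lines below transcribe the same word, "tomato", once in IPA and once in X-SAMPA. The `{ipa}` form matches the examples further down this page. The `{x-sampa}` form is an assumption based on the property name mentioned above, so try it in the preview with your chosen voice before relying on it.

```markdown
[təˈmeɪ.toʊ]{ipa}

[t@"meI.toU]{x-sampa}
```

X-SAMPA writes the same phonemes using plain ASCII characters, for example `@` for ə and `"` for the primary stress mark ˈ, which can be easier to type.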

Support for phonetic alphabets is highly experimental, so it is very important to test how the chosen voice reads out phonemes. Individual voices may not support some phonemes, and different voices support different phonetic alphabets.

We strongly suggest that commercial users try out specific combinations of phonemes and voices using the preview feature. This lets you experiment without spending any credits or getting billed.

Switch an accent

Text-to-speech voices are usually trained for a single accent. For example, we have a range of British English and American English voices. You can use IPA phonemes to change how a word is read out, to temporarily switch an accent. Here is a snippet that makes Victoria (a British voice) read out the word “tomato” as an American would.

(voice: victoria)

Americans say [təˈmeɪ.toʊ]{ipa} instead of tomato.

Resolve ambiguity

Neural network text-to-speech voices, such as the ones Narakeet uses, try to guess how to pronounce a word from the context around it. Sometimes, if there is not enough context and the word is ambiguous, a voice can choose a different pronunciation from the one you would like. You can use IPA symbols to specify how to pronounce a word. For example, this snippet shows two ways to pronounce the word “delegate”:

(voice: victoria)

[ˈdɛlɪˌɡeɪt]{ipa} is a verb

[ˈdɛləɡət]{ipa} is a noun
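
The same spans can also sit inside a longer sentence, which is where ambiguity usually arises. A minimal sketch, reusing the two transcriptions above; the sentence itself is only an illustration, and the result should be checked with the voice you intend to use:

```markdown
(voice: victoria)

Please [ˈdɛlɪˌɡeɪt]{ipa} this task to the new [ˈdɛləɡət]{ipa}.
```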

Change the stress

You can use IPA stress marks to change the emphasised syllable inside a word. This is useful for creating vocal inflections, for resolving ambiguity between words that are spelled the same but change meaning depending on the stressed syllable, or just for adding a funny accent. Here is an example of the same voice reading the word “question” but emphasising different syllables:

(voice: victoria)
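
A minimal sketch of how the rest of such a script might look, continuing with the voice selected above. The transcriptions are illustrative assumptions rather than tested output: the first span keeps the usual stress on the first syllable, and the second moves the primary stress mark (ˈ) in front of the second syllable.

```markdown
Is that a [ˈkwɛstʃən]{ipa}

or a [kwɛsˈtʃən]{ipa}
```

Only the position of the stress mark differs between the two spans, so any change in how the word sounds comes from the stress alone.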



Fix pronunciation for invented/unusual words

Text-to-speech voices provided by Narakeet are trained on a large set of data, and know how to pronounce most dictionary words. However, just as a human would struggle with something unusual, AI voices may sometimes misread invented words or place names from around the world. Phonetic alphabets are a useful tool in this case, as they can force the voice to read a word the way you expect.
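
As a minimal sketch, the snippet below spells out the English place name Worcester, whose usual British pronunciation is quite different from its spelling. The place name and the sentence are our own illustration, and whether a given voice would misread the plain spelling is an assumption to check in the preview.

```markdown
(voice: victoria)

The next train to [ˈwʊstə]{ipa} leaves at noon.
```

The same approach applies to invented product or character names: write the pronunciation you want inside the span instead of the spelling.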

