December 12, 2016
Text to Speech with Polly and Voice RSS
Our Karotz lost its voice on the 8th of December, but we found new voices!
Maybe you haven't noticed, but Karotz lost its voice on the 8th of December. You might have read our previous blog post, "How your rabbit speaks (using Text To Speech)". I wrote it more than a year ago to explain how Karotz speaks. It cannot speak by itself; it needs a cloud service to turn text into speech.
Until the 8th of December, Karotz used Text To Speech (TTS) from the Acapela group. On this day, we heard a voice telling us that this wasn't allowed anymore. So we needed an alternative.
Some of the apps in the TimeButton Appstore also use TTS. We changed these apps to use a brand new service from Amazon called "Amazon Polly". Polly is an Amazon AI (Artificial intelligence) service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Requests to Amazon Polly are managed server-side, so our webserver makes the request, and streams the audio back to your Karotz.
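To give an idea of what such a server-side request can look like, here is a minimal Python sketch. It is not our production code. It assumes the boto3 library (the AWS SDK for Python) and its `synthesize_speech` call; the function name `synthesize_mp3` and the voice "Joanna" are only chosen for illustration.

```python
def synthesize_mp3(polly_client, text, voice_id="Joanna"):
    """Ask Polly to turn text into speech and return the MP3 bytes.

    polly_client is expected to be boto3.client("polly"); it is passed in
    so the function can also be tried with a stand-in object.
    The voice "Joanna" is an assumption for this example.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object that holds the generated audio.
    return response["AudioStream"].read()

# Possible usage (needs boto3 and configured AWS credentials):
#   import boto3
#   audio = synthesize_mp3(boto3.client("polly"), "Hello, I am your rabbit.")
#   open("speech.mp3", "wb").write(audio)
```

Passing the client in as a parameter keeps the AWS credentials on the server, which is the point of handling the requests server-side.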
This is the flow from the beginning of a request until your Karotz starts talking (see also the image below):
- Karotz asks the Free Rabbits server: "please send me content for app 3013".
- The server will look up app 3013, and will notice that it is the "Temperature App", which tells the temperature for the next 5 days using TTS.
- The server asks the weather service for the temperature for the next 5 days for a specific location.
- The weather service responds by sending the information to the server in XML format.
- The server creates a readable text, and asks Polly to translate this text into speech.
- Polly responds with an MP3 audio response.
- The server creates a unique URL and sends it to Karotz, telling Karotz to play this URL using MPlayer. MPlayer is the media player that is running on your Karotz to play media from the web.
- Karotz uses MPlayer to stream the MP3 and plays it.
- You'll hear the temperature for the next 5 days.
This works fine, although the speech is a little quiet. We will try to fix this later, if it can be fixed.
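The server-side part of the flow above can be sketched in a few lines of Python. This is only an illustration under assumptions: the XML layout (`<day name="..." temp="..."/>`), the host in `BASE_URL`, and the names `forecast_to_text` and `handle_request` are all made up for this example and do not describe the real server or weather service.

```python
import uuid
import xml.etree.ElementTree as ET

# Hypothetical host for the generated audio, not the real server address.
BASE_URL = "http://example.invalid/audio"


def forecast_to_text(weather_xml):
    """Turn the weather XML into a sentence that can be read aloud.

    The XML layout used here is an assumption for this sketch.
    """
    root = ET.fromstring(weather_xml)
    parts = []
    for day in root.findall("day"):
        parts.append("{}: {} degrees".format(day.get("name"), day.get("temp")))
    return "The temperature for the coming days. " + ". ".join(parts) + "."


def handle_request(app_id, fetch_weather_xml, synthesize, audio_store):
    """Look up the app, fetch the weather, create speech, return a unique URL."""
    if app_id != 3013:  # this sketch only knows the Temperature App
        raise ValueError("unknown app")
    text = forecast_to_text(fetch_weather_xml())  # ask the weather service
    mp3_bytes = synthesize(text)                  # ask the TTS service
    token = uuid.uuid4().hex                      # unique name for this audio
    audio_store[token] = mp3_bytes
    return "{}/{}.mp3".format(BASE_URL, token)    # URL that Karotz plays
```

The weather lookup and the speech synthesis are passed in as functions, so the same flow works no matter which TTS service sits behind `synthesize`.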
Apps in the TimeButton Appstore are working now. But what about OpenKarotz? Will it ever speak again?
We also found a workaround for this problem:
We created a Python script you can use in combination with Voice RSS. Voice RSS is also a Text To Speech service, offering a free plan with a maximum of 350 requests per day. On VoiceRss.org, you can create a free account. It will give you a personal API key. You will need this key together with our script to let your Karotz talk using OpenKarotz. Follow these steps:
- Create an account on http://www.voicerss.org/
- Save your API key.
- Download http://www.freerabbits.nl/downloads/karotz/tts.zip or download the source from GitHub.
- Unzip the file.
- Open the extracted script in a text editor and replace "your-key-here" with your API key (line 9, don't remove the quotes).
- Save the file, and make sure it uses Unix line endings.
- Upload it to your Karotz to /www/cgi-bin and replace the original file. You can also remove tts.inc now.
- Test TTS using the web interface of OpenKarotz.
- If it doesn't work, run the commands "dos2unix /www/cgi-bin/tts" and "chmod 755 /www/cgi-bin/tts" on your Karotz and try again.
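If your editor saved the script with Windows line endings, you can also convert it on your PC before uploading. The following is a small Python sketch that does roughly what dos2unix does; the function names are made up for this example, and you still need the chmod command on the Karotz itself.

```python
def has_windows_line_endings(path):
    """Return True if the file contains CRLF (Windows) line endings."""
    with open(path, "rb") as f:
        return b"\r\n" in f.read()


def to_unix_line_endings(path):
    """Rewrite the file in place with LF (Unix) line endings."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data.replace(b"\r\n", b"\n"))
```

Calling `to_unix_line_endings("tts")` in the folder where you unzipped the download should be enough, assuming the script is named tts as on the Karotz.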
Although the quality is not as good as Polly's, this might be a solution you can live with.
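If you are curious what a request to Voice RSS looks like, here is a minimal Python 3 sketch. It is not the script from the download. It assumes the HTTP API at api.voicerss.org with the parameters `key`, `hl` (language), `src` (text) and `c` (audio codec), as we understand the Voice RSS documentation; the function names are only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Voice RSS HTTP API.
API_ENDPOINT = "http://api.voicerss.org/"


def build_tts_url(api_key, text, language="en-us"):
    """Build the request URL for one piece of text."""
    params = {
        "key": api_key,   # your personal API key
        "hl": language,   # language of the voice
        "src": text,      # the text to speak
        "c": "MP3",       # ask for MP3 audio
    }
    return API_ENDPOINT + "?" + urlencode(params)


def fetch_speech(api_key, text, language="en-us"):
    """Download the audio for the text (uses one of your daily requests)."""
    with urlopen(build_tts_url(api_key, text, language)) as response:
        return response.read()
```

For example, `build_tts_url("your-key-here", "Hello rabbit")` gives a URL you can open in a browser to hear the result, once you have replaced the placeholder with your own key.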
We hope this will help you to let your Karotz talk again. Let us know what you think!