Make Alexa speak any language with a little help from Home Assistant

Michał Konieczny
5 min read · Oct 23, 2020


If you are not lucky enough to have native support for your language in the Alexa service, you may find this interesting.

Well, it won’t be a perfect solution because it has many limitations, but

  • if you are looking for a way to make Alexa report the current state of an entity, e.g. in your automations,
  • or you just want to use an Echo as a TTS speaker for fun,

just follow these fairly simple steps. :)

TL;DR

A Google TTS service running on AppDaemon, connected to a custom Alexa skill.

What is needed?

Alexa Media Player

Make sure you have Alexa Media Player installed and configured.

This is a custom component to allow control of Amazon Alexa devices in Homeassistant using the unofficial Alexa API.

Detailed installation instructions can be found on the Alexa Media Player GitHub.

AppDaemon

Then you’ll need an AppDaemon instance.

AppDaemon is a loosely coupled, multithreaded, sandboxed python execution environment for writing automation apps for Home Assistant home automation software.

The easiest way is to install it through the Add-on store.
Just go to Supervisor -> Add-on store -> search for AppDaemon 4 and install it.
Then proceed to the Configuration tab and fill in the config accordingly:

AppDaemon configuration page
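For reference, the relevant part of the add-on configuration could look roughly like this. It assumes the script shown later uses the gTTS Python package and ffmpeg for re-encoding; the exact keys depend on your version of the AppDaemon add-on:

```yaml
# AppDaemon 4 add-on configuration (sketch)
system_packages:
  - ffmpeg        # used to re-encode the generated MP3s
python_packages:
  - gTTS          # Google Text-to-Speech client used by the app
init_commands: []
```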

This will install all external dependencies for our AppDaemon instance.

All set! Let’s write some code!

Python script on AppDaemon

First, create a tts directory under the /config/www path. This is where we will store the audio files generated by the TTS service. It is important to keep them in www because that directory is publicly available at https://abcdefghijklmnoprstuwxyz0987654321.ui.nabu.casa/local

WARNING!!!
I’m using Nabu Casa HA Cloud because it’s fairly cheap and provides Amazon-certified HTTPS. You can use another way to make your HA instance available on the internet, but make sure to use an SSL certificate signed by an Amazon-approved authority.

The MP3 files you use to provide audio must be hosted on an endpoint that uses HTTPS. The endpoint must provide an SSL certificate signed by an Amazon-approved certificate authority. Many content hosting services provide this.

Then go to /config/appdaemon/apps and create a file alexa_google_tts.py with the following content:
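Here is a minimal sketch of such an app. It assumes gTTS for speech generation, ffmpeg for re-encoding, and illustrative argument names (text_input, text_output, play_script) plus a fixed latest.mp3 copy for the skill to fetch; adapt all of these to your own setup:

```python
import hashlib
import os
import shutil
import subprocess

import appdaemon.plugins.hass.hassapi as hass
from gtts import gTTS

TTS_DIR = "/config/www/tts"  # served publicly under <your HA URL>/local/tts


class AlexaGoogleTTS(hass.Hass):
    def initialize(self):
        # Entity IDs and the script name come from apps.yaml (see below)
        self.text_input = self.args["text_input"]
        self.text_output = self.args["text_output"]
        self.play_script = self.args["play_script"]
        self.listen_state(self.on_text_change, self.text_input)
        # The "output" entity is only there to trigger playback when it changes
        self.listen_state(self.on_output_change, self.text_output)

    def on_text_change(self, entity, attribute, old, new, kwargs):
        if not new:
            return
        filename = self.generate_speech(new)
        self.call_service("input_text/set_value",
                          entity_id=self.text_output, value=filename)

    def on_output_change(self, entity, attribute, old, new, kwargs):
        if new and new != old:
            # Launch the custom Alexa skill through the HA script (next sections)
            self.call_service("script/turn_on", entity_id=self.play_script)

    def generate_speech(self, text):
        # Cache by MD5 of the text so repeated phrases reuse the same MP3
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        raw_path = os.path.join(TTS_DIR, f"{digest}_raw.mp3")
        out_path = os.path.join(TTS_DIR, f"{digest}.mp3")
        if not os.path.isfile(out_path):
            # The language is hardcoded here; replace "pl" with your language code
            gTTS(text=text, lang="pl").save(raw_path)
            # Re-encode to 48 kbps / 16 kHz, a safe fallback for older Echo devices
            subprocess.run(["ffmpeg", "-y", "-i", raw_path, "-codec:a", "libmp3lame",
                            "-b:a", "48k", "-ar", "16000", out_path], check=True)
            os.remove(raw_path)
        # Keep the latest clip under a fixed name so the skill always knows what to play
        shutil.copyfile(out_path, os.path.join(TTS_DIR, "latest.mp3"))
        return f"{digest}.mp3"
```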

See the comments in the listing above, just in case.

The script uses the gTTS package; you can read more here.

The language setting is located on line 27. It is hardcoded for simplicity, because my use case doesn’t require changing the language dynamically; however, it could also be stored in another input_text or input_select entity.

The sound encoder forces a 16 kHz sample rate (line 34) as a safe fallback for older devices. Do some testing to find out whether your device is capable of playing 24 kHz and adjust that setting accordingly. One of my three Echo Dots (gen. 3) failed, so I decided to go with 16 kHz for all of them.

More information about audio encoding for Skills can be found in the SSML markup reference.

Output files are cached for reuse. If exactly the same text is entered again (identified by an MD5 hash of the text input), the TTS service is not called again, which saves request quota and bandwidth.

Next, edit /config/appdaemon/apps/apps.yaml to look like this:
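A minimal configuration matching the sketch above might look like this. The input_text entity IDs and the script name are placeholders and must match what you create in Home Assistant later:

```yaml
alexa_google_tts:
  module: alexa_google_tts
  class: AlexaGoogleTTS
  text_input: input_text.alexa_tts          # main text input for the speech service
  text_output: input_text.alexa_tts_output  # temporary output that triggers playback
  play_script: script.alexa_google_tts_play # HA script that launches the Alexa skill
```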

You should be able to access the AppDaemon admin dashboard on port 5050 and see your new app listed there.

Done! Let’s go to the next step.

Alexa Custom Skill

Go to the Amazon Alexa console to create a new skill. If you don’t have an account, register one; it’s free.

Type the desired skill name… but it doesn’t really matter. Keep the default language. Just ensure the Custom model and Alexa-Hosted (Python) options are selected. Set the hosting region (top-right corner) according to your location to get better responsiveness of the service.

On the second step, select the Fact Skill template.

Creating your skill will take a minute or so. Finally, go to the newly created skill’s details page, then proceed to the Code tab and replace lambda_function.py with the code below.

I adapted the lambda code from another use case described here: Play a local mp3 file on Alexa Echo dot.
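As a rough sketch, a lambda_function.py along those lines could look like the following. It assumes the AppDaemon app keeps the most recent clip at a fixed HTTPS URL (here /local/tts/latest.mp3 behind your Nabu Casa hostname, which is an assumption of this sketch) and simply plays that file whenever the skill is launched:

```python
# lambda_function.py -- minimal sketch using the ASK SDK for Python
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type

# Assumption: the AppDaemon app copies the latest clip to this fixed, HTTPS URL
AUDIO_URL = "https://abcdefghijklmnoprstuwxyz0987654321.ui.nabu.casa/local/tts/latest.mp3"


class LaunchRequestHandler(AbstractRequestHandler):
    """Play the pre-generated MP3 when the skill is launched via play_media."""

    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        speech = f'<audio src="{AUDIO_URL}"/>'
        return (handler_input.response_builder
                .speak(speech)
                .set_should_end_session(True)
                .response)


sb = SkillBuilder()
sb.add_request_handler(LaunchRequestHandler())
lambda_handler = sb.lambda_handler()
```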

Press Save, then Deploy. After some time your new custom Skill will be announced ready to use. Copy the Skill ID from the current web page URL (you’ll spot it for sure), or go back to the dashboard and click Copy Skill ID in the Alexa Skills listing. Save it somewhere; we’ll need it soon.

Home Assistant configuration

Now we have to create all the HA entities required for our service to run. You may have already noticed the two input_text references in the apps.yaml file. The first is the main text input for the speech service. The second is a temporary output used to trigger the Alexa skill execution on change. Additionally, an input_select listing all of your Echos may be useful if you own more than one device.
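For example, something like this (the entity IDs and the Echo entity names in the options list are placeholders matching the earlier sketches, so adjust them to your own devices):

```yaml
input_text:
  alexa_tts:
    name: Alexa Google TTS
    max: 255
  alexa_tts_output:
    name: Alexa Google TTS output
    max: 255

input_select:
  alexa_tts_device:
    name: Alexa TTS device
    options:
      - media_player.living_room_echo
      - media_player.bedroom_echo
```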

Make sure to paste it into the correct section of your configuration.yaml file, otherwise it won’t work and may break other things. Home Assistant doesn’t merge duplicate config sections, so the last occurrence will replace the previous ones.

Then create the script. Edit scripts.yaml with:
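Something along these lines should do. The script name and input_select entity are the placeholders used earlier, the skill ID is a dummy value, and depending on your Home Assistant version you may need data_template instead of data for the templated entity_id:

```yaml
alexa_google_tts_play:
  alias: Alexa Google TTS play
  sequence:
    - service: media_player.play_media
      data:
        # Pick the Echo selected in the input_select; replace with a plain
        # entity_id if you only want to target a single device
        entity_id: "{{ states('input_select.alexa_tts_device') }}"
        media_content_type: skill
        # Replace with the Skill ID you copied earlier
        media_content_id: amzn1.ask.skill.00000000-0000-0000-0000-000000000000
```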

This script will basically just trigger your custom skill using the play_media service on the selected Echo device. You can replace the entity_id template with the plain entity name of your Echo if you want to use it with a single Echo only.
Remember to replace media_content_id with your Skill ID!

Lovelace UI elements

Sample use

The selected Echo device will say whatever you type into the alexa google tts text input.
Since our logic triggers on input_text change, it can be started from any other place, not necessarily manually. E.g. you can implement a standard HA script that sets the entity state to trigger our TTS, use it in an automation, and so on.

To mimic the result above, just create a new tab, add an entities list with those inputs and mark it as panel mode if you wish.
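If you manage Lovelace in YAML mode, a view along these lines (using the placeholder entity IDs from the earlier steps) would do the job:

```yaml
views:
  - title: Alexa TTS
    panel: true           # the single card fills the whole tab
    cards:
      - type: entities
        title: Alexa Google TTS
        entities:
          - input_text.alexa_tts
          - input_select.alexa_tts_device
```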

I hope you find this article helpful!


Written by Michał Konieczny

Software engineer, maker, hacker
