Extension:Wikispeech/Installing Speechoid

Speechoid is the text-to-speech backend of Wikispeech. It consists of a number of services controlled via Wikispeech-server. The Speechoid build pipeline uses Blubber to create Docker images, which are meant to be deployed on, for instance, a Kubernetes cluster. There is also a Compose project meant for running locally, on development servers, and on small installations.

Please report any errors you encounter on the discussion page of this wiki page.

= Prerequisites =

This guide is based on the setup used by the Wikispeech development team, which uses Ubuntu (18, 19 and 20) on workstations and Debian (10) on servers.

Docker
https://docs.docker.com/engine/install/

Docker compose
https://docs.docker.com/compose/install/

Blubber
https://wikitech.wikimedia.org/wiki/Blubber/Download
We strongly recommend using the prebuilt binary.

Make sure it's available in your $PATH, e.g. by copying it to /usr/local/bin.
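
A quick way to confirm that the prerequisites are in place is to check that each tool resolves on $PATH. The following is a minimal sketch and not part of the Wikispeech tooling: the function name check_tools is hypothetical, and the command names docker, docker-compose and blubber are assumptions that should match how the tools were installed on your machine.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are missing from $PATH.
# Returns 0 if all were found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (command names are assumptions; adjust to your installation):
#   check_tools docker docker-compose blubber
```

If any tool is reported as missing, revisit the corresponding installation link above before building the services.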

= Building services =

Using the build script
The docker-compose project comes with a script that will download and build all services for you.

git clone https://github.com/karlwettin/wikispeech-docker-compose.git
cd wikispeech-docker-compose
./create-all-images.sh

Manually
Each service lives in its own git repository at gerrit.wikimedia.org:

git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/mary-tts"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/mishkal"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/pronlex"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/symbolset"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/wikispeech-server"

Each service contains a Blubber helper script that prepares the image.

cd some-service
./blubber-build.sh
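
The manual steps above can be combined into a loop over the five repositories. This is a sketch under two assumptions: that each clone lands in a directory named after the service, and that blubber-build.sh is run from the repository root. The function names repo_url and build_all are hypothetical.

```shell
#!/bin/sh
# Services listed in this guide; each has its own repository on Gerrit.
SERVICES="mary-tts mishkal pronlex symbolset wikispeech-server"
GERRIT_BASE="https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech"

# Hypothetical helper: print the clone URL for one service.
repo_url() {
    echo "$GERRIT_BASE/$1"
}

# Hypothetical helper: clone and build every service, stopping at the first failure.
build_all() {
    for service in $SERVICES; do
        git clone "$(repo_url "$service")" || return 1
        # Run the build in a subshell so the working directory is restored afterwards.
        (cd "$service" && ./blubber-build.sh) || return 1
    done
}

# Usage:
#   build_all
```

Stopping at the first failure keeps a broken clone or build from being hidden by the output of later services.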

= Starting Speechoid =

Using compose
The easiest way is to spin it up using docker-compose:

git clone https://github.com/karlwettin/wikispeech-docker-compose.git
cd wikispeech-docker-compose
./run.sh

After a little while Wikispeech server should be available as an HTTP service on port 10000.
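
Because the services take a moment to start, a script that depends on Speechoid may want to poll the port before continuing. The following is a minimal sketch, assuming curl is installed and that Wikispeech-server answers HTTP requests at http://localhost:10000/ (the exact path is an assumption). The names probe and wait_for_http are hypothetical.

```shell
#!/bin/sh
# Hypothetical probe: succeeds once anything answers HTTP at the given URL.
# Without --fail, curl exits 0 on any HTTP response, including error statuses.
probe() {
    curl --silent --output /dev/null --max-time 5 "$1"
}

# Hypothetical helper: poll a URL until probe succeeds.
# $1 = URL, $2 = maximum attempts (default 30), $3 = seconds between attempts (default 2).
wait_for_http() {
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if probe "$url"; then
            echo "up after $i attempt(s)"
            return 0
        fi
        # Do not sleep after the final attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "gave up after $attempts attempt(s)"
    return 1
}

# Usage:
#   wait_for_http http://localhost:10000/
```

The probe only establishes that something responds on the port. It does not verify that every Speechoid service behind Wikispeech-server is ready.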

Manually
At the time of writing, there is no ready-to-go pipeline for starting Speechoid in other environments. When ready for production, Speechoid will be deployed on a Kubernetes cluster. Expect documentation about that here.

= Setting up on WMF Cloud VPS =

See also the instructions for preparing your Cloud VPS instance.

At the time of writing, there is a problem with the WMF Cloud VPS Debian 10 Buster images related to file system character encoding and aptitude. This causes problems when installing the ca-certificates-java package, which in turn makes it impossible to download the Gradle wrapper and the dependencies for Mary TTS. Rather than building the Docker images on the Cloud VPS instance, you must therefore build them on your local machine, export and compress them, upload them to your Cloud VPS instance, and load them there:

docker save wikispeech-mary-tts:latest | gzip > wikispeech-mary-tts:latest.tar.gz
docker save wikispeech-mishkal:latest | gzip > wikispeech-mishkal:latest.tar.gz
docker save wikispeech-pronlex:latest | gzip > wikispeech-pronlex:latest.tar.gz
docker save wikispeech-symbolset:latest | gzip > wikispeech-symbolset:latest.tar.gz
docker save wikispeech-server:latest | gzip > wikispeech-server:latest.tar.gz
scp -o "ProxyJump you@primary.bastion.wmflabs.org" ./wikispeech-mary-tts:latest.tar.gz you@your-instance.wmflabs:~/
...
ssh -J you@primary.bastion.wmflabs.org you@your-instance.wmflabs
docker load -i wikispeech-mary-tts:latest.tar.gz
...
cd wikispeech-docker-compose
./run.sh

The ./ prefix on the scp source keeps scp from interpreting the colon in the file name as a host separator.
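
The export step repeats the same command for each of the five images, so it lends itself to a loop. The following is a sketch under the assumptions that all images carry the latest tag, as in the commands above, and that the archives are written to the current directory. The names archive_name and save_all are hypothetical. Unlike the piped form, it writes the tarball first and compresses it afterwards, so that a failing docker save is not masked by the exit status of gzip.

```shell
#!/bin/sh
# Image names used in this guide.
IMAGES="wikispeech-mary-tts wikispeech-mishkal wikispeech-pronlex wikispeech-symbolset wikispeech-server"

# Hypothetical helper: print the archive file name for an image,
# following the naming used in this guide.
archive_name() {
    echo "$1:latest.tar.gz"
}

# Hypothetical helper: export and compress every image into the current directory,
# stopping at the first failure.
save_all() {
    for image in $IMAGES; do
        archive=$(archive_name "$image")
        # Write the uncompressed tarball first (archive name minus the .gz suffix).
        docker save --output "${archive%.gz}" "$image:latest" || return 1
        # Compress in place; -f overwrites an archive left over from an earlier run.
        gzip -f "${archive%.gz}" || return 1
    done
}

# Usage:
#   save_all
```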

Set up web proxy
You'll now need to set up an HTTP proxy that passes requests to Speechoid on port 10000. Alternatively, you could set up a floating IP, but that isn't covered by this guide.

Go to https://horizon.wikimedia.org/project/proxy/ and click the Create Proxy button. Fill in the information about your VPS and use backend port 10000. Speechoid should now be publicly available over HTTP and HTTPS on ports 80 and 443.