
Extension:Wikispeech/Installing Speechoid

From mediawiki.org

Speechoid is the text-to-speech backend of Wikispeech. It consists of a number of services controlled via Wikispeech-server. The Speechoid build pipeline uses Blubber to create Docker images, which are meant to be deployed on, for instance, a Kubernetes cluster. There is also a Compose project for easy deployment on development servers and small installations.

Please report any errors you encounter on the discussion page of this wiki page.

Prebuilt images

All the Speechoid services are a part of the Wikimedia CI pipeline, and all builds are available in the Wikimedia docker registry. You probably don't need to build your own images.

Building images

Requirements

This guide is based on the setup used by the Wikispeech development team, which uses Ubuntu (18, 19 and 20) on workstations and Debian (10) on the server side.

  • Docker
  • Blubber - We strongly recommend using the prebuilt binary. Make sure it's available in your $PATH, e.g. by copying it to /usr/local/bin.
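
Before building, it can help to confirm that both requirements are actually reachable from your shell. The following is a minimal sketch; it assumes the Blubber binary is installed under the name blubber, and check_tools is a hypothetical helper name, not part of any Speechoid repository.

```shell
#!/bin/sh
# Report which of the given tools are missing from $PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The two requirements listed above (binary name "blubber" is an assumption).
check_tools docker blubber || echo "Install the missing tools before building."
```

If nothing is printed, both tools were found in $PATH.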

Using the build script

The docker-compose project comes with a script that will download and build all services for you.

git clone https://github.com/Wikimedia-Sverige/wikispeech-speechoid-docker-compose.git
cd wikispeech-speechoid-docker-compose
./create-all-images.sh

Manually building images

Each service lives in its own git repository at gerrit.wikimedia.org:

git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/mary-tts"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/mishkal"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/pronlex"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/symbolset"
git clone "https://gerrit.wikimedia.org/r/mediawiki/services/wikispeech/wikispeech-server"

Each service contains a Blubber helper script that prepares the image.

cd some-service
./blubber-build.sh
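
If you cloned all the repositories into one directory, the per-service step above can be repeated in a loop. This is only a sketch: build_all_services is a hypothetical helper, and it assumes each checkout keeps the directory name of its repository and has an executable blubber-build.sh at its top level.

```shell
#!/bin/sh
# Run blubber-build.sh in every named service checkout below a base directory.
# Checkouts without an executable script are skipped with a note on stderr;
# the function stops and returns non-zero at the first build that fails.
build_all_services() {
    base_dir=$1
    shift
    for service in "$@"; do
        if [ ! -x "$base_dir/$service/blubber-build.sh" ]; then
            echo "skipping $service: no executable blubber-build.sh" >&2
            continue
        fi
        echo "building $service"
        ( cd "$base_dir/$service" && ./blubber-build.sh ) || return 1
    done
}

# Usage, assuming the five repositories were cloned into the current directory:
# build_all_services . mary-tts mishkal pronlex symbolset wikispeech-server
```

Each build runs in a subshell, so your working directory is unchanged afterwards.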

Starting Speechoid

Using compose

The easiest way is to spin it up using Docker Compose:

git clone https://github.com/Wikimedia-Sverige/wikispeech-speechoid-docker-compose
cd wikispeech-speechoid-docker-compose
docker compose up

If you built your own images, you'll need to update the compose file so that it matches your image build tags.

After a little while Wikispeech server should be available as an HTTP service on port 10000.
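
Since startup takes a while, you may want to poll the port rather than guess when the server is ready. The sketch below assumes curl is installed and that Speechoid runs on the same machine (localhost); wait_for_http is a hypothetical helper, and the root path, attempt count and delay are illustrative defaults rather than documented values.

```shell
#!/bin/sh
# Poll a URL until it answers or the attempts run out.
# Any HTTP response counts as "up"; only connection failures are retried.
wait_for_http() {
    url=$1
    attempts=${2:-30}   # how many times to try (illustrative default)
    delay=${3:-10}      # seconds to wait between tries (illustrative default)
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl --silent --output /dev/null --max-time 5 "$url"; then
            echo "up after $i attempt(s): $url"
            return 0
        fi
        # Do not sleep after the final failed attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    echo "no response from $url" >&2
    return 1
}

# Usage, assuming Speechoid was started on this machine:
# wait_for_http http://localhost:10000/
```

The function only checks that something answers HTTP on the port; it does not verify that speech synthesis itself works.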

Manually starting Speechoid

At the time of writing, there is no ready-to-go pipeline for starting Speechoid in other environments. When ready for production, Speechoid will be deployed on a Kubernetes cluster.

Setting up on WMF Cloud VPS

These instructions are for Debian 10 on WMF Cloud VPS, but they should be generic enough to reuse in most environments.

Setup as systemd service

The following will automatically start Speechoid on boot. Change WorkingDirectory to match your MediaWiki path.

# sudo su
# cat << EOF > /etc/systemd/system/speechoid.service
[Unit]
Description=Speechoid service
Requires=docker.service network-online.target
After=docker.service network-online.target

[Service]
WorkingDirectory=/var/www/html/w/extensions/Wikispeech/dev/speechoid-docker-compose
Type=simple
TimeoutStartSec=15min
Restart=always
ExecStart=/usr/bin/docker compose up
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target
EOF

# systemctl enable speechoid
# systemctl start speechoid

Setup web proxy

This is only needed if you install MediaWiki and Speechoid on separate servers.

You'll now need to set up an HTTP proxy pass to Speechoid on port 10000. Alternatively, you could set up a floating IP, but that isn't covered by this guide.

Go to https://horizon.wikimedia.org/project/proxy/ and click the Create Proxy button. Fill in the information about your VPS and use backend port 10000. Speechoid should now be publicly available over HTTP and HTTPS on ports 80 and 443.

To enable transcription preview on Special:EditLexicon, the Symbolset server also needs to be accessible. Set up a second proxy in the same way as above, using backend port 8771.

Storing Docker data on a volume

Newer Docker images for Speechoid take up more space and may not fit on the standard instance hard drive. Instead, you can add a volume and store some of the Docker data on it.

Follow mw:Help:Adding disk space to Cloud VPS instances to add and attach a volume. In testing, 15 GB was not enough space, but 20 GB seems to be. This could change in the future.
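
To confirm how large the attached volume really is once mounted, you can read its size from df. This is a rough sketch: fs_size_gb is a hypothetical helper, the mount point /srv in the usage comment is an assumption about where you mounted the volume, and the reported size will be slightly below the nominal volume size because of filesystem overhead.

```shell
#!/bin/sh
# Print the total size, in whole GiB (rounded down), of the filesystem
# that holds the given path.
fs_size_gb() {
    # df -P gives POSIX-format output and -k uses 1024-byte blocks;
    # on the second line, column 2 is the total size.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $2 / 1024 / 1024 }'
}

# Usage, assuming the volume is mounted at /srv:
# size=$(fs_size_gb /srv)
# [ "$size" -le 15 ] && echo "volume is ${size} GiB, likely too small"
```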

Next, follow the instructions in "Configure the data directory location". The important part is the containerd path, which takes up the most space; you can skip the part about data-root. Then restart the service with sudo systemctl restart containerd.service. If you get an error after restarting and you included version = 2 from the instructions, remove that line.