Extension:Wikispeech

The Wikispeech project aims to create an open-source text-to-speech tool that makes Wikimedia's projects more accessible to people who have difficulty reading, whatever the reason. Wikispeech will be available as a MediaWiki extension. More information can be found on the project page; this page covers only the Wikispeech extension itself.

Speechoid


Documentation

The extension uses a service called Speechoid for TTS operations, such as creating audio for utterances. Speechoid consists of a main server, a lexicon server, TTS engines and any additional components that may be required for certain languages.

To prepare an utterance for playing, the extension sends a request to the service. This request contains the utterance as text, the language it is in and the voice to use. The service processes the text using a lexicon and one of the installed TTS engines, depending on which voice is used. Once the audio has been generated, the service returns a response containing a URL to the audio file, along with information that enables highlighting and skipping. The extension uses this response to play the utterance to the user, and the process is repeated for the following utterances as needed.
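
The request and response cycle described above can be sketched roughly as follows. This is only an illustration and not the extension's actual code: the field names (`input`, `lang`, `voice`, `audio`, `tokens`, `orth`, `endtime`), the voice name and the example URL are all assumptions made for this sketch and should be checked against the Speechoid API before being relied on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """One piece of the utterance and the time (in seconds) at which it ends."""
    text: str
    end_time: float


def build_request(text: str, language: str, voice: str) -> dict:
    """Collect what the extension sends: the text, its language and the voice.

    The key names are assumptions made for this sketch.
    """
    return {"input": text, "lang": language, "voice": voice}


def parse_response(response: dict) -> tuple:
    """Return the audio URL and the timing information from a response.

    The response layout is assumed here, not taken from the Speechoid API.
    """
    tokens = [Token(t["orth"], t["endtime"]) for t in response["tokens"]]
    return response["audio"], tokens


def token_at(tokens: list, position: float) -> Optional[Token]:
    """Return the token being spoken at `position` seconds, e.g. for highlighting."""
    for token in tokens:
        if position < token.end_time:
            return token
    return None  # Playback has moved past the last token.
```

With a response in this assumed shape, `token_at` gives the word to highlight at a given playback position, and the `end_time` values give the points to jump to when skipping.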

Main Wikispeech Server
Repo

The main server has a web API that includes an endpoint for generating speech. It handles internal communication between the underlying servers, listed below.

Pronlex
Repo

A lexicon server with its own API. Holds information about lexicon entries and has endpoints for lookup and manipulation of them. When processing an utterance, words are looked up in the lexicon and if there is a matching entry it is used for the pronunciation.
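
The lookup step can be pictured with the small sketch below. It is not Pronlex code: the lexicon is modelled as a plain dictionary, the transcription strings are placeholders rather than real symbol set output, and the `fallback` function (standing in for whatever the TTS engine does with words that have no entry) is an assumption of this sketch.

```python
from typing import Callable, Dict, List


def pronounce(
    words: List[str],
    lexicon: Dict[str, str],
    fallback: Callable[[str], str],
) -> List[str]:
    """Return one pronunciation per word, preferring matching lexicon entries."""
    pronunciations = []
    for word in words:
        entry = lexicon.get(word)
        if entry is not None:
            # A matching entry exists, so it decides the pronunciation.
            pronunciations.append(entry)
        else:
            # No entry: hand the word to the (hypothetical) fallback.
            pronunciations.append(fallback(word))
    return pronunciations


# Placeholder data for illustration only.
example_lexicon = {"wiki": "<transcription of wiki>"}
example_result = pronounce(
    ["wiki", "speech"],
    example_lexicon,
    fallback=lambda word: f"<engine default for {word}>",
)
```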

TTS engines
The server supports having multiple TTS engines. Which one is used for a certain utterance depends on which voice is given in the request.
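
As a rough illustration of choosing an engine from the voice, the sketch below maps voices to engines with a dictionary. The voice and engine names are hypothetical; a real installation defines its own, and the server's actual selection logic may differ.

```python
# Hypothetical voice and engine names, for illustration only.
ENGINE_BY_VOICE = {
    "example-voice-sv": "engine-a",
    "example-voice-en": "engine-a",
    "example-voice-ar": "engine-b",
}


def select_engine(voice: str) -> str:
    """Return the name of the engine registered for the requested voice."""
    try:
        return ENGINE_BY_VOICE[voice]
    except KeyError:
        raise ValueError(f"No TTS engine registered for voice {voice!r}") from None
```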

MaryTTS
Repo

Comes with support for Arabic, English and Swedish.

Mishkal
Repo

Used to vocalize Arabic text.

Symbolset


Symbolset is a repository for handling [[w:Phonetic_transcription|phonetic symbol sets]] and mappers/converters between different symbol sets and languages.

Installation

 * Download and place the file(s) in a directory called  in your   folder.
 * Run Composer to install PHP dependencies, by issuing  in the extension directory. (See T173141 for potential complications.)
 * Add the following code at the bottom of your : wfLoadExtension( 'Wikispeech' );
 * Done - Navigate to Special:Version on your wiki to verify that the extension is successfully installed.

Setting up Speechoid
The Wikispeech extension requires Speechoid to generate audio. The easiest way to install this is to use the Docker Compose version. The installation instructions can be found at https://github.com/stts-se/wikispeech_compose/tree/master/docker. For other ways to install the service, see http://stts-se.github.io/wikispeech/. Ports 80 and 10000 need to be opened for accessing audio files and sending requests to the service respectively. Detailed instructions for installing Speechoid on Cloud VPS can be found on /Speechoid on wmflabs.
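
After installation, a quick way to see whether the two ports are reachable is a plain TCP connection test, sketched below. It only checks that something accepts connections on each port, not that Speechoid is working correctly, and the host name in the usage comment is a placeholder.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Usage (replace the placeholder host with your own Speechoid host):
#   for port in (80, 10000):
#       print(port, port_is_open("speechoid.example", port))
```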

Basic configuration
For the Wikispeech extension to be able to communicate with Speechoid, you need to specify the service's URL. You can do this by adding the following line to : where  is the URL to your Speechoid instance.

CSS
This is a subset of the CSS rules: those that are most interesting for a non-developer.