User:Sebastian Berlin (WMSE)/tmp

Integration with TTS-server


Documentation

The extension uses a service for TTS operations, such as creating audio for utterances. This TTS server consists of a main server, a lexicon server, TTS engines and any additional components that may be required for certain languages.

The extension sends requests to the server when an utterance is prepared for reading. The server generates audio for the utterance and responds with the URL of an audio file and other information. This is then used when the text is read to the user.

To prepare an utterance for reading, the extension sends a request to the server. The request contains the utterance as text and the language it is written in. The server processes the text using one of the installed TTS engines, depending on which voice is chosen. Once the TTS engine has generated the audio, the server responds to the request with a URL to the audio file, along with information that enables highlighting and skipping.
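
The request and response described above could look roughly like the following Python sketch. It is only an illustration: the server URL, the parameter names (`lang`, `input`, `voice`) and the response fields (`audio`, `tokens`, `endtime`) are assumptions made for this example and are not taken from the server's actual API. The sample response is made up to show the kind of data that highlighting and skipping might rely on.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; replace with the address of a real TTS server.
SERVER_URL = "https://tts.example.org/"


def build_synthesis_request(text, lang, voice=None):
    """Build the URL for a synthesis request for one utterance.

    The parameter names are assumptions made for this sketch.
    """
    params = {"lang": lang, "input": text}
    if voice is not None:
        params["voice"] = voice
    return SERVER_URL + "?" + urlencode(params)


def parse_synthesis_response(body):
    """Extract the audio URL and per-token information from a JSON body."""
    data = json.loads(body)
    return data["audio"], data.get("tokens", [])


# Made-up response, used only to illustrate the assumed shape.
sample_response = json.dumps({
    "audio": "https://tts.example.org/audio/abc123.opus",
    "tokens": [
        {"orth": "Hello", "endtime": 0.42},
        {"orth": "world", "endtime": 0.95},
    ],
})

url = build_synthesis_request("Hello world", "en")
audio_url, tokens = parse_synthesis_response(sample_response)
```

In this sketch, the per-token end times are what a client could use to highlight the word currently being read or to skip ahead to a later token.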

Main Wikispeech Server
Repo

Provides an API for various operations, including an endpoint for generating synthesized speech.

Pronlex
Repo

Lexicon server with its own API. Holds information about lexicon entries and allows them to be edited. Used internally to look up the pronunciation of the words to be synthesized.

TTS engines
The server supports multiple TTS engines. The engine used for a request depends on the voice selected for synthesis.
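
One way to picture the voice-to-engine selection is a lookup table keyed by language and voice, as in the Python sketch below. The voice names, the engine identifier and the rule that the first registered voice acts as the default are all hypothetical and chosen only for illustration; the actual server may organize this differently.

```python
# Hypothetical registry: language code -> {voice name -> engine identifier}.
# The voice names are invented for this sketch.
VOICES = {
    "ar": {"example-ar-voice": "marytts"},
    "en": {"example-en-voice": "marytts"},
    "sv": {"example-sv-voice": "marytts"},
}


def engine_for(lang, voice=None):
    """Return (voice, engine) for a language and an optional voice name.

    If no voice is given, the first voice registered for the language
    is used as the default (an assumption of this sketch).
    """
    voices = VOICES.get(lang)
    if not voices:
        raise ValueError(f"no voices registered for language {lang!r}")
    if voice is None:
        voice = next(iter(voices))
    if voice not in voices:
        raise ValueError(f"unknown voice {voice!r} for language {lang!r}")
    return voice, voices[voice]
```

Calling `engine_for("sv")` in this sketch returns the default Swedish voice together with the engine that would synthesize it.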

MaryTTS
Repo

Comes with support for Arabic, English and Swedish.

Mishkal
Repo

Used to vocalize Arabic text, that is, to add the diacritics that mark short vowels.