tts.synthesizer.SpeechSynthesizer Class Reference

Classes

class  BadEngineError
 
class  PollyDirect
 
class  PollyViaNode
 

Public Member Functions

def __init__ (self, engine='POLLY_SERVICE', polly_service_name='polly')
 
def start (self, node_name='synthesizer_node', service_name='synthesizer')
 

Public Attributes

 default_output_format
 
 default_text_type
 
 default_voice_id
 
 engine
 

Static Public Attributes

 ENGINES
 

Private Member Functions

def _call_engine (self, kw)
 
def _node_request_handler (self, request)
 
def _parse_request_or_raise (self, request)
 

Detailed Description

This class implements a ROS service node that serves as the entry point of a TTS task.

Although the current implementation uses Amazon Polly as the synthesis engine, it can be extended to support
other engines while keeping the API the same.

In order to support a variety of engines, the SynthesizerRequest was designed with flexibility in mind. It
has two fields: text and metadata. Both are strings. In most cases, a user can ignore the metadata and call
the service with some plain text. If the use case needs any control or engine-specific feature, the extra
information can be put into the JSON-form metadata. This class will use the information when calling the engine.
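
As a minimal sketch of the two-field request described above, the following helper builds the text and metadata strings. The helper name `build_request_args` is hypothetical and not part of this package; only the `text_type` key is taken from the usage example in this documentation.

```python
import json


def build_request_args(text, **options):
    """Return the (text, metadata) pair of strings for a synthesizer request.

    `options` are engine-specific settings (hypothetical helper, for
    illustration only). An empty metadata string means "use the defaults".
    """
    metadata = json.dumps(options, sort_keys=True) if options else ''
    return text, metadata


# Plain text, no metadata
plain = build_request_args('hello')

# SSML input, with the text type passed through the JSON metadata
ssml = build_request_args('<speak>hello</speak>', text_type='ssml')
```

Both fields stay plain strings, so a client that only needs plain text never has to deal with JSON.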

The decoupling of the synthesizer and the actual synthesis engine will benefit the users in many ways.

First, a user will be able to use a unified interface to do the TTS job and have the freedom to use different
engines available with no or very little change from the client side.

Second, by applying some design patterns, the synthesizer can choose an engine dynamically. For example, a user
may prefer to use Amazon Polly but is also OK with an offline solution when the network is not reliable.

Third, engines can be complicated and thus difficult to use. As an example, Amazon Polly supports dozens of
parameters and is able to accomplish nontrivial synthesis jobs, but the majority of users never need those
features. This class provides a clean interface with only two parameters, so it is much easier and more pleasant
to use. If the advanced features are required, the user can always leverage the metadata field or even go to the
backend engine directly.

Also, from an engineering perspective, simple and decoupled modules are easier to maintain.
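
The second benefit above, choosing an engine dynamically, could be sketched as a simple fallback chain. This is an illustration under stated assumptions, not how this class is implemented: `synthesize_with_fallback`, `online_engine` and `offline_engine` are hypothetical names, and the engines are stand-in callables.

```python
def synthesize_with_fallback(engines, text, metadata=''):
    """Try each (name, engine) pair in order and return the first success."""
    errors = []
    for name, engine in engines:
        try:
            return name, engine(text, metadata)
        except Exception as exc:  # e.g. a network failure in an online engine
            errors.append('%s: %s' % (name, exc))
    raise RuntimeError('all engines failed: ' + '; '.join(errors))


# Hypothetical stand-ins for real engines
def online_engine(text, metadata):
    raise IOError('network unreachable')


def offline_engine(text, metadata):
    return 'offline audio for %d characters' % len(text)


used, result = synthesize_with_fallback(
    [('online', online_engine), ('offline', offline_engine)], 'hello')
```

Because every engine is called with the same two strings, the client code does not change when the fallback engine is used.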

This class supports two modes of using Polly. It can either call a service node or use AmazonPolly as a library.

Start the service node::

    $ rosrun tts synthesizer_node.py  # use default configuration
    $ rosrun tts synthesizer_node.py -e POLLY_LIBRARY  # will not call polly service node

Call the service::

    $ rosservice call /synthesizer 'hello' ''
    $ rosservice call /synthesizer '<speak>hello</speak>' '"{\"text_type\":\"ssml\"}"'

Definition at line 25 of file synthesizer.py.

Constructor & Destructor Documentation

def tts.synthesizer.SpeechSynthesizer.__init__ (   self,
  engine = 'POLLY_SERVICE',
  polly_service_name = 'polly' 
)

Definition at line 94 of file synthesizer.py.

Member Function Documentation

def tts.synthesizer.SpeechSynthesizer._call_engine (   self,
  kw 
)
private
Call the engine to do the job.

If no output path is found from input, the audio file will be put into /tmp and the file name will have
a prefix of the md5 hash of the text.

:param kw: what AmazonPolly needs to synthesize
:return: response from AmazonPolly

Definition at line 106 of file synthesizer.py.
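
The default output location described above can be sketched as follows. This is a sketch only: the helper names and the `output_path` key are assumptions for illustration, and the actual file name in synthesizer.py may carry an additional suffix after the hash prefix.

```python
import hashlib
import os


def default_output_prefix(text, directory='/tmp'):
    """Return a path in `directory` whose file name starts with md5(text)."""
    digest = hashlib.md5(text.encode('utf-8')).hexdigest()
    return os.path.join(directory, digest)


def resolve_output_path(kw, text):
    """Use the caller's path if given, else fall back to the default.

    The key name 'output_path' is hypothetical.
    """
    return kw.get('output_path') or default_output_prefix(text)
```

Deriving the name from a hash of the text means the same text always maps to the same prefix.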

def tts.synthesizer.SpeechSynthesizer._node_request_handler (   self,
  request 
)
private
The callback function for processing a service request.

It never raises. If anything unexpected happens, it will return a SynthesizerResponse with the exception.

:param request: an instance of SynthesizerRequest
:return: a SynthesizerResponse

Definition at line 139 of file synthesizer.py.
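
The "never raises" contract described above could look roughly like the wrapper below. It is a sketch under assumptions: `Response` is a stand-in for SynthesizerResponse, its single `result` field and the JSON error layout are hypothetical.

```python
import json
from collections import namedtuple

# Stand-in for SynthesizerResponse; the field name is an assumption
Response = namedtuple('Response', ['result'])


def make_safe_handler(process):
    """Wrap `process` so the returned handler reports errors instead of raising."""
    def handler(request):
        try:
            return Response(process(request))
        except Exception as exc:
            # Put the exception into the response rather than propagating it
            error = {'error': type(exc).__name__, 'message': str(exc)}
            return Response(json.dumps(error))
    return handler


def always_fails(request):
    raise ValueError('bad request')


safe = make_safe_handler(always_fails)
response = safe('anything')
```

Returning the error in the response keeps the service callback alive and lets the client decide how to react.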

def tts.synthesizer.SpeechSynthesizer._parse_request_or_raise (   self,
  request 
)
private
Parse the request into a dict. It will raise if the request is malformed.

:param request: an instance of SynthesizerRequest
:return: a dict

Definition at line 123 of file synthesizer.py.
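
A minimal sketch of such a parsing step is shown below, assuming the metadata is either empty or a JSON object. `Request` is a stand-in for SynthesizerRequest, and the merged dict layout (including the `text` key) is an assumption rather than the actual format used in synthesizer.py.

```python
import json
from collections import namedtuple

# Stand-in for SynthesizerRequest, which has two string fields
Request = namedtuple('Request', ['text', 'metadata'])


def parse_request_or_raise(request):
    """Merge the text and the JSON metadata into one dict, or raise ValueError."""
    # json.loads raises ValueError (JSONDecodeError) on malformed input
    metadata = json.loads(request.metadata) if request.metadata else {}
    if not isinstance(metadata, dict):
        raise ValueError('metadata must be a JSON object')
    kw = dict(metadata)
    kw['text'] = request.text
    return kw
```

For example, `parse_request_or_raise(Request('hello', ''))` yields a dict containing only the text, while malformed metadata raises before any engine is called.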

def tts.synthesizer.SpeechSynthesizer.start (   self,
  node_name = 'synthesizer_node',
  service_name = 'synthesizer' 
)
The entry point of a ROS service node.

:param node_name: name of ROS node
:param service_name: name of ROS service
:return: it doesn't return

Definition at line 156 of file synthesizer.py.

Member Data Documentation

tts.synthesizer.SpeechSynthesizer.default_output_format

Definition at line 104 of file synthesizer.py.

tts.synthesizer.SpeechSynthesizer.default_text_type

Definition at line 102 of file synthesizer.py.

tts.synthesizer.SpeechSynthesizer.default_voice_id

Definition at line 103 of file synthesizer.py.

tts.synthesizer.SpeechSynthesizer.engine

Definition at line 100 of file synthesizer.py.

tts.synthesizer.SpeechSynthesizer.ENGINES
static

Definition at line 86 of file synthesizer.py.


The documentation for this class was generated from the following file:


tts
Author(s): AWS RoboMaker
autogenerated on Fri Mar 5 2021 03:06:38