tts.amazonpolly.AmazonPolly Class Reference

Public Member Functions

def __init__ (self, aws_access_key_id=None, aws_secret_access_key=None, aws_session_token=None, region_name=None)
 
def start (self, node_name='polly_node', service_name='polly')
 
def synthesize (self, kws)
 

Public Attributes

 default_output_file_basename
 
 default_output_folder
 
 default_output_format
 
 default_text_type
 
 default_voice_id
 
 polly
 

Private Member Functions

def _dispatch (self, request)
 
def _generate_user_agent_suffix (self)
 
def _get_polly_client (self, aws_access_key_id=None, aws_secret_access_key=None, aws_session_token=None, region_name=None, with_service_model_patch=False)
 
def _make_audio_file_fullpath (self, output_path, output_format)
 
def _node_request_handler (self, request)
 
def _pcm2wav (self, audio_data, wav_filename, sample_rate)
 
def _synthesize_speech_and_save (self, request)
 

Detailed Description

A TTS engine that can be used in two different ways.

Usage
-----

1. It can run as a ROS service node.

Start a polly node::

    $ rosrun tts polly_node.py

Call the service from command line::

    $ rosservice call /polly SynthesizeSpeech 'hello polly' '' '' '' '' '' '' '' '' [] [] 0 '' '' '' '' '' '' false

Call the service programmatically::

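    import rospy
    from tts.srv import Polly
    rospy.wait_for_service('polly')
    polly = rospy.ServiceProxy('polly', Polly)
    # Request fields are defined in Polly.srv; these two are enough for a basic call.
    kw = {'polly_action': 'SynthesizeSpeech', 'text': 'hello polly'}
    res = polly(**kw)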

2. It can also be used as a normal Python class::

    from tts.amazonpolly import AmazonPolly
    AmazonPolly().synthesize(text='hi polly')

PollyRequest supports many parameters, but most users can ignore nearly all of them and
use the basic form, which takes a single argument, ``text``.

When more control is needed, SSML comes in handy. Example::

    AmazonPolly().synthesize(
        text='<speak>Mary has a <amazon:effect name="whispered">little lamb.</amazon:effect></speak>',
        text_type='ssml'
    )

A user can also control the voice, output format and so on. Example::

    AmazonPolly().synthesize(
        text='<speak>Mary has a <amazon:effect name="whispered">little lamb.</amazon:effect></speak>',
        text_type='ssml',
        voice_id='Joey',
        output_format='mp3',
        output_path='/tmp/blah'
    )


Parameters
----------

Among the parameters defined in Polly.srv, the following are supported; the others are reserved for future use.

* polly_action : currently only ``SynthesizeSpeech`` is supported
* text : the text to speak
* text_type : can be either ``text`` (default) or ``ssml``
* voice_id : any voice id supported by Amazon Polly, default is Joanna
* output_format : ogg (default), mp3 or pcm
* output_path : where the audio file is saved
* sample_rate : default is 16000 for pcm or 22050 for mp3 and ogg
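
The sample-rate defaults in the list above can be written as a small lookup. This is only a sketch of the documented rule: the helper name ``default_sample_rate`` and the table ``DEFAULT_SAMPLE_RATES`` are hypothetical and not part of the tts package.

```python
# Documented defaults: 16000 Hz for pcm, 22050 Hz for mp3 and ogg.
DEFAULT_SAMPLE_RATES = {'pcm': 16000, 'mp3': 22050, 'ogg': 22050}


def default_sample_rate(output_format):
    """Return the documented default sample rate (Hz) for an output format.

    Hypothetical helper for illustration only.
    """
    key = output_format.lower()
    if key not in DEFAULT_SAMPLE_RATES:
        raise ValueError('unknown output format: %r' % output_format)
    return DEFAULT_SAMPLE_RATES[key]
```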

The following are the reserved ones. Note that ``language_code`` is rarely needed (this may seem counter-intuitive).
See official Amazon Polly documentation for details (link can be found below).

* language_code
* lexicon_content
* lexicon_name
* lexicon_names
* speech_mark_types
* max_results
* next_token
* sns_topic_arn
* task_id
* task_status
* output_s3_bucket_name
* output_s3_key_prefix
* include_additional_language_codes


Links
-----

Amazon Polly documentation: https://docs.aws.amazon.com/polly/latest/dg/API_SynthesizeSpeech.html

Definition at line 96 of file amazonpolly.py.

Constructor & Destructor Documentation

def tts.amazonpolly.AmazonPolly.__init__ (   self,
  aws_access_key_id = None,
  aws_secret_access_key = None,
  aws_session_token = None,
  region_name = None 
)

Definition at line 182 of file amazonpolly.py.

Member Function Documentation

def tts.amazonpolly.AmazonPolly._dispatch (   self,
  request 
)
private
Amazon Polly supports a number of APIs. This method calls the right one based on the content of the request.

Currently "SynthesizeSpeech" is the only recognized action. The method delegates the work
to ``self._synthesize_speech_and_save`` and returns the result unchanged. It raises an exception if a different
action is passed in.

:param request: an instance of PollyRequest
:return: whatever the delegate returns

Definition at line 321 of file amazonpolly.py.
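
The delegation described for ``_dispatch`` follows a common pattern: look up a handler by action name and raise when the name is unknown. The sketch below illustrates that pattern only. ``Request``, ``synthesize_speech``, ``ACTIONS`` and ``dispatch`` are hypothetical stand-ins, and the exception type is an assumption, since the documentation does not say which exception is raised.

```python
from collections import namedtuple

# Hypothetical stand-in for PollyRequest, carrying only the fields used here.
Request = namedtuple('Request', ['polly_action', 'text'])


def synthesize_speech(request):
    # Placeholder handler; a real one would call Amazon Polly.
    return 'synthesized: ' + request.text


# Map each supported action name to its handler.
ACTIONS = {'SynthesizeSpeech': synthesize_speech}


def dispatch(request):
    """Call the handler for request.polly_action, or raise if it is unknown."""
    handler = ACTIONS.get(request.polly_action)
    if handler is None:
        # Assumed exception type, chosen for illustration.
        raise RuntimeError('unsupported action: %r' % (request.polly_action,))
    return handler(request)
```

Adding a second action under this scheme would mean adding one entry to ``ACTIONS``, leaving ``dispatch`` unchanged.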

def tts.amazonpolly.AmazonPolly._generate_user_agent_suffix (   self)
private

Definition at line 228 of file amazonpolly.py.

def tts.amazonpolly.AmazonPolly._get_polly_client (   self,
  aws_access_key_id = None,
  aws_secret_access_key = None,
  aws_session_token = None,
  region_name = None,
  with_service_model_patch = False 
)
private
Note that a new botocore session is created each time this function is called.
This avoids potential problems caused by the internal state of the session.

Definition at line 194 of file amazonpolly.py.

def tts.amazonpolly.AmazonPolly._make_audio_file_fullpath (   self,
  output_path,
  output_format 
)
private
Makes a full path for audio file based on given output path and format.

If ``output_path`` does not include a directory, the current directory is used.

:param output_path: the output path received
:param output_format: the audio format, e.g., mp3, ogg_vorbis, pcm
:return: a full path for the output audio file. File ext will be constructed from audio format.

Definition at line 247 of file amazonpolly.py.
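
One way to build such a path with ``os.path`` is sketched below. It is not the package's implementation. The extension table, the ``.wav`` extension for pcm (assumed because the class has a ``_pcm2wav`` helper), the ``default_basename`` value and the choice to replace any extension already present in ``output_path`` are all assumptions made for illustration.

```python
import os

# Assumed mapping from output format to file extension.
EXTENSIONS = {'ogg_vorbis': '.ogg', 'mp3': '.mp3', 'pcm': '.wav'}


def make_audio_file_fullpath(output_path, output_format, default_basename='output'):
    """Return a full file path whose extension is derived from output_format."""
    head, tail = os.path.split(output_path)
    if not head:
        # No directory given: fall back to the current directory.
        head = os.getcwd()
    if not tail:
        # No file name given: fall back to a default base name (assumed).
        tail = default_basename
    # Drop any extension already present, then add the one for the format.
    root, _ = os.path.splitext(tail)
    fmt = output_format.lower()
    ext = EXTENSIONS.get(fmt, '.' + fmt)
    return os.path.join(head, root + ext)
```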

def tts.amazonpolly.AmazonPolly._node_request_handler (   self,
  request 
)
private
The callback function for processing service request.

It never raises. If anything unexpected happens, it will return a PollyResponse with details of the exception.

:param request: an instance of PollyRequest
:return: a PollyResponse

Definition at line 341 of file amazonpolly.py.
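
The "never raises" contract of ``_node_request_handler`` can be illustrated with a wrapper that converts any exception into a report. This is a sketch under assumptions: ``safe_handler`` is a hypothetical name, it returns a JSON string instead of a PollyResponse so that it runs without ROS, and the report keys are invented for the example.

```python
import json
import traceback


def safe_handler(handler, request):
    """Run handler(request); on any exception return a JSON error report."""
    try:
        return handler(request)
    except Exception as e:
        # Report keys below are illustrative, not the package's format.
        return json.dumps({
            'Exception': {
                'Type': type(e).__name__,
                'Message': str(e),
            },
            'Traceback': traceback.format_exc(),
        })
```

Catching ``Exception`` at the service boundary keeps one bad request from terminating the callback, and the caller can still see what went wrong.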

def tts.amazonpolly.AmazonPolly._pcm2wav (   self,
  audio_data,
  wav_filename,
  sample_rate 
)
private
Per the official Amazon Polly documentation, PCM audio is in a signed 16-bit, 1-channel (mono), little-endian format.

Definition at line 238 of file amazonpolly.py.
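
Because 16-bit WAV files also store signed little-endian samples, raw PCM in the format described for ``_pcm2wav`` can be wrapped in a WAV container without converting the samples. The sketch below uses Python's standard ``wave`` module. The function name ``pcm_to_wav`` is hypothetical, and this is not necessarily how ``_pcm2wav`` is implemented.

```python
import wave


def pcm_to_wav(audio_data, wav_filename, sample_rate):
    """Write raw signed 16-bit mono PCM bytes to a WAV file."""
    wav_file = wave.open(wav_filename, 'wb')
    try:
        wav_file.setnchannels(1)                 # mono
        wav_file.setsampwidth(2)                 # 16 bits = 2 bytes per sample
        wav_file.setframerate(int(sample_rate))  # e.g. 16000 for pcm
        wav_file.writeframes(audio_data)         # header sizes are filled in on close
    finally:
        wav_file.close()
```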

def tts.amazonpolly.AmazonPolly._synthesize_speech_and_save (   self,
  request 
)
private
Calls Amazon Polly and writes the returned audio data to a local file.

The result is a JSON-formatted string with three items: the audio file path, the audio type and the
Amazon Polly response metadata.

If the Amazon Polly call fails, audio file name will be an empty string and audio type will be "N/A".

Please see https://boto3.readthedocs.io/reference/services/polly.html#Polly.Client.synthesize_speech
for more details on Amazon Polly API.

:param request: an instance of PollyRequest
:return: a string in JSON form containing the audio file path, the audio type and the Amazon Polly response metadata.

Definition at line 268 of file amazonpolly.py.
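
A caller can decode the JSON result of ``_synthesize_speech_and_save`` and check for the failure markers described above. The sketch below is illustrative only: ``audio_file_from_result`` is a hypothetical helper, and the key names ``"Audio File"`` and ``"Audio Type"`` are assumptions that should be checked against the actual output.

```python
import json


def audio_file_from_result(result_json):
    """Return the audio file path from a result string, or None on failure."""
    result = json.loads(result_json)
    # Key names are assumed for illustration.
    audio_file = result.get('Audio File', '')
    audio_type = result.get('Audio Type', 'N/A')
    # A failed call is documented to give an empty file name and type "N/A".
    if not audio_file or audio_type == 'N/A':
        return None
    return audio_file
```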

def tts.amazonpolly.AmazonPolly.start (   self,
  node_name = 'polly_node',
  service_name = 'polly' 
)
The entry point of a ROS service node.

Details of the service API can be found in Polly.srv.

:param node_name: name of ROS node
:param service_name: name of ROS service
:return: this method does not return

Definition at line 388 of file amazonpolly.py.

def tts.amazonpolly.AmazonPolly.synthesize (   self,
  kws 
)
Call this method to use Amazon Polly without starting a ROS node.

:param kws: keyword arguments as defined in Polly.srv, e.g. ``text='hi polly'``
:return: a string in JSON form with detailed information about the success or failure of the call

Definition at line 379 of file amazonpolly.py.

Member Data Documentation

tts.amazonpolly.AmazonPolly.default_output_file_basename

Definition at line 191 of file amazonpolly.py.

tts.amazonpolly.AmazonPolly.default_output_folder

Definition at line 190 of file amazonpolly.py.

tts.amazonpolly.AmazonPolly.default_output_format

Definition at line 189 of file amazonpolly.py.

tts.amazonpolly.AmazonPolly.default_text_type

Definition at line 187 of file amazonpolly.py.

tts.amazonpolly.AmazonPolly.default_voice_id

Definition at line 188 of file amazonpolly.py.

tts.amazonpolly.AmazonPolly.polly

Definition at line 186 of file amazonpolly.py.


The documentation for this class was generated from the following file:


tts
Author(s): AWS RoboMaker
autogenerated on Fri Mar 5 2021 03:06:38