def _dispatch(self, request)
def _generate_user_agent_suffix(self)
def _get_polly_client(self, aws_access_key_id=None, aws_secret_access_key=None, aws_session_token=None, region_name=None, with_service_model_patch=False)
def _make_audio_file_fullpath(self, output_path, output_format)
def _node_request_handler(self, request)
def _pcm2wav(self, audio_data, wav_filename, sample_rate)
def _synthesize_speech_and_save(self, request)
A TTS engine that can be used in two different ways.
Usage
-----
1. It can run as a ROS service node.

   Start a polly node::

       $ rosrun tts polly_node.py

   Call the service from command line::

       $ rosservice call /polly SynthesizeSpeech 'hello polly' '' '' '' '' '' '' '' '' [] [] 0 '' '' '' '' '' '' false

   Call the service programmatically::

       import rospy
       from tts.srv import Polly
       rospy.wait_for_service('polly')
       polly = rospy.ServiceProxy('polly', Polly)
       # kw is a dict of PollyRequest fields, e.g. {'polly_action': 'SynthesizeSpeech', 'text': 'hello polly'}
       res = polly(**kw)

2. It can also be used as a normal python class::

       AmazonPolly().synthesize(text='hi polly')
PollyRequest supports many parameters, but most users can safely ignore nearly all of them and
use the basic form, which takes a single argument, ``text``.

When more control is needed, SSML comes in handy. Example::

    AmazonPolly().synthesize(
        text='<speak>Mary has a <amazon:effect name="whispered">little lamb.</amazon:effect></speak>',
        text_type='ssml'
    )

A user can also control the voice, output format and so on. Example::

    AmazonPolly().synthesize(
        text='<speak>Mary has a <amazon:effect name="whispered">little lamb.</amazon:effect></speak>',
        text_type='ssml',
        voice_id='Joey',
        output_format='mp3',
        output_path='/tmp/blah'
    )
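When the spoken text comes from user input, the SSML string is easier to get right if it is assembled by a small helper that escapes XML special characters. The following is a minimal sketch of that idea. The function name ``whisper_ssml`` is hypothetical and not part of this package; the tag layout simply mirrors the example above.

```python
from xml.sax.saxutils import escape


def whisper_ssml(plain_part, whispered_part):
    """Build an SSML string whose second part uses the whispered effect.

    Hypothetical helper: both parts are XML-escaped so characters such as
    '&' or '<' in the input cannot break the markup.
    """
    return (
        '<speak>{}<amazon:effect name="whispered">{}</amazon:effect></speak>'
        .format(escape(plain_part), escape(whispered_part))
    )


ssml = whisper_ssml('Mary has a ', 'little lamb.')
```

The resulting string can then be passed as ``text`` together with ``text_type='ssml'``.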
Parameters
----------
Among the parameters defined in Polly.srv, the following are supported; the others are reserved for future use.
* polly_action : currently only ``SynthesizeSpeech`` is supported
* text : the text to speak
* text_type : can be either ``text`` (default) or ``ssml``
* voice_id : any voice id supported by Amazon Polly, default is Joanna
* output_format : ogg (default), mp3 or pcm
* output_path : where the audio file is saved
* sample_rate : default is 16000 for pcm or 22050 for mp3 and ogg
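The documented defaults can be collected into a small helper that fills in whatever the caller leaves out. This is only a sketch under stated assumptions: ``default_sample_rate`` and ``build_request_kwargs`` are hypothetical names, and the sample rate is represented as an integer here although the field type in Polly.srv is not stated in this page.

```python
SUPPORTED_FORMATS = ('ogg', 'mp3', 'pcm')


def default_sample_rate(output_format):
    """Return the documented default sample rate for an output format."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError('unsupported output_format: %r' % (output_format,))
    # 16000 for pcm, 22050 for mp3 and ogg, as listed above.
    return 16000 if output_format == 'pcm' else 22050


def build_request_kwargs(text, **overrides):
    """Combine the documented defaults with caller-supplied overrides."""
    kwargs = {
        'polly_action': 'SynthesizeSpeech',
        'text': text,
        'text_type': 'text',
        'voice_id': 'Joanna',
        'output_format': 'ogg',
    }
    kwargs.update(overrides)
    # Pick the sample rate only after the final output format is known.
    kwargs.setdefault('sample_rate', default_sample_rate(kwargs['output_format']))
    return kwargs


kw = build_request_kwargs('hi polly', output_format='pcm')
```

A dictionary built this way could serve as the ``kw`` argument when calling the service programmatically.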
The following are the reserved ones. Note that ``language_code`` is rarely needed (this may seem counter-intuitive).
See official Amazon Polly documentation for details (link can be found below).
* language_code
* lexicon_content
* lexicon_name
* lexicon_names
* speech_mark_types
* max_results
* next_token
* sns_topic_arn
* task_id
* task_status
* output_s3_bucket_name
* output_s3_key_prefix
* include_additional_language_codes
Links
-----
Amazon Polly documentation: https://docs.aws.amazon.com/polly/latest/dg/API_SynthesizeSpeech.html
Definition at line 96 of file amazonpolly.py.
def tts.amazonpolly.AmazonPolly._make_audio_file_fullpath(self, output_path, output_format)
private
Makes a full path for the audio file based on the given output path and format.
If ``output_path`` has no directory component, the current directory is used.
:param output_path: the output path received
:param output_format: the audio format, e.g., mp3, ogg_vorbis, pcm
:return: a full path for the output audio file. The file extension is derived from the audio format.
Definition at line 247 of file amazonpolly.py.
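The path handling described above can be illustrated with a standalone sketch. It is not the implementation in amazonpolly.py: the function is written as a free function, the format-to-extension mapping is an assumption (in particular, mapping pcm to ``.wav`` is a guess based on the existence of ``_pcm2wav``), and appending the extension only when it is missing is a design choice made for this example.

```python
import os

# Assumed mapping from audio format to file extension (not taken from the source).
_EXTENSIONS = {'mp3': '.mp3', 'ogg_vorbis': '.ogg', 'pcm': '.wav'}


def make_audio_file_fullpath(output_path, output_format):
    """Return an absolute file path whose extension matches the audio format."""
    ext = _EXTENSIONS[output_format]
    directory, filename = os.path.split(output_path)
    if not directory:
        # No directory given: fall back to the current working directory.
        directory = os.getcwd()
    if not filename.endswith(ext):
        filename += ext
    return os.path.join(os.path.abspath(directory), filename)


full_path = make_audio_file_fullpath('blah', 'ogg_vorbis')
```

With these assumptions, a bare name such as ``blah`` resolves to ``blah.ogg`` in the current directory, while a path that already carries a directory keeps it.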
def tts.amazonpolly.AmazonPolly._synthesize_speech_and_save(self, request)
private
Calls Amazon Polly and writes the returned audio data to a local file.
For convenience, three things are returned as a JSON string: the audio file path,
the audio type and the Amazon Polly response metadata.
If the Amazon Polly call fails, the audio file name is an empty string and the audio type is "N/A".
See https://boto3.readthedocs.io/reference/services/polly.html#Polly.Client.synthesize_speech
for more details on the Amazon Polly API.
:param request: an instance of PollyRequest
:return: a string in JSON form with two attributes, "Audio File" and "Amazon Polly Response".
Definition at line 268 of file amazonpolly.py.
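A caller that receives the JSON string can decode it and treat an empty audio file name as a failed call. The sketch below assumes only the ``"Audio File"`` and ``"Amazon Polly Response"`` attribute names given above; ``parse_synthesis_result`` is a hypothetical helper, and the sample strings are made up for illustration rather than captured from a real call.

```python
import json


def parse_synthesis_result(result_json):
    """Decode the JSON result and report whether an audio file was produced."""
    result = json.loads(result_json)
    audio_file = result.get('Audio File', '')
    return {
        # An empty file name signals that the Amazon Polly call failed.
        'ok': bool(audio_file),
        'audio_file': audio_file,
        'response': result.get('Amazon Polly Response'),
    }


# Made-up sample payloads, for illustration only.
success = parse_synthesis_result(
    json.dumps({'Audio File': '/tmp/blah.mp3', 'Amazon Polly Response': {}}))
failure = parse_synthesis_result(
    json.dumps({'Audio File': '', 'Amazon Polly Response': {}}))
```

Checking ``ok`` before trying to play or open the file avoids passing an empty path further down the pipeline.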