Classes | |
class | _ReplaceableNumber |
Functions | |
def | _convert_words_to_numbers (text, short_scale=True, ordinals=False) |
def | _extract_decimal_with_text_en (tokens, short_scale, ordinals) |
def | _extract_fraction_with_text_en (tokens, short_scale, ordinals) |
def | _extract_number_with_text_en (tokens, short_scale=True, ordinals=False, fractional_numbers=True) |
def | _extract_number_with_text_en_helper (tokens, short_scale=True, ordinals=False, fractional_numbers=True) |
def | _extract_numbers_with_text (tokens, short_scale=True, ordinals=False, fractional_numbers=True) |
def | _extract_whole_number_with_text_en (tokens, short_scale, ordinals) |
def | _generate_plurals (originals) |
def | _initialize_number_data (short_scale) |
def | _invert_dict (original) |
def | _partition_list (items, split_on) |
def | _tokenize (text) |
def | extract_datetime_en (string, dateNow, default_time) |
def | extract_duration_en (text) |
def | extract_numbers_en (text, short_scale=True, ordinals=False) |
def | extractnumber_en (text, short_scale=True, ordinals=False) |
def | isFractional_en (input_str, short_scale=True) |
def | normalize_en (text, remove_articles) |
Variables | |
set | _DECIMAL_MARKER = {"point", "dot"} |
set | _FRACTION_MARKER = {"and"} |
_MULTIPLIES_LONG_SCALE_EN = set(_LONG_SCALE_EN.values())|\ | |
_MULTIPLIES_SHORT_SCALE_EN = set(_SHORT_SCALE_EN.values())|\ | |
set | _NEGATIVES = {"negative", "minus"} |
_STRING_LONG_ORDINAL_EN = _invert_dict(_LONG_ORDINAL_STRING_EN) | |
_STRING_NUM_EN = _invert_dict(_NUM_STRING_EN) | |
_STRING_SHORT_ORDINAL_EN = _invert_dict(_SHORT_ORDINAL_STRING_EN) | |
dictionary | _SUMS |
_Token = namedtuple('_Token', 'word index') | |
def mycroft.util.lang.parse_en._convert_words_to_numbers (text, short_scale=True, ordinals=False) [private]
Convert words in a string into their equivalent numbers.
Args:
    text (str)
    short_scale (boolean): True if short scale numbers should be used.
    ordinals (boolean): True if ordinals (e.g. first, second, third) should be parsed to their number values (1, 2, 3...)
Returns:
    str: The original text, with numbers subbed in where appropriate.
Definition at line 189 of file parse_en.py.
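The word-to-number substitution described above can be illustrated with a minimal sketch. It is not the module's implementation: `convert_words_to_numbers`, `CARDINALS`, and `ORDINALS` are hypothetical names, the lookup tables are deliberately tiny, and each word is handled in isolation (no multi-word numbers, no scale handling).

```python
# Illustrative lookup tables (assumption: the real module uses far larger ones).
CARDINALS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
ORDINALS = {"first": 1, "second": 2, "third": 3}


def convert_words_to_numbers(text, ordinals=False):
    """Replace each known number word in `text` with its digits."""
    table = dict(CARDINALS)
    if ordinals:
        # Ordinal words are only substituted when explicitly requested.
        table.update(ORDINALS)
    # Unknown words pass through unchanged.
    return " ".join(str(table.get(word, word)) for word in text.split())


print(convert_words_to_numbers("set a timer for two hours"))
print(convert_words_to_numbers("play the third song", ordinals=True))
```

With `ordinals` left at False, "third" in the second call would be returned untouched.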
def mycroft.util.lang.parse_en._extract_decimal_with_text_en (tokens, short_scale, ordinals) [private]
Extract decimal numbers from a string.
This function handles text such as '2 point 5'.
Notes:
    While this is a helper for extractnumber_en, it also depends on extractnumber_en to parse out the components of the decimal.
    This does not currently handle things like: number dot number number number
Args:
    tokens ([_Token]): The text to parse.
    short_scale (boolean)
    ordinals (boolean)
Returns:
    (float, [_Token]): The value found and relevant tokens. (None, None) if no decimal value is found.
Definition at line 366 of file parse_en.py.
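As a rough illustration of the "whole, marker, fraction" pattern described above, the sketch below looks for a decimal marker with one number on each side. It is an assumption-laden stand-in rather than the library's code: `extract_decimal`, `to_int`, and `DIGIT_WORDS` are hypothetical, plain words are used in place of _Token objects, only one token on each side of the marker is considered, and leading zeros in the fractional part are not preserved.

```python
DECIMAL_MARKERS = {"point", "dot"}  # same words as _DECIMAL_MARKER
DIGIT_WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
               "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}


def to_int(word):
    """Return the integer for a digit string or digit word, else None."""
    if word.isdecimal():
        return int(word)
    return DIGIT_WORDS.get(word)


def extract_decimal(words):
    """Return (value, matched_words), or (None, None) if nothing matches."""
    for i, word in enumerate(words):
        # A marker needs a neighbour on both sides to form a decimal.
        if word in DECIMAL_MARKERS and 0 < i < len(words) - 1:
            whole, frac = to_int(words[i - 1]), to_int(words[i + 1])
            if whole is not None and frac is not None:
                # Join as "<whole>.<frac>" so "2 point 5" becomes 2.5.
                return float("{}.{}".format(whole, frac)), words[i - 1:i + 2]
    return None, None


print(extract_decimal("2 point 5".split()))
```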
def mycroft.util.lang.parse_en._extract_fraction_with_text_en (tokens, short_scale, ordinals) [private]
Extract fraction numbers from a string.
This function handles text such as '2 and 3/4'. Note that "one half" or similar will be parsed by the whole number function.
Args:
    tokens ([_Token]): words and their indexes in the original string.
    short_scale (boolean)
    ordinals (boolean)
Returns:
    (int or float, [_Token]): The value found, and the list of relevant tokens. (None, None) if no fraction value is found.
Definition at line 324 of file parse_en.py.
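The '2 and 3/4' case can be sketched as follows. This is only an illustration under stated assumptions: `extract_fraction` is a hypothetical name, it works on plain words instead of _Token objects, it accepts only digit strings and "a/b" fractions, and it relies on the standard library's `fractions.Fraction` to parse the fractional part.

```python
from fractions import Fraction

FRACTION_MARKERS = {"and"}  # same word as _FRACTION_MARKER


def extract_fraction(words):
    """Return (value, matched_words), or (None, None) if nothing matches."""
    for i, word in enumerate(words):
        if word in FRACTION_MARKERS and 0 < i < len(words) - 1:
            try:
                whole = int(words[i - 1])
                part = Fraction(words[i + 1])
            except (ValueError, ZeroDivisionError):
                # Neighbours are not numeric, e.g. "salt and pepper".
                continue
            # Only a proper fraction counts as the fractional part.
            if 0 < part < 1:
                return whole + float(part), words[i - 1:i + 2]
    return None, None


print(extract_fraction("2 and 3/4".split()))
```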
def mycroft.util.lang.parse_en._extract_number_with_text_en (tokens, short_scale=True, ordinals=False, fractional_numbers=True) [private]
This function extracts a number from a list of _Tokens.
Args:
    tokens ([_Token]): the tokens to parse
    short_scale (bool): use short scale if True, long scale if False
    ordinals (bool): consider ordinal numbers, third=3 instead of 1/3
    fractional_numbers (bool): True if we should look for fractions and decimals.
Returns:
    _ReplaceableNumber
Definition at line 268 of file parse_en.py.
def mycroft.util.lang.parse_en._extract_number_with_text_en_helper (tokens, short_scale=True, ordinals=False, fractional_numbers=True) [private]
Helper for _extract_number_with_text_en.
This contains the real logic for parsing, but produces a result that needs a little cleaning (specifically, it may contain leading articles that can be trimmed off).
Args:
    tokens ([_Token])
    short_scale (boolean)
    ordinals (boolean)
    fractional_numbers (boolean)
Returns:
    int or float, [_Token]
Definition at line 292 of file parse_en.py.
def mycroft.util.lang.parse_en._extract_numbers_with_text (tokens, short_scale=True, ordinals=False, fractional_numbers=True) [private]
Extract all numbers from a list of _Tokens, with the words that represent them.
Args:
    tokens ([_Token]): The tokens to parse.
    short_scale (bool): True if short scale numbers should be used, False for long scale. True by default.
    ordinals (bool): True if ordinal words (first, second, third, etc.) should be parsed.
    fractional_numbers (bool): True if we should look for fractions and decimals.
Returns:
    [_ReplaceableNumber]: A list of tuples, each containing a number and a string.
Definition at line 226 of file parse_en.py.
def mycroft.util.lang.parse_en._extract_whole_number_with_text_en (tokens, short_scale, ordinals) [private]
Handle numbers not handled by the decimal or fraction functions. This is generally whole numbers.
Note that phrases such as "one half" will be handled by this function, while "one and a half" is handled by the fraction function.
Args:
    tokens ([_Token])
    short_scale (boolean)
    ordinals (boolean)
Returns:
    int or float, [_Token]: The value parsed, and the tokens that it corresponds to.
Definition at line 414 of file parse_en.py.
def mycroft.util.lang.parse_en._generate_plurals (originals) [private]
Return a new set or dict containing the original values, all with 's' appended to them.
Args:
    originals (set(str) or dict(str, any)): values to pluralize
Returns:
    set(str) or dict(str, any)
Definition at line 45 of file parse_en.py.
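A minimal sketch of this pluralizing helper, assuming that for a dict the 's' is appended to the keys while the values are kept as they are. `generate_plurals` is an illustrative name, not the module's function.

```python
def generate_plurals(originals):
    """Return a new set or dict whose strings all have 's' appended."""
    if isinstance(originals, dict):
        # Assumption: keys are the words, values are carried over unchanged.
        return {key + "s": value for key, value in originals.items()}
    return {word + "s" for word in originals}


print(generate_plurals({"hundred": 100, "thousand": 1000}))
```

The input is left untouched, so a caller that wants both singular and plural forms would need to merge the result with the original collection.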
def mycroft.util.lang.parse_en._initialize_number_data (short_scale) [private]
Generate dictionaries of words to numbers, based on scale.
This is a helper function for _extract_whole_number_with_text_en.
Args:
    short_scale (boolean)
Returns:
    (set(str), dict(str, number), dict(str, number)): multiplies, string_num_ordinal, string_num_scale
Definition at line 565 of file parse_en.py.
def mycroft.util.lang.parse_en._invert_dict (original) [private]
Produce a dictionary with the keys and values inverted, relative to the dict passed in.
Args:
    original (dict): The dict-like object to invert
Returns:
    dict
Definition at line 30 of file parse_en.py.
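The inversion can be sketched with a dict comprehension. This is an illustrative stand-in (`invert_dict` is a hypothetical name) and it assumes the values are hashable and unique; if two keys share a value, the later key wins.

```python
def invert_dict(original):
    """Return a new dict mapping each value of `original` back to its key."""
    return {value: key for key, value in original.items()}


num_to_word = {1: "one", 2: "two", 3: "three"}
word_to_num = invert_dict(num_to_word)
print(word_to_num)
```

This mirrors how the variables listed above derive string-to-number tables such as _STRING_NUM_EN from their number-to-string counterparts.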
def mycroft.util.lang.parse_en._partition_list (items, split_on) [private]
Partition a list of items.
Works similarly to str.partition.
Args:
    items
    split_on (callable): Should return a boolean. Each item will be passed to this callable in succession, and partitions will be created any time it returns True.
Returns:
    [[any]]
Definition at line 159 of file parse_en.py.
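One way to read the description above is sketched below. Two details are assumptions rather than documented behaviour: each item that triggers a split is kept as its own single-item partition (by analogy with str.partition, which keeps the separator), and empty partitions are dropped. `partition_list` is an illustrative name.

```python
def partition_list(items, split_on):
    """Split `items` into sub-lists wherever `split_on(item)` is True."""
    partitions = []
    current = []
    for item in items:
        if split_on(item):
            # Close the running partition and keep the separator on its own.
            partitions.append(current)
            partitions.append([item])
            current = []
        else:
            current.append(item)
    partitions.append(current)
    # Drop empty partitions produced by leading or adjacent separators.
    return [part for part in partitions if part]


words = ["two", "and", "three", "quarters"]
print(partition_list(words, lambda word: word == "and"))
```

Under these assumptions, a result of exactly three partitions signals a single separator with content on both sides, which is a convenient check for patterns such as "number marker number".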
def mycroft.util.lang.parse_en._tokenize (text) [private]
Generate a list of token objects, given a string.
Args:
    text (str): Text to tokenize.
Returns:
    [_Token]
Definition at line 146 of file parse_en.py.
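A minimal sketch of tokenizing into (word, index) pairs, reusing the namedtuple shape listed under Variables (`namedtuple('_Token', 'word index')`). Splitting on whitespace is an assumption about how words are delimited, and `Token` and `tokenize` are illustrative names.

```python
from collections import namedtuple

# Same fields as the module's _Token: the word and its position in the text.
Token = namedtuple("Token", "word index")


def tokenize(text):
    """Return one Token per whitespace-separated word in `text`."""
    return [Token(word, index) for index, word in enumerate(text.split())]


tokens = tokenize("two point five")
print(tokens)
```

Keeping the index alongside each word lets later stages report which positions of the original text a parsed number covered.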
def mycroft.util.lang.parse_en.extract_datetime_en (string, dateNow, default_time)
Convert a human date reference into an exact datetime.
Convert things like "today", "tomorrow afternoon", "next Tuesday at 4pm", or "August 3rd" into a datetime. If a reference date is not provided, the current local time is used. Also consumes the words used to define the date, returning the remaining string. For example, the string "what is Tuesday's weather forecast" returns the date for the forthcoming Tuesday relative to the reference date and the remainder string "what is weather forecast".
Args:
    string (str): string containing date words
    dateNow (datetime): A reference date/time for "tomorrow", etc.
    default_time (time): Time to set if no time was found in the string
Returns:
    [datetime, str]: An array containing the datetime and the remaining text not consumed in the parsing, or None if no date or time related text was found.
Definition at line 667 of file parse_en.py.
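The consume-and-return-remainder behaviour described above can be illustrated with a deliberately tiny sketch. It is not the library's parser: `extract_datetime` and `DAY_OFFSETS` are hypothetical, only three relative day words are recognized, and every match is given `default_time` because no time-of-day parsing is attempted.

```python
from datetime import datetime, time, timedelta

# Assumption: a tiny vocabulary of relative day words, for illustration only.
DAY_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def extract_datetime(string, date_now, default_time):
    """Return [datetime, remainder], or None if no date word is found."""
    words = string.split()
    for i, word in enumerate(words):
        offset = DAY_OFFSETS.get(word.lower())
        if offset is not None:
            day = date_now.date() + timedelta(days=offset)
            # The date word is consumed; the rest of the text is returned.
            remainder = " ".join(words[:i] + words[i + 1:])
            return [datetime.combine(day, default_time), remainder]
    return None


now = datetime(2020, 1, 31, 9, 30)
print(extract_datetime("what is the weather tomorrow", now, time(0, 0)))
```

Because the "nothing found" case is None rather than a two-element list, a caller should check for None before unpacking the result.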
def mycroft.util.lang.parse_en.extract_duration_en (text)
Convert an English phrase into a duration.
Convert things like:
    "10 minute"
    "2 and a half hours"
    "3 days 8 hours 10 minutes and 49 seconds"
into a timedelta representing the total duration.
The words used in the duration will be consumed, and the remainder returned. As an example, "set a timer for 5 minutes" would return (timedelta(seconds=300), "set a timer for").
Args:
    text (str): string containing a duration
Returns:
    (timedelta, str): A tuple containing the duration and the remaining text not consumed in the parsing. The first value will be None if no duration is found. The text returned will have whitespace stripped from the ends.
Definition at line 612 of file parse_en.py.
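A minimal sketch of duration extraction that returns the (timedelta, str) tuple named in the Returns entry. It is an illustration under assumptions, not the library's code: `extract_duration` and `UNITS` are hypothetical, amounts must already be written as digits (so "2 and a half hours" is out of scope here), and only four units are recognized.

```python
import re
from datetime import timedelta

# Unit names double as timedelta keyword arguments.
UNITS = ("days", "hours", "minutes", "seconds")


def extract_duration(text):
    """Return (timedelta or None, remaining text) for digit-based phrases."""
    amounts = {}
    for unit in UNITS:
        # Match e.g. "10 minute" or "10 minutes".
        pattern = r"\b(\d+(?:\.\d+)?)\s+" + unit[:-1] + r"s?\b"
        match = re.search(pattern, text)
        if match:
            amounts[unit] = float(match.group(1))
            # Consume the matched words from the text.
            text = text[:match.start()] + text[match.end():]
    duration = timedelta(**amounts) if amounts else None
    # Collapse leftover whitespace and strip the ends.
    return duration, " ".join(text.split())


print(extract_duration("set a timer for 5 minutes"))
```

Connecting words such as "and" are not consumed by this sketch, so they remain in the returned text.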
def mycroft.util.lang.parse_en.extract_numbers_en (text, short_scale=True, ordinals=False)
Takes in a string and extracts a list of numbers.
Args:
    text (str): the string to extract a number from
    short_scale (bool): Use "short scale" or "long scale" for large numbers (over a million). The default is short scale, which is now common in most English-speaking countries. See https://en.wikipedia.org/wiki/Names_of_large_numbers
    ordinals (bool): consider ordinal numbers, e.g. third=3 instead of 1/3
Returns:
    list: list of extracted numbers as floats
Definition at line 1476 of file parse_en.py.
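The effect of the short_scale flag can be shown with a small lookup sketch. The values are the standard short-scale and long-scale meanings of these words; the table and function names (`SHORT_SCALE`, `LONG_SCALE`, `scale_value`) are illustrative and do not come from the module.

```python
# Standard values of a few scale words under each naming system.
SHORT_SCALE = {"million": 10**6, "billion": 10**9, "trillion": 10**12}
LONG_SCALE = {"million": 10**6, "billion": 10**12, "trillion": 10**18}


def scale_value(word, short_scale=True):
    """Return the multiplier a scale word stands for."""
    table = SHORT_SCALE if short_scale else LONG_SCALE
    return table[word]


print(scale_value("billion"))
print(scale_value("billion", short_scale=False))
```

The two systems agree up to a million and diverge from "billion" onward, which is why the flag only matters for large numbers.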
def mycroft.util.lang.parse_en.extractnumber_en (text, short_scale=True, ordinals=False)
This function extracts a number from a text string. It handles pronunciations in long scale and short scale; see https://en.wikipedia.org/wiki/Names_of_large_numbers
Args:
    text (str): the string to extract a number from
    short_scale (bool): use short scale if True, long scale if False
    ordinals (bool): consider ordinal numbers, third=3 instead of 1/3
Returns:
    (int) or (float) or False: The extracted number, or False if no number was found
Definition at line 592 of file parse_en.py.
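Because the documented "not found" value is False and Python treats 0 as equal to False, callers need an identity check to tell a parsed zero from a miss. The sketch below shows that check on its own; `describe` is a hypothetical helper that receives a result value and does not call the library.

```python
def describe(result):
    """Format a result that is either a number or the sentinel False."""
    # `not result` or `result == False` would both misfire on a parsed 0.
    if result is False:
        return "no number found"
    return "found {}".format(result)


print(describe(0))
print(describe(False))
```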
def mycroft.util.lang.parse_en.isFractional_en (input_str, short_scale=True)
This function takes the given text and checks if it is a fraction.
Args:
    input_str (str): the string to check if fractional
    short_scale (bool): use short scale if True, long scale if False
Returns:
    (bool) or (float): False if not a fraction, otherwise the fraction
Definition at line 1447 of file parse_en.py.
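A minimal sketch of the "False or the fraction" contract described above. The denominator table is a small illustrative subset, the plural handling (dropping one trailing 's') is an assumption, and `is_fractional` is a hypothetical name.

```python
# Illustrative subset of fraction words mapped to their denominators.
DENOMINATORS = {"half": 2, "halve": 2, "third": 3, "quarter": 4,
                "fifth": 5, "tenth": 10}


def is_fractional(input_str):
    """Return the fraction as a float, or False if the word is not one."""
    word = input_str.lower()
    if word.endswith("s"):
        # Treat plurals such as "fifths" or "halves" like the singular.
        word = word[:-1]
    if word in DENOMINATORS:
        return 1.0 / DENOMINATORS[word]
    return False


print(is_fractional("quarter"))
print(is_fractional("banana"))
```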
def mycroft.util.lang.parse_en.normalize_en (text, remove_articles)
English string normalization
Definition at line 1495 of file parse_en.py.
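mycroft.util.lang.parse_en._DECIMAL_MARKER [private]
Definition at line 80 of file parse_en.py.
mycroft.util.lang.parse_en._FRACTION_MARKER [private]
Definition at line 77 of file parse_en.py.
mycroft.util.lang.parse_en._MULTIPLIES_LONG_SCALE_EN [private]
Definition at line 69 of file parse_en.py.
mycroft.util.lang.parse_en._MULTIPLIES_SHORT_SCALE_EN [private]
Definition at line 72 of file parse_en.py.
mycroft.util.lang.parse_en._NEGATIVES [private]
Definition at line 63 of file parse_en.py.
mycroft.util.lang.parse_en._STRING_LONG_ORDINAL_EN [private]
Definition at line 91 of file parse_en.py.
mycroft.util.lang.parse_en._STRING_NUM_EN [private]
Definition at line 82 of file parse_en.py.
mycroft.util.lang.parse_en._STRING_SHORT_ORDINAL_EN [private]
Definition at line 90 of file parse_en.py.
mycroft.util.lang.parse_en._SUMS [private]
Definition at line 66 of file parse_en.py.
mycroft.util.lang.parse_en._Token [private]
Definition at line 98 of file parse_en.py.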