Public Member Functions | |
def | __init__ |
def | bol |
def | check |
def | check_to |
def | check_until |
def | eol |
def | eos |
def | get |
def | match |
def | match_group |
def | match_groupdict |
def | match_groups |
def | match_info |
def | match_len |
def | match_pos |
def | matched |
def | peek |
def | pos |
def | pos |
def | post_match |
def | pre_match |
def | reset |
def | rest |
def | rest_len |
def | scan |
def | scan_to |
def | scan_until |
def | skip |
def | skip_bytes |
def | skip_lines |
def | skip_to |
def | skip_until |
def | skip_whitespace |
def | string |
def | string |
def | terminate |
def | unscan |
Public Attributes | |
pos | |
string | |
Private Member Functions | |
def | __check |
def | __match |
def | __match_info |
def | __matched_exception |
def | __rest |
Private Attributes | |
__index | |
__match_history | |
__regex_cache | |
__rest_gen | |
__src | |
__src_len |
A simple class to aid in lexical analysis of a string. Styled after, but not entirely the same as, Ruby's StringScanner class.

The basic philosophy is simple: Scanner traverses a string left to right, consuming the string as it goes. The current position is the string pointer Scanner.pos. At each iteration, the caller uses the scanning methods to determine what the current piece of string actually is.

Scanning methods:

With the exception of get and peek, all scanning methods take a pattern and (optionally) flags (e.g. re.X). The patterns are assumed to be either strings or compiled regular expression objects (i.e. the result of re.compile, or equivalent). If a pattern is not a string and does not implement match or search (whichever is being used), a ValueError is raised. String patterns are compiled and cached internally.

The check, scan and skip methods all try to match *at* the current scan pointer. check_to, scan_to and skip_to all try to find a match somewhere beyond the scan pointer and jump *to* that position. check_until, scan_until and skip_until are like *_to, but also consume the match (so they jump to the *end* of the match).

Lookahead:
    check()
    check_to()
    check_until()
    peek()

Consume:
    get()
    scan()
    scan_to()
    scan_until()
    skip()
    skip_to()
    skip_until()
    skip_bytes() (convenience wrapper)
    skip_lines() (convenience wrapper)
    skip_whitespace() (convenience wrapper)

Note that scan* and check* both return either a string, in the case of a match, or None, in the case of no match. If the match exists but is zero length, the empty string is returned. Be careful handling this, as both None and the empty string evaluate to False but mean very different things. peek and get also return the empty string when the end of the stream is reached.

Most recent match data:
    matched() -- True/False: was the most recent match a success?

The following methods all throw Exception if not matched():
    match() -- matched string
    match_len() -- matched string length
    match_pos() -- offset of match
    Wrappers around re.*:
    match_info() -- the re.MatchObject
    match_group()
    match_groups()
    match_groupdict()
    pre_match() -- string preceding the match
    post_match() -- string following the match

Misc:
    pos -- get/set current scan pointer position
    bol() -- beginning of line? (DOS/Unix/Mac aware)
    eol() -- end of line? (DOS/Unix/Mac aware)
    eos() -- end of string?
    rest() -- remaining (unconsumed) string
    rest_len() -- length of remaining string
    unscan() -- revert to previous state

Setup:
    string -- get/set current source string
    reset() -- reset the scanner ready to start again
    terminate() -- trigger premature finish
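The left-to-right, consume-as-you-go philosophy described above can be illustrated with the standard re module alone. The sketch below does not use the Scanner class; the TOKENS table and the tokenize function are hypothetical names invented for this example, and the local variable pos merely plays the role of Scanner.pos.

```python
import re

# Hypothetical token table for this sketch: (name, pattern) pairs tried in order.
TOKENS = [
    ("NUMBER", re.compile(r"\d+")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
    ("OP", re.compile(r"[+\-*/=]")),
    ("SPACE", re.compile(r"\s+")),
]

def tokenize(src):
    pos = 0                              # plays the role of Scanner.pos
    out = []
    while pos < len(src):                # loop until the end of the string
        for name, regex in TOKENS:
            m = regex.match(src, pos)    # match *at* the pointer only
            if m:
                if name != "SPACE":      # whitespace is consumed but not reported
                    out.append((name, m.group(0)))
                pos = m.end()            # consume the matched text
                break
        else:
            # No pattern matched at the pointer.
            raise ValueError("unexpected character at offset %d" % pos)
    return out

print(tokenize("x = 40 + 2"))
```

Each pass through the loop asks "what is the current piece of string?" and advances the pointer past it, which is the pattern the scanning methods are meant to support.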
Definition at line 31 of file scanner.py.
def scanner.Scanner.__init__(self, src=None)
Constructor.
Arguments:
    src -- a string to scan. This can be set later by string()
Definition at line 117 of file scanner.py.
def scanner.Scanner.__check(self, pattern, flags, consume=False, log=True, search_func='match', consume_match=True) [private]
Perform a match and return the matching substring or None.
Arguments:
    pattern -- the regex pattern to look for (as string or compiled)
    flags -- the regex flags to use in the match, as defined in the re module
    consume -- whether or not to consume the matching string
    log -- whether or not to write to the __match_history
    search_func -- either 'match' or 'search'. The former looks for matches immediately at the string pointer, the latter looks for matches anywhere after the string pointer.
    consume_match -- if consume is True, this controls whether the full text of the match is consumed as well as the text that preceded it
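One plausible reading of how these arguments interact is sketched below using the standard re module. This is not the code in scanner.py: the function check_sketch, its (text, new_pos) return value, and the way consume_match also selects the returned text are assumptions made for illustration, and the log argument is left out.

```python
import re

def check_sketch(src, pos, pattern, flags=0, consume=False,
                 search_func="match", consume_match=True):
    """Hypothetical re-creation of the documented behaviour.

    Returns (text, new_pos); text is None when nothing matched.
    """
    regex = re.compile(pattern, flags)
    # 'match' anchors at pos, 'search' looks anywhere from pos onwards.
    m = getattr(regex, search_func)(src, pos)
    if m is None:
        return None, pos
    # Stop at the end of the match, or just before it.
    end = m.end() if consume_match else m.start()
    text = src[pos:end]
    return text, (end if consume else pos)

src = "key = value"
print(check_sketch(src, 0, r"\w+"))
print(check_sketch(src, 0, r"=", search_func="search",
                   consume=True, consume_match=False))
print(check_sketch(src, 0, r"=", search_func="search", consume=True))
```

Under these assumptions the three calls correspond roughly to check, scan_to and scan_until.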
Definition at line 367 of file scanner.py.
def scanner.Scanner.__match(self, strict=True) [private]
Return the most recent match data. Raise Exception if no matches are known. This method is used by most of the match_* methods, and the exception should be allowed to propagate back to the caller.
Definition at line 213 of file scanner.py.
def scanner.Scanner.__match_info(self, strict=True) [private]
Definition at line 266 of file scanner.py.
def scanner.Scanner.__matched_exception(self) [private]
Raise an exception if the most recent match failed.
Definition at line 232 of file scanner.py.
def scanner.Scanner.__rest(self) [private]
Return the rest of the string.
Definition at line 331 of file scanner.py.
def scanner.Scanner.bol(self)
Return whether or not the scan pointer is immediately after a newline character (DOS/Unix/Mac aware), or at the start of the string.
Definition at line 197 of file scanner.py.
def scanner.Scanner.check(self, pattern, flags=0)
Return a match for the pattern (or None) at the scan pointer without actually consuming the string. If the pattern matched but was zero length, the empty string is returned. If the pattern did not match, None is returned.
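Because both None and the empty string are falsy, a plain truth test cannot tell "no match" from "zero-length match". The sketch below shows the difference with the standard re module; check_at is a hypothetical helper that mimics the documented return convention and is not part of the Scanner class.

```python
import re

def check_at(src, pos, pattern, flags=0):
    """Hypothetical stand-in for check(): match at pos, consume nothing."""
    m = re.compile(pattern, flags).match(src, pos)
    return m.group(0) if m else None

src = "abc123"
no_match = check_at(src, 0, r"\d+")   # the pattern does not match at offset 0
empty = check_at(src, 0, r"\d*")      # the pattern matches with zero length

# Compare against None explicitly instead of relying on truthiness.
for result in (no_match, empty):
    if result is None:
        print("no match")
    else:
        print("matched %r" % result)
```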
Definition at line 438 of file scanner.py.
def scanner.Scanner.check_to(self, pattern, flags=0)
Return all text up until the beginning of the first match for the pattern after the scan pointer, without consuming the string. If the pattern matched but was zero length, the empty string is returned. If the pattern did not match, None is returned.
Definition at line 447 of file scanner.py.
def scanner.Scanner.check_until(self, pattern, flags=0)
Return all text up until the end of the first match for the pattern after the scan pointer, without consuming the string. If the pattern matched but was zero length, the empty string is returned. If the pattern did not match, None is returned.
Definition at line 455 of file scanner.py.
def scanner.Scanner.eol(self)
Return whether or not the scan pointer is immediately before a newline character (DOS/Unix/Mac aware) or at the end of the string.
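The bol and eol tests can be approximated with simple index checks. The functions below are a hypothetical sketch, not the library's implementation: they treat \r and \n as independent line terminators, so a pointer between the two characters of a DOS \r\n pair counts as both end of line and beginning of line. The real methods may treat that case differently.

```python
def bol(src, pos):
    """True at offset 0 or directly after a CR or LF character."""
    return pos == 0 or src[pos - 1] in "\r\n"

def eol(src, pos):
    """True at the end of the string or directly before a CR or LF character."""
    return pos == len(src) or src[pos] in "\r\n"

src = "ab\r\ncd\nef"
for pos in (0, 2, 4, 5, len(src)):
    print(pos, bol(src, pos), eol(src, pos))
```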
Definition at line 203 of file scanner.py.
def scanner.Scanner.eos(self)
Return True iff we are at the end of the string, else False.
Definition at line 166 of file scanner.py.
def scanner.Scanner.get(self, length=1)
Return the given number of characters from the current string pointer and consume them. If we reach the end of the stream, the empty string is returned.
Definition at line 540 of file scanner.py.
def scanner.Scanner.match(self)
Return the last matching string. Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
Definition at line 238 of file scanner.py.
def scanner.Scanner.match_group(self, args)
Return the contents of the given group in the most recent match. This is a wrapper to re.MatchObject.group(). Raise IndexError if the match exists but the group does not. Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
Definition at line 302 of file scanner.py.
def scanner.Scanner.match_groupdict(self, default=None)
Return a dict containing group_name => match. This is a wrapper to re.MatchObject.groupdict() and as such it only works for named groups. Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
Definition at line 291 of file scanner.py.
def scanner.Scanner.match_groups(self, default=None)
Return the most recent match's groups. This is a wrapper to re.MatchObject.groups(). Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
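Since match_group, match_groupdict and match_groups are documented as wrappers around the match object's own methods, the underlying behaviour can be seen with the standard re module directly. The pattern and input below are made up for the example; the Scanner class itself is not used.

```python
import re

m = re.match(r"(?P<key>\w+)=(\d+)", "width=80")

print(m.group(0))        # the whole match
print(m.group("key"))    # a named group
print(m.group(1, 2))     # several groups at once, returned as a tuple
print(m.groups())        # all groups, named or not
print(m.groupdict())     # named groups only

try:
    m.group(3)           # the pattern defines only two groups
except IndexError as exc:
    print("IndexError:", exc)
```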
Definition at line 282 of file scanner.py.
def scanner.Scanner.match_info(self)
Return the most recent match's MatchObject. This is what's returned by the re module. Use this if the other methods here don't expose what you need. Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
Definition at line 272 of file scanner.py.
def scanner.Scanner.match_len(self)
Return the length of the last matching string. This is equivalent to len(scanner.match()). Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
Definition at line 246 of file scanner.py.
def scanner.Scanner.match_pos(self)
Return the offset into the string of the last match. Raise Exception if no match attempts have been recorded. Raise Exception if the most recent match failed.
Definition at line 257 of file scanner.py.
def scanner.Scanner.matched(self)
Return True if the last match was successful, else False. Raise Exception if no match attempts have been recorded.
Definition at line 227 of file scanner.py.
def scanner.Scanner.peek(self, length=1)
Return the given number of characters from the current string pointer without consuming them. If we reach the end of the stream, the empty string is returned.
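The difference between peek and get is only whether the pointer moves. The sketch below models both with string slicing; the free functions and the explicit pos argument are hypothetical, and returning a shortened string near the end of the input is an assumption of this sketch rather than documented behaviour.

```python
def peek(src, pos, length=1):
    """Return up to `length` characters at pos without moving the pointer."""
    return src[pos:pos + length]

def get(src, pos, length=1):
    """Return (text, new_pos); the pointer advances past the returned text."""
    text = src[pos:pos + length]
    return text, pos + len(text)

src = "abc"
print(repr(peek(src, 0, 2)))   # lookahead only
print(get(src, 0, 2))          # same text, plus the advanced pointer
print(repr(peek(src, 3)))      # at the end of the input
```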
Definition at line 534 of file scanner.py.
def scanner.Scanner.pos(self)
The current string pointer position.
Definition at line 148 of file scanner.py.
def scanner.Scanner.pos(self, new_pos)
Set the string pointer position.
Arguments:
    new_pos -- the new offset into the string
Throw Exception if new_pos is out of range.
Definition at line 153 of file scanner.py.
def scanner.Scanner.post_match(self)
Return the string following the last match, or None. This is equivalent to scanner.string[scanner.match_pos() + scanner.match_len():]. Raise Exception if no match attempts have been recorded.
Definition at line 323 of file scanner.py.
def scanner.Scanner.pre_match(self)
Return the string preceding the last match, or None. This is equivalent to scanner.string[:scanner.match_pos()]. Raise Exception if no match attempts have been recorded.
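The slicing equivalences given for pre_match and post_match can be checked against a plain re match object. In the sketch below, match_pos and match_len are ordinary local variables standing in for the values the corresponding Scanner methods are documented to return; the input string is invented for the example.

```python
import re

src = "name: value"
m = re.compile(r":\s*").search(src)

match_pos = m.start()             # offset of the match in the source string
match_len = len(m.group(0))       # length of the matched text

pre = src[:match_pos]                   # pre_match() equivalent
post = src[match_pos + match_len:]      # post_match() equivalent
print(repr(pre), repr(post))
```

The three pieces (pre, the matched text, post) always concatenate back to the full source string.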
Definition at line 315 of file scanner.py.
def scanner.Scanner.reset(self)
Reset the scanner's state including string pointer and match history.
Definition at line 170 of file scanner.py.
def scanner.Scanner.rest(self)
Return the string from the current pointer onwards, i.e. the segment of string which has not yet been consumed.
Definition at line 345 of file scanner.py.
def scanner.Scanner.rest_len(self)
Return the length of string remaining. This is equivalent to len(rest()).
Definition at line 352 of file scanner.py.
def scanner.Scanner.scan(self, pattern, flags=0)
Return a match for the pattern at the scan pointer and consume the string. Return None if no match is found.
Definition at line 463 of file scanner.py.
def scanner.Scanner.scan_to(self, pattern, flags=0)
Return all text up until the beginning of the first match for the pattern after the scan pointer. The pattern is not included in the match. The scan pointer will be moved such that it immediately precedes the pattern. Return None if no match is found.
Definition at line 469 of file scanner.py.
def scanner.Scanner.scan_until(self, pattern, flags=0)
Return the first match for the pattern after the scan pointer and consume the string up until the end of the match. Return None if no match is found.
Definition at line 478 of file scanner.py.
def scanner.Scanner.skip(self, pattern, flags=0)
Scan ahead over the given pattern and return how many characters were consumed, or None. Similar to scan, but does not return the string or record the match.
Definition at line 484 of file scanner.py.
def scanner.Scanner.skip_bytes(self, n)
Skip the given number of bytes and return the number of bytes consumed.
Definition at line 517 of file scanner.py.
def scanner.Scanner.skip_lines(self, n=1)
Skip the given number of lines and return the number of lines consumed.
Definition at line 511 of file scanner.py.
def scanner.Scanner.skip_to(self, pattern, flags=0)
Scan ahead until the beginning of the first occurrence of the given pattern and return how many characters were skipped, or None if the match failed. The match is not recorded.
Definition at line 491 of file scanner.py.
def scanner.Scanner.skip_until(self, pattern, flags=0)
Scan ahead until the end of the first occurrence of the given pattern and return how many characters were consumed, or None if the match failed. The match is not recorded.
Definition at line 502 of file scanner.py.
def scanner.Scanner.skip_whitespace(self, n=None, multiline=True)
Skip over whitespace characters and return the number of characters consumed.
Arguments:
    n -- maximum number of characters to consume (default None)
    multiline -- whether or not to consume newline characters (default True)
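As a rough illustration of how n and multiline could map onto a regular expression, the sketch below builds a bounded whitespace pattern with the standard re module. The free function, its explicit pos argument and its (consumed, new_pos) return value are assumptions for this example; the actual method is a convenience wrapper whose internals are not shown in this documentation.

```python
import re

def skip_whitespace(src, pos, n=None, multiline=True):
    """Hypothetical stand-in; returns (consumed, new_pos)."""
    # [^\S\r\n] means "whitespace, but not CR or LF".
    chars = r"\s" if multiline else r"[^\S\r\n]"
    # Unbounded when n is None, otherwise at most n characters.
    count = "*" if n is None else "{0,%d}" % n
    m = re.compile(chars + count).match(src, pos)   # always matches, possibly empty
    consumed = m.end() - pos
    return consumed, pos + consumed

src = "  \n\t x"
print(skip_whitespace(src, 0))
print(skip_whitespace(src, 0, multiline=False))
print(skip_whitespace(src, 0, n=3))
```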
Definition at line 521 of file scanner.py.
def scanner.Scanner.string(self)
The source string
Definition at line 178 of file scanner.py.
def scanner.Scanner.string(self, s)
Set the source string
Definition at line 183 of file scanner.py.
def scanner.Scanner.terminate(self)
Set the string pointer to the end of the input and clear the match history.
Definition at line 191 of file scanner.py.
def scanner.Scanner.unscan(self)
Revert the scanner's state to that of the previous match. Only one previous state is remembered. Throw Exception if there is no previous known state to restore.
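The "only one previous state" rule can be modelled by storing a single saved position. The class below is a hypothetical miniature written for this example, not the Scanner class: it saves only the pointer, whereas the real scanner's state may also include match history.

```python
import re

class OneStepUndo:
    """Hypothetical miniature scanner that remembers one previous position."""

    def __init__(self, src):
        self.src = src
        self.pos = 0
        self._previous = None        # position before the last successful scan

    def scan(self, pattern):
        m = re.compile(pattern).match(self.src, self.pos)
        if m is None:
            return None
        self._previous = self.pos    # overwrite: only one level is kept
        self.pos = m.end()
        return m.group(0)

    def unscan(self):
        if self._previous is None:
            raise Exception("no previous state to restore")
        self.pos = self._previous
        self._previous = None        # a second unscan in a row must fail

s = OneStepUndo("foo bar")
print(s.scan(r"\w+"), s.pos)
s.unscan()
print(s.pos)
```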
Definition at line 357 of file scanner.py.
scanner.Scanner::__index [private]
Definition at line 122 of file scanner.py.
scanner.Scanner::__match_history [private]
Definition at line 122 of file scanner.py.
scanner.Scanner::__regex_cache [private]
Definition at line 122 of file scanner.py.
scanner.Scanner::__rest_gen [private]
Definition at line 122 of file scanner.py.
scanner.Scanner::__src [private]
Definition at line 122 of file scanner.py.
scanner.Scanner::__src_len [private]
Definition at line 122 of file scanner.py.