#include <resultiterator.h>

Public Member Functions | |
| virtual void | Begin () |
| virtual char * | GetUTF8Text (PageIteratorLevel level) const |
| virtual bool | IsAtBeginningOf (PageIteratorLevel level) const |
| virtual bool | IsAtFinalElement (PageIteratorLevel level, PageIteratorLevel element) const |
| virtual bool | Next (PageIteratorLevel level) |
| bool | ParagraphIsLtr () const |
| virtual | ~ResultIterator () |
Static Public Member Functions | |
| static void | CalculateTextlineOrder (bool paragraph_is_ltr, const GenericVector< StrongScriptDirection > &word_dirs, GenericVectorEqEq< int > *reading_order) |
| static ResultIterator * | StartOfParagraph (const LTRResultIterator &resit) |
Static Public Attributes | |
| static const int | kComplexWord |
| static const int | kMinorRunEnd |
| static const int | kMinorRunStart |
Protected Member Functions | |
| TESS_LOCAL | ResultIterator (const LTRResultIterator &resit) |
Private Member Functions | |
| void | AppendSuffixMarks (STRING *text) const |
| void | AppendUTF8ParagraphText (STRING *text) const |
| void | AppendUTF8WordText (STRING *text) const |
| bool | BidiDebug (int min_level) const |
| void | CalculateBlobOrder (GenericVector< int > *blob_indices) const |
| void | CalculateTextlineOrder (bool paragraph_is_ltr, const LTRResultIterator &resit, GenericVector< StrongScriptDirection > *ssd, GenericVectorEqEq< int > *indices) const |
| void | CalculateTextlineOrder (bool paragraph_is_ltr, const LTRResultIterator &resit, GenericVectorEqEq< int > *indices) const |
| bool | CurrentParagraphIsLtr () const |
| bool | IsAtFinalSymbolOfWord () const |
| bool | IsAtFirstSymbolOfWord () const |
| void | IterateAndAppendUTF8TextlineText (STRING *text) |
| int | LTRWordIndex () const |
| void | MoveToLogicalStartOfTextline () |
| void | MoveToLogicalStartOfWord () |
Private Attributes | |
| bool | at_beginning_of_minor_run_ |
| bool | current_paragraph_is_ltr_ |
| bool | in_minor_direction_ |
Definition at line 37 of file resultiterator.h.
| virtual tesseract::ResultIterator::~ResultIterator | ( | ) | [inline, virtual] |
ResultIterator is copy constructible! The default copy constructor works just fine for us.
Definition at line 45 of file resultiterator.h.
| TESS_LOCAL tesseract::ResultIterator::ResultIterator | ( | const LTRResultIterator & | resit | ) | [explicit, protected] |
We presume the data associated with the given iterator will outlive us. NB: This constructor is protected because it does something non-obvious: it resets to the beginning of the paragraph instead of staying wherever resit might have pointed.
| void tesseract::ResultIterator::AppendSuffixMarks | ( | STRING * | text | ) | const [private] |
Append any extra marks that should be appended to this word when printed. Mostly, these are Unicode BiDi control characters.
| void tesseract::ResultIterator::AppendUTF8ParagraphText | ( | STRING * | text | ) | const [private] |
Appends the text of the current paragraph in reading order to the given buffer. Each textline is terminated by a single newline character, and the paragraph gets an extra newline at the end.
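The newline layout described here can be sketched in isolation. The helper below is hypothetical and not part of Tesseract: it assumes the textlines are already in reading order and held in a std::vector of std::string rather than a STRING buffer, and it only illustrates where the newlines go.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a Tesseract function): joins textlines that are
// already in reading order, using the newline layout described above.
std::string JoinParagraphLines(const std::vector<std::string>& lines) {
  std::string text;
  for (const std::string& line : lines) {
    text += line;
    text += '\n';  // each textline ends in exactly one newline
  }
  text += '\n';  // the paragraph gets one extra newline at the end
  return text;
}
```

Under these assumptions, a paragraph with the two textlines "ab" and "cd" is laid out as "ab\ncd\n\n".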
| void tesseract::ResultIterator::AppendUTF8WordText | ( | STRING * | text | ) | const [private] |
Appends the current word in reading order to the given buffer.
| virtual void tesseract::ResultIterator::Begin | ( | ) | [virtual] |
Moves the iterator to point to the start of the page to begin an iteration.
Reimplemented from tesseract::PageIterator.
| bool tesseract::ResultIterator::BidiDebug | ( | int | min_level | ) | const [private] |
Returns whether the bidi_debug flag is set to at least min_level.
| void tesseract::ResultIterator::CalculateBlobOrder | ( | GenericVector< int > * | blob_indices | ) | const [private] |
Given an iterator pointing at a word, returns the logical reading order of blob indices for the word.
| void tesseract::ResultIterator::CalculateTextlineOrder | ( | bool | paragraph_is_ltr, | |
| const LTRResultIterator & | resit, | |||
| GenericVector< StrongScriptDirection > * | ssd, | |||
| GenericVectorEqEq< int > * | indices | |||
| ) | const [private] |
Same as the overload without the ssd parameter, but the caller's ssd gets filled in if ssd != NULL.
| void tesseract::ResultIterator::CalculateTextlineOrder | ( | bool | paragraph_is_ltr, | |
| const LTRResultIterator & | resit, | |||
| GenericVectorEqEq< int > * | indices | |||
| ) | const [private] |
Returns word indices as measured from resit->RestartRow() = index 0 for the reading order of words within a textline given an iterator into the middle of the text line. In addition to non-negative word indices, the following negative values may be inserted:
kMinorRunStart: Start of minor direction text.
kMinorRunEnd: End of minor direction text.
kComplexWord: The previous word contains both left-to-right and right-to-left characters and was treated as neutral.
| static void tesseract::ResultIterator::CalculateTextlineOrder | ( | bool | paragraph_is_ltr, | |
| const GenericVector< StrongScriptDirection > & | word_dirs, | |||
| GenericVectorEqEq< int > * | reading_order | |||
| ) | [static] |
Yields the reading order as a sequence of indices and (optional) meta-marks for a set of words (given left-to-right). The meta marks are passed as negative values:
kMinorRunStart: Start of minor direction text.
kMinorRunEnd: End of minor direction text.
kComplexWord: The next indexed word contains both left-to-right and right-to-left characters and was treated as neutral.
For example, suppose we have five words in a text line, indexed [0,1,2,3,4] from the leftmost side of the text line. The following are all believable reading_orders:
Left-to-Right (in ltr paragraph): { 0, 1, 2, 3, 4 }
Left-to-Right (in rtl paragraph): { kMinorRunStart, 0, 1, 2, 3, 4, kMinorRunEnd }
Right-to-Left (in rtl paragraph): { 4, 3, 2, 1, 0 }
Left-to-Right except for an RTL phrase in words 2, 3 in an ltr paragraph: { 0, 1, kMinorRunStart, 3, 2, kMinorRunEnd, 4 }
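A caller consuming such a reading_order sequence might separate the word indices from the meta-marks roughly as follows. This is a sketch under stated assumptions: the three constant values are placeholders chosen for illustration (real code should use the ResultIterator constants), std::vector stands in for GenericVectorEqEq, and OrderedWord and DecodeReadingOrder are hypothetical names. It follows the description of the static overload, where kComplexWord applies to the next indexed word.

```cpp
#include <vector>

// Placeholder values for illustration only; real code should use
// ResultIterator::kMinorRunStart, kMinorRunEnd and kComplexWord.
constexpr int kMinorRunStart = -1;
constexpr int kMinorRunEnd = -2;
constexpr int kComplexWord = -3;

// Hypothetical record: one word in reading order plus the marks that apply.
struct OrderedWord {
  int index;          // left-to-right index of the word in the textline
  bool in_minor_run;  // true between kMinorRunStart and kMinorRunEnd
  bool complex;       // true if preceded by a kComplexWord mark
};

// Hypothetical decoder: walks the sequence once, tracking the marks.
std::vector<OrderedWord> DecodeReadingOrder(
    const std::vector<int>& reading_order) {
  std::vector<OrderedWord> words;
  bool in_minor_run = false;
  bool next_is_complex = false;
  for (int entry : reading_order) {
    if (entry == kMinorRunStart) {
      in_minor_run = true;
    } else if (entry == kMinorRunEnd) {
      in_minor_run = false;
    } else if (entry == kComplexWord) {
      next_is_complex = true;  // applies to the next indexed word
    } else {
      words.push_back({entry, in_minor_run, next_is_complex});
      next_is_complex = false;
    }
  }
  return words;
}
```

Applied to the last example above, the decoder yields the words 0, 1, 3, 2, 4 in that order, with only words 3 and 2 flagged as being in the minor run.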
| bool tesseract::ResultIterator::CurrentParagraphIsLtr | ( | ) | const [private] |
Calculates the current paragraph's dominant writing direction. Typically, members should use current_paragraph_is_ltr_ instead.
| virtual char* tesseract::ResultIterator::GetUTF8Text | ( | PageIteratorLevel | level | ) | const [virtual] |
Returns the null-terminated UTF-8 encoded text string for the current object at the given level. Use delete [] to free after use.
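Because the returned buffer must be released with delete [], one way to avoid leaks is to hand it to a std::unique_ptr<char[]> immediately. The sketch below is self-contained and does not call Tesseract: FakeGetUTF8Text is a hypothetical stand-in for a call such as GetUTF8Text(level), and TakeUTF8Text is a hypothetical helper name.

```cpp
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for GetUTF8Text(level): returns a new[]-allocated,
// null-terminated buffer that the caller is responsible for freeing.
char* FakeGetUTF8Text() {
  const char kText[] = "word";
  char* buffer = new char[sizeof(kText)];
  std::memcpy(buffer, kText, sizeof(kText));  // copies the terminator too
  return buffer;
}

// Hypothetical helper: copies the text into a std::string and frees the
// buffer. The array form of unique_ptr releases it with delete[].
std::string TakeUTF8Text(char* raw) {
  std::unique_ptr<char[]> owned(raw);
  return owned ? std::string(owned.get()) : std::string();
}
```

With a real iterator, the same helper would presumably be used as TakeUTF8Text(it->GetUTF8Text(level)), so the buffer is freed even on early return.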
Reimplemented from tesseract::LTRResultIterator.
| virtual bool tesseract::ResultIterator::IsAtBeginningOf | ( | PageIteratorLevel | level | ) | const [virtual] |
IsAtBeginningOf() returns whether we're at the logical beginning of the given level (as opposed to ResultIterator's left-to-right top-to-bottom order). Otherwise, this acts the same as PageIterator::IsAtBeginningOf(). For a full description, see pageiterator.h.
Reimplemented from tesseract::PageIterator.
| virtual bool tesseract::ResultIterator::IsAtFinalElement | ( | PageIteratorLevel | level, | |
| PageIteratorLevel | element | |||
| ) | const [virtual] |
Implement PageIterator's IsAtFinalElement correctly in a BiDi context. For instance, IsAtFinalElement(RIL_PARA, RIL_WORD) returns whether we point at the last word in a paragraph. See PageIterator for full comment.
Reimplemented from tesseract::PageIterator.
| bool tesseract::ResultIterator::IsAtFinalSymbolOfWord | ( | ) | const [private] |
Are we pointing at the final (reading order) symbol of the word?
| bool tesseract::ResultIterator::IsAtFirstSymbolOfWord | ( | ) | const [private] |
Are we pointing at the first (reading order) symbol of the word?
| void tesseract::ResultIterator::IterateAndAppendUTF8TextlineText | ( | STRING * | text | ) | [private] |
Appends the text of the current text line, *assuming this iterator is positioned at the beginning of the text line*. This function updates the iterator to point to the first position past the text line. Each textline is terminated by a single newline character. If the textline ends a paragraph, it gets a second terminal newline.
| int tesseract::ResultIterator::LTRWordIndex | ( | ) | const [private] |
What is the index of the current word in a strict left-to-right reading of the row?
| void tesseract::ResultIterator::MoveToLogicalStartOfTextline | ( | ) | [private] |
Precondition: current_paragraph_is_ltr_ is set.
| void tesseract::ResultIterator::MoveToLogicalStartOfWord | ( | ) | [private] |
Precondition: current_paragraph_is_ltr_ and in_minor_direction_ are set.
| virtual bool tesseract::ResultIterator::Next | ( | PageIteratorLevel | level | ) | [virtual] |
Moves to the start of the next object at the given level in the page hierarchy, in the appropriate reading order, and returns false if the end of the page was reached. NOTE that RIL_SYMBOL will skip non-text blocks, but all other PageIteratorLevel values will visit each non-text block once. Think of non-text blocks as containing a single para, with a single line, with a single imaginary word. Calls to Next with different levels may be freely intermixed. This function iterates words in right-to-left scripts correctly, if the appropriate language has been loaded into Tesseract.
Reimplemented from tesseract::PageIterator.
| bool tesseract::ResultIterator::ParagraphIsLtr | ( | ) | const |
Return whether the current paragraph's dominant reading direction is left-to-right (as opposed to right-to-left).
| static ResultIterator* tesseract::ResultIterator::StartOfParagraph | ( | const LTRResultIterator & | resit | ) | [static] |
bool tesseract::ResultIterator::at_beginning_of_minor_run_ [private] |
Is the currently pointed-at character at the beginning of a minor-direction run?
Definition at line 229 of file resultiterator.h.
bool tesseract::ResultIterator::current_paragraph_is_ltr_ [private] |
Definition at line 223 of file resultiterator.h.
bool tesseract::ResultIterator::in_minor_direction_ [private] |
Is the currently pointed-at character in a minor-direction sequence?
Definition at line 232 of file resultiterator.h.
const int tesseract::ResultIterator::kComplexWord [static] |
Definition at line 129 of file resultiterator.h.
const int tesseract::ResultIterator::kMinorRunEnd [static] |
Definition at line 128 of file resultiterator.h.
const int tesseract::ResultIterator::kMinorRunStart [static] |
Definition at line 127 of file resultiterator.h.