Class for efficiently matching a bag-of-words representation of a document (image) against a database of known documents.
#include <database.h>
| Classes | |
| struct | WordFrequency | 
| Public Member Functions | |
| void | computeTfIdfWeights (float default_weight=1.0f) | 
| Compute the TF-IDF weights of all the words. To be called after inserting a corpus of training examples into the database. | |
| Database (uint32_t num_words=0) | |
| Constructor. | |
| void | find (const std::vector< Word > &document, size_t N, std::vector< Match > &matches) const | 
| Find the top N matches in the database for the query document. | |
| DocId | findAndInsert (const std::vector< Word > &document, size_t N, std::vector< Match > &matches) | 
| Find the top N matches, then insert the query document. | |
| DocId | insert (const std::vector< Word > &document) | 
| Insert a new document. | |
| void | loadWeights (const std::string &file) | 
| Load the vocabulary word weights from a file. | |
| void | saveWeights (const std::string &file) const | 
| Save the vocabulary word weights to a file. | |
| Private Types | |
| typedef std::map< Word, float > | DocumentVector | 
| typedef std::vector < WordFrequency > | InvertedFile | 
| Private Member Functions | |
| void | computeVector (const std::vector< Word > &document, DocumentVector &v) const | 
| Static Private Member Functions | |
| static void | normalize (DocumentVector &v) | 
| static float | sparseDistance (const DocumentVector &v1, const DocumentVector &v2) | 
| Private Attributes | |
| std::vector< DocumentVector > | database_vectors_ | 
| std::vector< InvertedFile > | word_files_ | 
| std::vector< float > | word_weights_ | 
Class for efficiently matching a bag-of-words representation of a document (image) against a database of known documents.
Definition at line 40 of file database.h.
| typedef std::map<Word, float> vt::Database::DocumentVector  [private] | 
Definition at line 110 of file database.h.
| typedef std::vector<WordFrequency> vt::Database::InvertedFile  [private] | 
Definition at line 106 of file database.h.
| vt::Database::Database | ( | uint32_t | num_words = 0 | ) | 
Constructor.
If computing weights for a new vocabulary, num_words should be the size of the vocabulary. If the weights will instead be loaded with loadWeights(), num_words can be left at zero.
Definition at line 11 of file database.cpp.
| void vt::Database::computeTfIdfWeights | ( | float | default_weight = 1.0f | ) | 
Compute the TF-IDF weights of all the words. To be called after inserting a corpus of training examples into the database.
| default_weight | The default weight of a word that appears in none of the training documents. | 
Definition at line 67 of file database.cpp.
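This page does not give the weighting formula, so the following is only a minimal sketch of one common TF-IDF scheme, assuming the inverse document frequency weight ln(N / N_i), where N is the number of training documents and N_i is the number that contain word i. The free function `computeIdfWeights` and the `Word` typedef are hypothetical stand-ins for illustration and are not part of this class.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for the library's Word type (assumed to be an integer word ID).
typedef uint32_t Word;

// Sketch of IDF weighting (an assumption, not necessarily what database.cpp does):
// weight[i] = ln(N / N_i). Words that appear in no training document keep default_weight.
std::vector<float> computeIdfWeights(const std::vector<std::vector<Word> >& corpus,
                                     uint32_t num_words,
                                     float default_weight = 1.0f)
{
  // doc_counts[i] = number of documents that contain word i at least once.
  std::vector<uint32_t> doc_counts(num_words, 0);
  for (size_t d = 0; d < corpus.size(); ++d) {
    // Deduplicate so a word is counted once per document.
    std::set<Word> unique_words(corpus[d].begin(), corpus[d].end());
    for (std::set<Word>::const_iterator it = unique_words.begin();
         it != unique_words.end(); ++it) {
      assert(*it < num_words);
      ++doc_counts[*it];
    }
  }

  std::vector<float> weights(num_words, default_weight);
  const float num_docs = static_cast<float>(corpus.size());
  for (uint32_t i = 0; i < num_words; ++i) {
    if (doc_counts[i] > 0)
      weights[i] = std::log(num_docs / doc_counts[i]);
  }
  return weights;
}
```

Under this scheme a word that occurs in every training document gets weight zero, which is why the weights are meaningful only after a representative corpus has been inserted.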
| void vt::Database::computeVector | ( | const std::vector< Word > & | document, | 
| DocumentVector & | v | ||
| ) | const  [private] | 
Definition at line 106 of file database.cpp.
| void vt::Database::find | ( | const std::vector< Word > & | document, | 
| size_t | N, | ||
| std::vector< Match > & | matches | ||
| ) | const | 
Find the top N matches in the database for the query document.
| | document | The query document, a set of quantized words. |
| | N | The number of matches to return. |
| [out] | matches | IDs and scores for the top N matching database documents. |
Definition at line 39 of file database.cpp.
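The selection of the top N matches can be sketched as a partial sort over per-document scores. This is an illustration only: `ScoredMatch` and `topN` are hypothetical names, the real `Match` type is not shown on this page, and the sketch assumes the score is a distance, so that lower values are better.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the library's DocId and Match types may differ.
typedef uint32_t DocId;

struct ScoredMatch {
  DocId id;
  float score;  // Assumed to be a distance: lower is better.
};

// Return the N best (lowest-distance) documents, best first.
// distances[i] is the distance between the query and database document i.
std::vector<ScoredMatch> topN(const std::vector<float>& distances, size_t N)
{
  std::vector<ScoredMatch> matches;
  matches.reserve(distances.size());
  for (size_t i = 0; i < distances.size(); ++i)
    matches.push_back(ScoredMatch{static_cast<DocId>(i), distances[i]});

  // Never ask for more matches than there are documents.
  const size_t n = std::min(N, matches.size());

  // Only the first n elements need to be in sorted order.
  std::partial_sort(matches.begin(), matches.begin() + n, matches.end(),
                    [](const ScoredMatch& a, const ScoredMatch& b) {
                      return a.score < b.score;
                    });
  matches.resize(n);
  return matches;
}
```

A partial sort avoids fully ordering the database when N is much smaller than the number of stored documents.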
| DocId vt::Database::findAndInsert | ( | const std::vector< Word > & | document, | 
| size_t | N, | ||
| std::vector< Match > & | matches | ||
| ) | 
Find the top N matches, then insert the query document.
This is equivalent to calling find() followed by insert(), but may be more efficient.
| | document | The document to match then insert, a set of quantized words. |
| | N | The number of matches to return. |
| [out] | matches | IDs and scores for the top N matching database documents. |
Definition at line 60 of file database.cpp.
| DocId vt::Database::insert | ( | const std::vector< Word > & | document | ) | 
Insert a new document.
| document | The set of quantized words in a document/image. | 
Definition at line 17 of file database.cpp.
| void vt::Database::loadWeights | ( | const std::string & | file | ) | 
Load the vocabulary word weights from a file.
Definition at line 88 of file database.cpp.
| void vt::Database::normalize | ( | DocumentVector & | v | ) |  [static, private] | 
Definition at line 115 of file database.cpp.
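normalize() is undocumented on this page. As a rough sketch of what normalizing a sparse document vector can look like, the following scales the entries so that their absolute values sum to 1. The choice of the L1 norm is an assumption, and `normalizeL1` is a hypothetical name, not the library function.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <map>

// Hypothetical stand-ins mirroring the typedefs listed on this page.
typedef uint32_t Word;
typedef std::map<Word, float> DocumentVector;

// Scale v in place to unit L1 norm (assumed norm; the library may use another).
void normalizeL1(DocumentVector& v)
{
  float sum = 0.0f;
  for (DocumentVector::const_iterator it = v.begin(); it != v.end(); ++it)
    sum += std::fabs(it->second);

  // Leave an empty or all-zero vector unchanged to avoid dividing by zero.
  if (sum == 0.0f)
    return;

  for (DocumentVector::iterator it = v.begin(); it != v.end(); ++it)
    it->second /= sum;
}
```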
| void vt::Database::saveWeights | ( | const std::string & | file | ) | const | 
Save the vocabulary word weights to a file.
Definition at line 80 of file database.cpp.
| float vt::Database::sparseDistance | ( | const DocumentVector & | v1, | 
| const DocumentVector & | v2 | ||
| ) |  [static, private] | 
Definition at line 125 of file database.cpp.
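sparseDistance() is also undocumented here. Because DocumentVector is a std::map keyed by word, two vectors can be compared by walking both in key order and visiting only the words that occur in at least one of them. The sketch below assumes an L1 distance; `sparseL1Distance` is a hypothetical name and the metric actually used in database.cpp may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <map>

// Hypothetical stand-ins mirroring the typedefs listed on this page.
typedef uint32_t Word;
typedef std::map<Word, float> DocumentVector;

// L1 distance between two sparse vectors (assumed metric).
// A word missing from a vector is treated as having value 0.
float sparseL1Distance(const DocumentVector& v1, const DocumentVector& v2)
{
  float distance = 0.0f;
  DocumentVector::const_iterator i1 = v1.begin();
  DocumentVector::const_iterator i2 = v2.begin();

  // Merge-style walk: std::map iterates in increasing key order.
  while (i1 != v1.end() && i2 != v2.end()) {
    if (i1->first < i2->first) {
      distance += std::fabs(i1->second);               // word only in v1
      ++i1;
    } else if (i2->first < i1->first) {
      distance += std::fabs(i2->second);               // word only in v2
      ++i2;
    } else {
      distance += std::fabs(i1->second - i2->second);  // word in both
      ++i1;
      ++i2;
    }
  }

  // Whatever remains occurs in only one of the two vectors.
  for (; i1 != v1.end(); ++i1) distance += std::fabs(i1->second);
  for (; i2 != v2.end(); ++i2) distance += std::fabs(i2->second);
  return distance;
}
```

The cost is linear in the number of nonzero entries of the two vectors rather than in the vocabulary size.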
| std::vector<DocumentVector> vt::Database::database_vectors_  [private] | 
Definition at line 114 of file database.h.
| std::vector<InvertedFile> vt::Database::word_files_  [private] | 
Definition at line 112 of file database.h.
| std::vector<float> vt::Database::word_weights_  [private] | 
Definition at line 113 of file database.h.