Class for efficiently matching a bag-of-words representation of a document (image) against a database of known documents.
#include <database.h>
Classes | |
| struct | WordFrequency |
Public Member Functions | |
| void | computeTfIdfWeights (float default_weight=1.0f) |
| Compute the TF-IDF weights of all the words. To be called after inserting a corpus of training examples into the database. | |
| Database (uint32_t num_words=0) | |
| Constructor. | |
| void | find (const std::vector< Word > &document, size_t N, std::vector< Match > &matches) const |
| Find the top N matches in the database for the query document. | |
| DocId | findAndInsert (const std::vector< Word > &document, size_t N, std::vector< Match > &matches) |
| Find the top N matches, then insert the query document. | |
| DocId | insert (const std::vector< Word > &document) |
| Insert a new document. | |
| void | loadWeights (const std::string &file) |
| Load the vocabulary word weights from a file. | |
| void | saveWeights (const std::string &file) const |
| Save the vocabulary word weights to a file. | |
Private Types | |
| typedef std::map< Word, float > | DocumentVector |
| typedef std::vector < WordFrequency > | InvertedFile |
Private Member Functions | |
| void | computeVector (const std::vector< Word > &document, DocumentVector &v) const |
Static Private Member Functions | |
| static void | normalize (DocumentVector &v) |
| static float | sparseDistance (const DocumentVector &v1, const DocumentVector &v2) |
Private Attributes | |
| std::vector< DocumentVector > | database_vectors_ |
| std::vector< InvertedFile > | word_files_ |
| std::vector< float > | word_weights_ |
Class for efficiently matching a bag-of-words representation of a document (image) against a database of known documents.
Definition at line 38 of file database.h.
typedef std::map<Word, float> vt::Database::DocumentVector [private] |
Definition at line 108 of file database.h.
typedef std::vector<WordFrequency> vt::Database::InvertedFile [private] |
Definition at line 104 of file database.h.
| vt::Database::Database | ( | uint32_t | num_words = 0 |
) |
Constructor.
If computing weights for a new vocabulary, num_words should be the size of the vocabulary. If the weights will instead be loaded with loadWeights(), num_words can be left at zero.
Definition at line 4 of file database.cpp.
| void vt::Database::computeTfIdfWeights | ( | float | default_weight = 1.0f |
) |
Compute the TF-IDF weights of all the words. To be called after inserting a corpus of training examples into the database.
| default_weight | The default weight of a word that appears in none of the training documents. |
Definition at line 60 of file database.cpp.
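As a rough illustration of the weighting step, the sketch below derives one inverse-document-frequency weight per vocabulary word from a set of inverted files, and falls back to default_weight for words seen in no training document. The function name idfWeights, the ln(N / N_i) formula, and the use of plain document-ID lists as inverted files are assumptions made for this example; database.cpp remains the authority on what the library actually computes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an inverted file: the IDs of the documents
// that contain a given word (one entry per document).
typedef std::vector<uint32_t> DocList;

// Sketch: weight word i by ln(N / N_i), where N is the number of training
// documents and N_i is the number of documents containing word i.
// Words that appear in no document receive default_weight.
std::vector<float> idfWeights(const std::vector<DocList>& inverted_files,
                              std::size_t num_documents,
                              float default_weight = 1.0f)
{
  assert(num_documents > 0);
  std::vector<float> weights(inverted_files.size(), default_weight);
  for (std::size_t i = 0; i < inverted_files.size(); ++i) {
    const std::size_t n_i = inverted_files[i].size();
    if (n_i != 0) {
      weights[i] = std::log(static_cast<float>(num_documents) /
                            static_cast<float>(n_i));
    }
  }
  return weights;
}
```

Under this formula a word that occurs in every document gets weight zero, so it contributes nothing to matching, while rare words dominate the score.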
| void vt::Database::computeVector | ( | const std::vector< Word > & | document, | |
| DocumentVector & | v | |||
| ) | const [private] |
Definition at line 99 of file database.cpp.
| void vt::Database::find | ( | const std::vector< Word > & | document, | |
| size_t | N, | |||
| std::vector< Match > & | matches | |||
| ) | const |
Find the top N matches in the database for the query document.
| document | The query document, a set of quantized words. | |
| N | The number of matches to return. | |
| [out] | matches | IDs and scores for the top N matching database documents. |
Definition at line 32 of file database.cpp.
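The "top N" part of a query can be sketched independently of how scores are produced. The example below keeps the N best-scoring candidates using std::partial_sort. The ScoredDoc struct, the closer comparator, and the topN helper are hypothetical names introduced here, and the convention that a smaller score means a better match is an assumption, not something this page states about Match.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical match record: a document ID and its score against the query.
struct ScoredDoc {
  uint32_t id;
  float score;  // assumed here: smaller means more similar
};

// Orders candidates best-first under the assumption above.
static bool closer(const ScoredDoc& a, const ScoredDoc& b)
{
  return a.score < b.score;
}

// Return the n best candidates, sorted best-first. If fewer than n
// candidates exist, all of them are returned.
std::vector<ScoredDoc> topN(std::vector<ScoredDoc> candidates, std::size_t n)
{
  n = std::min(n, candidates.size());
  // partial_sort only fully orders the first n elements, which is cheaper
  // than sorting the whole candidate list when n is small.
  std::partial_sort(candidates.begin(), candidates.begin() + n,
                    candidates.end(), closer);
  candidates.resize(n);
  return candidates;
}
```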
| DocId vt::Database::findAndInsert | ( | const std::vector< Word > & | document, | |
| size_t | N, | |||
| std::vector< Match > & | matches | |||
| ) |
Find the top N matches, then insert the query document.
This is equivalent to calling find() followed by insert(), but may be more efficient.
| document | The document to match then insert, a set of quantized words. | |
| N | The number of matches to return. | |
| [out] | matches | IDs and scores for the top N matching database documents. |
Definition at line 53 of file database.cpp.
| DocId vt::Database::insert | ( | const std::vector< Word > & | document | ) |
Insert a new document.
| document | The set of quantized words in a document/image. |
Definition at line 10 of file database.cpp.
| void vt::Database::loadWeights | ( | const std::string & | file | ) |
Load the vocabulary word weights from a file.
Definition at line 81 of file database.cpp.
| void vt::Database::normalize | ( | DocumentVector & | v | ) | [static, private] |
Definition at line 108 of file database.cpp.
| void vt::Database::saveWeights | ( | const std::string & | file | ) | const |
Save the vocabulary word weights to a file.
Definition at line 73 of file database.cpp.
| float vt::Database::sparseDistance | ( | const DocumentVector & | v1, | |
| const DocumentVector & | v2 | |||
| ) | [static, private] |
Definition at line 118 of file database.cpp.
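This page does not describe what normalize() and sparseDistance() compute, so the following is only one plausible reading: L1 normalization followed by an L1 distance evaluated with a merge walk over two sorted sparse vectors. The names normalizeL1 and sparseL1Distance, the choice of the L1 norm, and the uint32_t definition of Word are all assumptions made for illustration. Only the std::map<Word, float> layout of DocumentVector comes from the typedef documented above.

```cpp
#include <cmath>
#include <cstdint>
#include <map>

typedef uint32_t Word;                         // assumed word ID type
typedef std::map<Word, float> DocumentVector;  // sparse vector keyed by word

// Scale v in place so that the absolute values of its entries sum to 1.
// An all-zero (or empty) vector is left unchanged.
void normalizeL1(DocumentVector& v)
{
  float sum = 0.0f;
  for (DocumentVector::iterator it = v.begin(); it != v.end(); ++it)
    sum += std::fabs(it->second);
  if (sum == 0.0f) return;
  for (DocumentVector::iterator it = v.begin(); it != v.end(); ++it)
    it->second /= sum;
}

// L1 distance between two sparse vectors. std::map iterates in key order,
// so both vectors can be walked in a single merge-style pass; a word that
// is missing from one vector is treated as having value zero there.
float sparseL1Distance(const DocumentVector& v1, const DocumentVector& v2)
{
  float distance = 0.0f;
  DocumentVector::const_iterator a = v1.begin();
  DocumentVector::const_iterator b = v2.begin();
  while (a != v1.end() && b != v2.end()) {
    if (a->first < b->first) {
      distance += std::fabs(a->second);
      ++a;
    } else if (b->first < a->first) {
      distance += std::fabs(b->second);
      ++b;
    } else {
      distance += std::fabs(a->second - b->second);
      ++a;
      ++b;
    }
  }
  for (; a != v1.end(); ++a) distance += std::fabs(a->second);
  for (; b != v2.end(); ++b) distance += std::fabs(b->second);
  return distance;
}
```

The merge walk costs time proportional to the number of non-zero entries in the two vectors rather than to the vocabulary size, which is the usual reason for storing document vectors sparsely.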
std::vector<DocumentVector> vt::Database::database_vectors_ [private] |
Definition at line 112 of file database.h.
std::vector<InvertedFile> vt::Database::word_files_ [private] |
Definition at line 110 of file database.h.
std::vector<float> vt::Database::word_weights_ [private] |
Definition at line 111 of file database.h.