#include <dfa_min_hash.h>
Public Member Functions | |
void | clear () |
void | freeze () |
Get rid of the extra data used for maintaining the automaton minimality. | |
int | hash_value (const basic_string< char_type > &s) const |
Get the hash value for the word passed as a character string. | |
template<typename InputI > | |
int | hash_value (InputI first, InputI last) const |
Get the hash value for the word passed as a range of characters of type char_type . | |
bool | insert (const basic_string< char_type > &s) |
Add a word defined by a character string. | |
bool | insert (const basic_string< char_type > &s) |
Add a word defined by a character string. | |
template<class ForwardI > | |
bool | insert (ForwardI, ForwardI) |
Add a word defined by a range on a sequence of characters of type char_type . | |
unsigned int | size () const |
basic_string< char_type > | value_hash (int w) const |
Get the word for the hash value w as a character string. | |
template<typename OutputI > | |
int | value_hash (int w, OutputI out) const |
Get the word for the hash value w as a sequence of characters of type char_type written out through the output iterator out . | |
unsigned int | wc () const |
word count. |
The hashing is implemented with a dynamic minimal acyclic DFA and maps words to strictly positive integers in lexicographical order. This order is determined by the ordering defined on characters in the static method CharTraits::lt
. Therefore, the mapping is not stable with respect to the insertion: a new word will be associated to its position in the lexicographically-sorted word list, thus shifting by an offset of 1 all mappings beyond the insertion position. The set of words must then be fixed for this hashing function to be usable.
This class does not define the standard DFA interface but a restricted one that allows only word insertions and read-only operations thus enforcing the invariants for the structure integrity to be guaranteed.
Parameter | Description | Default | Requirements |
---|---|---|---|
CharTraits | Character traits describing the alphabet that the words are built with. | plain | CharTraits is a model of Alphabet |
void clear | ( | ) | [inherited] |
wc() == 0
void freeze | ( | ) | [inherited] |
Get rid of the extra data used for maintaining the automaton minimality.
Once frozen, words can still be added but the next insertion will take more time to rebuild the necessary data structure that had been disposed of.
int hash_value | ( | const basic_string< char_type > & | s | ) | const |
Get the hash value for the word passed as a character string.
References DFA_min_hash::hash_value().
int hash_value | ( | InputI | first, | |
InputI | last | |||
) | const |
Get the hash value for the word passed as a range of characters of type char_type
.
Referenced by DFA_min_hash::hash_value().
bool insert | ( | const basic_string< char_type > & | s | ) | [inherited] |
Add a word defined by a character string.
true
if the word has been actually added, false
if the word already exists and that the operation left the automaton unchanged. bool insert | ( | const basic_string< char_type > & | s | ) |
Add a word defined by a character string.
true
if the word has been actually added, false
if the word already exists and that the operation left the automaton unchanged. References DFA_min_hash::insert().
bool insert | ( | ForwardI | first, | |
ForwardI | last | |||
) |
Add a word defined by a range on a sequence of characters of type char_type
.
true
if the word has been actually added, false
if the word already exists and that the operation left the automaton unchanged. Reimplemented from DFA_min.
References stack_cursor::aim(), stack_cursor::backward(), stack_cursor::find(), stack_cursor::forward(), stack_cursor::letter(), stack_cursor::src(), and DFA_min::wc().
Referenced by DFA_min_hash::insert().
unsigned int size | ( | ) | const [inherited] |
wc()
basic_string<char_type> value_hash | ( | int | w | ) | const |
Get the word for the hash value w
as a character string.
w
is associated to. References DFA_min_hash::value_hash().
int value_hash | ( | int | w, | |
OutputI | out | |||
) | const |
Get the word for the hash value w
as a sequence of characters of type char_type
written out through the output iterator out
.
Referenced by DFA_min_hash::value_hash().
unsigned int wc | ( | ) | const [inherited] |
word count.
Referenced by DFA_min_hash::insert().