FreeLing  4.0
freeling::dictionary Class Reference

The class dictionary implements dictionary search and suffix analysis for word forms.

#include <dictionary.h>


Public Member Functions

 dictionary (const std::wstring &Lang, const std::wstring &dicFile, const std::wstring &sufFile, const std::wstring &compFile, bool invDic=false, bool retok=true)
 Constructor.
 ~dictionary ()
 Destructor.
void add_analysis (const std::wstring &, const analysis &)
 add analysis to dictionary entry (create entry if not there)
void remove_entry (const std::wstring &)
 remove entry from dictionary
void set_retokenize_contractions (bool)
 customize behaviour of dictionary for further analysis: enable or disable retokenization of contractions
void set_affix_analysis (bool)
 customize behaviour of dictionary for further analysis: enable or disable affix analysis
void set_compound_analysis (bool)
 customize behaviour of dictionary for further analysis: enable or disable compound analysis
bool has_affixes () const
 find out whether the dictionary has loaded an affix module
bool has_compounds () const
 find out whether the dictionary has loaded a compounds module
void search_form (const std::wstring &, std::list< analysis > &) const
 Get dictionary entry for a given form, add to given list.
bool annotate_word (word &, std::list< word > &, bool override=false) const
 Fills the analysis list of a word, checking for suffixes and contractions.
void annotate_word (word &) const
 Fills the analysis list of a word, checking for suffixes and contractions.
std::list< std::wstring > get_forms (const std::wstring &, const std::wstring &) const
 Get possible forms for a lemma+pos.
void dump_dictionary (std::wostream &, bool keysonly=false) const
 dump dictionary to a buffer. Either full entries or keys only
void analyze (sentence &) const
 analyze given sentence

Private Member Functions

bool check_contracted (const std::wstring &, std::wstring, std::wstring, std::list< word > &) const
 check whether the word is a contraction, and if so, fill the list with the contracted words
std::list< std::wstring > tag_combinations (std::list< std::wstring >::const_iterator, std::list< std::wstring >::const_iterator) const
 Generate valid tag combinations for an ambiguous contraction.
bool parse_dict_entry (const std::wstring &, std::list< std::pair< std::wstring, std::list< std::wstring > > > &) const
 parse data string into a map lemma->list of tags
std::wstring compact_data (const std::list< std::pair< std::wstring, std::list< std::wstring > > > &) const
 compact data in format lemma1 pos1a|pos1b|pos1c lemma2 pos2a|pos2b to save memory

Private Attributes

bool InverseDict
 configuration options
bool RetokenizeContractions
 Analysis settings.
bool AffixAnalysis
bool CompoundAnalysis
affixes * suf
 suffix analyzer
compounds * comp
 compounds analyzer
database * morfodb
 key-value file or hash
database * inverdb

Detailed Description

The class dictionary implements dictionary search and suffix analysis for word forms.


Constructor & Destructor Documentation

freeling::dictionary::dictionary ( const std::wstring &  Lang,
const std::wstring &  dicFile,
const std::wstring &  sufFile,
const std::wstring &  compFile,
bool  invDic = false,
bool  retok = true 
)

Constructor.

freeling::dictionary::~dictionary ( )

Destructor.

Destroy dictionary module, close database.


Member Function Documentation

void freeling::dictionary::add_analysis ( const std::wstring &  form,
const analysis &  newan 
)

add analysis to dictionary entry (create entry if not there)

References freeling::analysis::get_lemma(), freeling::analysis::get_tag(), freeling::LEMMA_DIVIDER, list2wstring, freeling::TAG_DIVIDER, and wstring2list.

void freeling::dictionary::analyze ( sentence &  se) const [virtual]

analyze given sentence

Dictionary search and affix analysis for all words in a sentence, using given options.

Implements freeling::processor.

References int2wstring, freeling::sentence::rebuild_word_index(), TRACE, and TRACE_SENTENCE.

bool freeling::dictionary::annotate_word ( word &  ,
std::list< word > &  ,
bool  override = false 
) const

Fills the analysis list of a word, checking for suffixes and contractions.

Returns true iff the form is a contraction, in which case the contraction components are returned in the given list.

Referenced by freeling::compounds::check_compound().

void freeling::dictionary::annotate_word ( word &  ) const

Fills the analysis list of a word, checking for suffixes and contractions.

Searches the form in the dictionary and adds the analyses found to the given word. This overload never retokenizes contractions and does not return a component list. It is just a convenience equivalent to "annotate_word(w,dummy,true)".

bool freeling::dictionary::check_contracted ( const std::wstring &  ,
std::wstring  ,
std::wstring  ,
std::list< word > &   
) const [private]

check whether the word is a contraction, and if so, fill the list with the contracted words

Check whether the given word is a contraction. If so, obtain its composing words and store them into lw.

References freeling::word::add_analysis(), ERROR_CRASH, freeling::word::get_n_analysis(), list2wstring, TRACE, and wstring2list.

wstring freeling::dictionary::compact_data ( const std::list< std::pair< std::wstring, std::list< std::wstring > > > &  ) const [private]

compact data in format lemma1 pos1a|pos1b|pos1c lemma2 pos2a|pos2b to save memory

References freeling::LEMMA_DIVIDER, list2wstring, and freeling::TAG_DIVIDER.
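
The compact format described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the FreeLing implementation: `compact_sketch`, `lemma_tags`, `kLemmaDivider` and `kTagDivider` are hypothetical names, and the separator values (a space and `|`) are assumed from the format string in the description, not taken from freeling::LEMMA_DIVIDER or freeling::TAG_DIVIDER.

```cpp
#include <cassert>  // for assert() in quick checks
#include <list>
#include <string>
#include <utility>

// A lemma paired with its list of PoS tags, mirroring the parameter type
// of compact_data. The alias name is hypothetical.
using lemma_tags = std::pair<std::wstring, std::list<std::wstring>>;

// Assumed separator values; the real constants are not shown on this page.
const std::wstring kLemmaDivider = L" ";
const std::wstring kTagDivider = L"|";

// Join all (lemma, tags) pairs into "lemma1 pos1a|pos1b lemma2 pos2a".
std::wstring compact_sketch(const std::list<lemma_tags> &entries) {
  std::wstring out;
  for (const auto &entry : entries) {
    if (!out.empty()) out += kLemmaDivider;  // separates consecutive entries
    out += entry.first + kLemmaDivider;      // lemma, then its tag group
    bool first = true;
    for (const auto &tag : entry.second) {
      if (!first) out += kTagDivider;
      out += tag;
      first = false;
    }
  }
  return out;
}
```

Storing one string per form instead of a nested container is what saves memory: the structure is rebuilt only when an entry is actually looked up.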

void freeling::dictionary::dump_dictionary ( std::wostream &  buff,
bool  keysonly = false 
) const

Dump dictionary to a buffer, either full entries or keys only.

Referenced by freeling::compounds::compounds().

list< wstring > freeling::dictionary::get_forms ( const std::wstring &  ,
const std::wstring &   
) const

Get possible forms for a lemma+pos.

References WARNING, and wstring2list.
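
A lemma+pos lookup of this kind needs an index that runs in the opposite direction of the usual form lookup. The sketch below shows one way such an inverse index could be built and queried. It assumes, without confirmation from this page, that get_forms is served by the inverse dictionary enabled through the invDic constructor argument. All names (`toy_dictionary`, `inverse_index`, `build_inverse_sketch`, `get_forms_sketch`) are hypothetical and the in-memory maps stand in for the database members.

```cpp
#include <cassert>  // for assert() in quick checks
#include <list>
#include <map>
#include <string>
#include <utility>

// (lemma, tag) pair; hypothetical stand-in for one analysis.
using lemma_tag = std::pair<std::wstring, std::wstring>;
// Forward direction: form -> analyses.
using toy_dictionary = std::map<std::wstring, std::list<lemma_tag>>;
// Inverse direction: (lemma, tag) -> forms.
using inverse_index = std::map<lemma_tag, std::list<std::wstring>>;

// Walk the forward dictionary once and register each form under every
// (lemma, tag) pair it can be analyzed as.
inverse_index build_inverse_sketch(const toy_dictionary &dict) {
  inverse_index inv;
  for (const auto &entry : dict)
    for (const auto &lt : entry.second)
      inv[lt].push_back(entry.first);
  return inv;
}

// Return all forms recorded for the given lemma and tag, or an empty list.
std::list<std::wstring> get_forms_sketch(const inverse_index &inv,
                                         const std::wstring &lemma,
                                         const std::wstring &tag) {
  auto it = inv.find(std::make_pair(lemma, tag));
  if (it == inv.end()) return std::list<std::wstring>();
  return it->second;
}
```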

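bool freeling::dictionary::has_affixes ( ) const

find out whether the dictionary has loaded an affix module

bool freeling::dictionary::has_compounds ( ) const

find out whether the dictionary has loaded a compounds module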

bool freeling::dictionary::parse_dict_entry ( const std::wstring &  ,
std::list< std::pair< std::wstring, std::list< std::wstring > > > &   
) const [private]

parse data string into a map lemma->list of tags

References wstring2list.
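
Parsing is the inverse of compacting: the data string is split into alternating lemma and tag-group fields. The following sketch is only an illustration under stated assumptions. `split_sketch` and `parse_sketch` are hypothetical names, the separators (space between fields, `|` between tags) are assumed from the compact format description, and returning false on malformed input is a guess at what the boolean result means.

```cpp
#include <cassert>  // for assert() in quick checks
#include <list>
#include <sstream>
#include <string>
#include <utility>

using lemma_tags = std::pair<std::wstring, std::list<std::wstring>>;

// Split a string on a single separator character (hypothetical helper,
// playing the role that wstring2list has in the referenced code).
std::list<std::wstring> split_sketch(const std::wstring &s, wchar_t sep) {
  std::list<std::wstring> parts;
  std::wstring item;
  std::wistringstream in(s);
  while (std::getline(in, item, sep)) parts.push_back(item);
  return parts;
}

// Parse "lemma1 pos1a|pos1b lemma2 pos2a" into (lemma, tags) pairs.
// Returns false if the fields do not come in lemma/tag-group pairs.
bool parse_sketch(const std::wstring &data, std::list<lemma_tags> &out) {
  std::list<std::wstring> fields = split_sketch(data, L' ');
  if (fields.empty() || fields.size() % 2 != 0) return false;
  for (auto it = fields.begin(); it != fields.end(); ++it) {
    std::wstring lemma = *it;
    ++it;  // safe: the field count is even, so a tag group follows
    out.emplace_back(lemma, split_sketch(*it, L'|'));
  }
  return true;
}
```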

void freeling::dictionary::remove_entry ( const std::wstring &  form)

remove entry from dictionary

References list2wstring, and wstring2list.

void freeling::dictionary::search_form ( const std::wstring &  s,
std::list< analysis > &  la 
) const

Get dictionary entry for a given form, add to given list.

Searches the form in the dictionary according to the given options, and adds the analyses found to the given list.

References freeling::analysis::init(), int2wstring, freeling::LEMMA_DIVIDER, list2wstring, freeling::TAG_DIVIDER, TRACE, and wstring2list.

Referenced by freeling::affixes::CheckRetokenizable(), and freeling::affixes::SearchRootsList().
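
The point worth noting is that analyses are added to the caller's list, so anything already in it is kept. A toy model of that behaviour, assuming an in-memory map in place of the dictionary database, might look like this. `analysis_sketch`, `toy_dictionary` and `search_form_sketch` are hypothetical names and do not reflect the real freeling::analysis class.

```cpp
#include <cassert>  // for assert() in quick checks
#include <list>
#include <map>
#include <string>

// Hypothetical stand-in for freeling::analysis: just a lemma and a tag.
struct analysis_sketch {
  std::wstring lemma;
  std::wstring tag;
};

// Hypothetical in-memory dictionary: form -> analyses.
using toy_dictionary = std::map<std::wstring, std::list<analysis_sketch>>;

// Append the analyses stored for `form` to `la`. Existing elements of `la`
// are left untouched, and an unknown form leaves the list unchanged.
void search_form_sketch(const toy_dictionary &dict, const std::wstring &form,
                        std::list<analysis_sketch> &la) {
  auto it = dict.find(form);
  if (it == dict.end()) return;
  la.insert(la.end(), it->second.begin(), it->second.end());
}
```

Callers that want only the dictionary analyses would therefore pass an empty list.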

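void freeling::dictionary::set_affix_analysis ( bool  )

customize behaviour of dictionary for further analysis: enable or disable affix analysis

void freeling::dictionary::set_compound_analysis ( bool  )

customize behaviour of dictionary for further analysis: enable or disable compound analysis

void freeling::dictionary::set_retokenize_contractions ( bool  )

customize behaviour of dictionary for further analysis: enable or disable retokenization of contractions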

list< wstring > freeling::dictionary::tag_combinations ( std::list< std::wstring >::const_iterator  ,
std::list< std::wstring >::const_iterator   
) const [private]

Generate valid tag combinations for an ambiguous contraction.

References wstring2list.
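
When each component of a contraction may carry several alternative tags, the valid combinations amount to a Cartesian product over the components. The sketch below shows that idea on an iterator range like the one in the signature above. It is an assumption-laden illustration, not the FreeLing code: the names `split_sketch` and `tag_combinations_sketch` are hypothetical, and the separators (`/` between alternatives, `+` between components) are assumed for the example.

```cpp
#include <cassert>  // for assert() in quick checks
#include <iterator>
#include <list>
#include <sstream>
#include <string>

// Split a string on a single separator character (hypothetical helper).
std::list<std::wstring> split_sketch(const std::wstring &s, wchar_t sep) {
  std::list<std::wstring> parts;
  std::wstring item;
  std::wistringstream in(s);
  while (std::getline(in, item, sep)) parts.push_back(item);
  return parts;
}

// Each element in [p, last) holds the alternative tags of one component,
// e.g. "A/B". The result lists every way of picking one tag per component,
// joined with '+'.
std::list<std::wstring> tag_combinations_sketch(
    std::list<std::wstring>::const_iterator p,
    std::list<std::wstring>::const_iterator last) {
  std::list<std::wstring> output;
  if (p == last) return output;  // empty range: no combinations

  const bool is_last = (std::next(p) == last);
  std::list<std::wstring> alternatives = split_sketch(*p, L'/');
  // Combinations for the remaining components, computed recursively.
  std::list<std::wstring> rest =
      is_last ? std::list<std::wstring>()
              : tag_combinations_sketch(std::next(p), last);

  for (const auto &tag : alternatives) {
    if (is_last) {
      output.push_back(tag);
    } else {
      for (const auto &tail : rest) output.push_back(tag + L"+" + tail);
    }
  }
  return output;
}
```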


Member Data Documentation

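compounds * freeling::dictionary::comp [private]

compounds analyzer

bool freeling::dictionary::InverseDict [private]

configuration options

database * freeling::dictionary::morfodb [private]

key-value file or hash

bool freeling::dictionary::RetokenizeContractions [private]

Analysis settings.

affixes * freeling::dictionary::suf [private]

suffix analyzer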


The documentation for this class was generated from the following files: