FreeLing 4.0
The class np implements a simple proper noun recognizer.
#include <np.h>
Public Member Functions
np (const std::wstring &)
Constructor.
Private Member Functions
int ComputeToken (int, sentence::iterator &, sentence &) const
Compute the right token code for word j from given state.
void ResetActions (ner_status *) const
Reset flag about capitalized noun at sentence start.
void StateActions (int, int, int, sentence::const_iterator, ner_status *) const
Perform the necessary actions in "state", reached from state "origin" via word j interpreted as code "token". Basically, set the flag about a capitalized noun at sentence start.
void SetMultiwordAnalysis (sentence::iterator, int, const ner_status *) const
Set the appropriate lemma and tag for the new multiword.
Private Attributes
std::set< std::wstring > func
set of function words
std::set< std::wstring > punct
set of special punctuation tags
std::set< std::wstring > names
set of words to be considered possible NPs at sentence beginning
std::map< std::wstring, int > ignore_tags
set of words/tags to be ignored as NE parts, even if they are capitalized
std::map< std::wstring, int > ignore_words
std::set< std::wstring > prefixes
sets of NE affixes
std::set< std::wstring > suffixes
freeling::regexp RE_NounAdj
freeling::regexp RE_Closed
freeling::regexp RE_DateNumPunct
The class np implements a simple proper noun recognizer.
freeling::np::np (const std::wstring & npFile)
Constructor.
Create a proper noun recognizer.
References freeling::config_file::add_section(), freeling::config_file::close(), ERROR_CRASH, freeling::automat< ner_status >::Final, func, freeling::config_file::get_content_line(), freeling::config_file::get_section(), ignore_tags, ignore_words, freeling::automat< ner_status >::initialState, freeling::util::lowercase(), MAX_STATES, MAX_TOKENS, names, freeling::config_file::open(), prefixes, punct, RE_Closed, RE_DateNumPunct, freeling::util::RE_is_capitalized, RE_NounAdj, ST_FUN, ST_IN, ST_NP, ST_PREF, ST_STOP, ST_SUF, freeling::automat< ner_status >::stopState, suffixes, TK_mFun, TK_mPref, TK_mSuf, TK_mUpper, TK_sNounUpp, TK_sUnkUpp, TRACE, freeling::automat< ner_status >::trans, and WARNING.
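The reference list above suggests that the constructor fills an automaton transition table (`trans`) indexed by state and token code. The following is a minimal, self-contained sketch of that general technique, not FreeLing's actual tables: the reduced state and token sets, the transition values, and the helper names `classify` and `longest_np` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified states and token codes. The real tables are built
// inside the np constructor and are not shown in this reference page.
enum State { ST_IN, ST_NP, ST_FUN, ST_STOP, NUM_STATES };
enum Token { TK_UPPER, TK_FUN, TK_OTHER, NUM_TOKENS };

// Assumed transition table: trans[state][token] gives the next state.
static const State trans[NUM_STATES][NUM_TOKENS] = {
    /* ST_IN   */ {ST_NP, ST_STOP, ST_STOP},
    /* ST_NP   */ {ST_NP, ST_FUN, ST_STOP},
    /* ST_FUN  */ {ST_NP, ST_FUN, ST_STOP},
    /* ST_STOP */ {ST_STOP, ST_STOP, ST_STOP},
};

// Classify a word: ASCII-capitalized -> TK_UPPER, listed function word -> TK_FUN.
Token classify(const std::wstring &w, const std::set<std::wstring> &func) {
    if (!w.empty() && w[0] >= L'A' && w[0] <= L'Z') return TK_UPPER;
    if (func.count(w)) return TK_FUN;
    return TK_OTHER;
}

// Length in words of the longest match starting at `start` that ends in the
// final state ST_NP, or 0 if no match starts there.
std::size_t longest_np(const std::vector<std::wstring> &words,
                       std::size_t start,
                       const std::set<std::wstring> &func) {
    assert(start <= words.size());
    State st = ST_IN;
    std::size_t best = 0;
    for (std::size_t i = start; i < words.size() && st != ST_STOP; ++i) {
        st = trans[st][classify(words[i], func)];
        if (st == ST_NP) best = i - start + 1;  // remember last final state
    }
    return best;
}
```

In this sketch a function word may sit inside a match but can never end it, because only `ST_NP` counts as final.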
int freeling::np::ComputeToken (int state, sentence::iterator & j, sentence & se) const [private, virtual]
Compute the right token code for word j from given state.
Reimplemented from freeling::ner_module.
References ignore_tags, ignore_words, punct, and TK_other.
void freeling::np::ResetActions (ner_status * st) const [private, virtual]
Reset flag about capitalized noun at sentence start.
Reimplemented from freeling::ner_module.
References freeling::ner_status::initialNoun.
void freeling::np::SetMultiwordAnalysis (sentence::iterator i, int fstate, const ner_status * st) const [private, virtual]
Set the appropriate lemma and tag for the new multiword.
Reimplemented from freeling::ner_module.
References freeling::ner_status::initialNoun, and TRACE.
void freeling::np::StateActions (int origin, int state, int token, sentence::const_iterator j, ner_status * st) const [private, virtual]
Perform the necessary actions in "state", reached from state "origin" via word j interpreted as code "token". Basically, set the flag about a capitalized noun at sentence start.
Reimplemented from freeling::ner_module.
References freeling::ner_status::initialNoun, int2wstring, ST_NP, TK_sNounUpp, and TRACE.
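As a rough illustration of how a per-sentence status flag can be maintained by such callbacks, here is a minimal sketch. It is an assumption based only on the names referenced above (`initialNoun`, `ST_NP`, `TK_sNounUpp`): the `Status` struct, the numeric codes, and the exact update rule are hypothetical and may differ from the FreeLing source.

```cpp
#include <cassert>

namespace np_sketch {

// Hypothetical numeric codes; the real constants are defined elsewhere in FreeLing.
enum { ST_IN = 1, ST_NP = 2 };
enum { TK_sNounUpp = 1, TK_mUpper = 2 };

// Stand-in for ner_status, reduced to the one flag discussed here.
struct Status {
    bool initialNoun = false;
};

// Clear the flag before a new match attempt.
void reset_actions(Status *st) {
    assert(st != nullptr);
    st->initialNoun = false;
}

// Assumed rule: on entering ST_NP, the flag records whether the word just
// consumed was a capitalized known noun at sentence start.
void state_actions(int origin, int state, int token, Status *st) {
    assert(st != nullptr);
    (void)origin;  // unused in this sketch
    if (state == ST_NP) st->initialNoun = (token == TK_sNounUpp);
}

}  // namespace np_sketch
```

Under this assumed rule the flag stays set only while the match consists of the single sentence-initial noun; any further capitalized word that keeps the automaton in `ST_NP` clears it.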
std::set<std::wstring> freeling::np::func [private]
set of function words
Referenced by np().
std::map<std::wstring,int> freeling::np::ignore_tags [private]
set of words/tags to be ignored as NE parts, even if they are capitalized
Referenced by ComputeToken(), and np().
std::map<std::wstring,int> freeling::np::ignore_words [private]
Referenced by ComputeToken(), and np().
std::set<std::wstring> freeling::np::names [private]
set of words to be considered possible NPs at sentence beginning
Referenced by np().
std::set<std::wstring> freeling::np::prefixes [private]
sets of NE affixes
Referenced by np().
std::set<std::wstring> freeling::np::punct [private]
set of special punctuation tags
Referenced by ComputeToken(), and np().
freeling::regexp freeling::np::RE_Closed [private]
Referenced by np().
freeling::regexp freeling::np::RE_DateNumPunct [private]
Referenced by np().
freeling::regexp freeling::np::RE_NounAdj [private]
Referenced by np().
std::set<std::wstring> freeling::np::suffixes [private]
Referenced by np().