FreeLing  4.0
freeling::hmm_tagger Class Reference

The class hmm_tagger implements a part-of-speech tagger based on a hidden Markov model (HMM) whose states are tag bigrams.

#include <hmm_tagger.h>


Public Member Functions

 hmm_tagger (const std::wstring &, bool, unsigned int, unsigned int kb=1)
 Constructor.
 ~hmm_tagger ()
 Destructor.
void annotate (sentence &) const
 analyze given sentence
double SequenceProb_log (const sentence &, int k=0) const
 Given an *annotated* sentence, compute (log) probability of k-th best sequence according to HMM parameters.

Private Member Functions

bool is_forbidden (const std::wstring &, sentence::const_iterator) const
 check if a trigram is in the forbidden list.
double ProbA_log (const bigram &, const bigram &, sentence::const_iterator) const
 Compute transition log_probability from state_i to state_j, returning appropriate smoothed values if no evidence is available.
double ProbB_log (const bigram &, const word &) const
 Compute emission log_probability for observation obs from state_i.
double ProbPi_log (const bigram &) const
 Compute initial log_probability for state_i.
std::list< emission_states > FindStates (const sentence &) const
 compute possible emission states for each word in sentence.

Private Attributes

const tagset * Tags
std::map< std::wstring, double > PTag
 maps to store the probabilities
std::map< bigram, double > PBg
std::map< std::wstring, double > PTrg
std::map< bigram, double > PInitial
std::map< std::wstring, double > PWord
std::multimap< std::wstring, std::wstring > Forbidden
 set of hand-specified forbidden bigram and trigram transitions
double probInitial
double probUnobserved
safe_map< std::wstring, double > * pA_cache
 thread-safe probability cache, to speed up computations
unsigned int kbest
 number of best paths to compute
double c [3]
 coefficients to compute linear interpolation

Detailed Description

The class hmm_tagger implements a part-of-speech tagger based on a hidden Markov model (HMM) whose states are tag bigrams.


Constructor & Destructor Documentation

freeling::hmm_tagger::hmm_tagger ( const std::wstring &  hmmFile,
bool  rtk,
unsigned int  force,
unsigned int  kb = 1 
)

Constructor.

freeling::hmm_tagger::~hmm_tagger ( )

Destructor.

References pA_cache, and Tags.


Member Function Documentation

void freeling::hmm_tagger::annotate ( sentence & se ) const [virtual]

analyze given sentence

Disambiguate given sentences with provided options.

Implements freeling::POS_tagger.

References double2wstring, FindStates(), freeling::tagset::get_short_tag(), int2wstring, kbest, ProbA_log(), ProbB_log(), ProbPi_log(), Tags, TRACE, and freeling::trellis::ZERO_logprob.

std::list< emission_states > freeling::hmm_tagger::FindStates ( const sentence & sent ) const [private]

compute possible emission states for each word in sentence.

Obtain a list with the states that *may* have emitted the current observation (a sentence).
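
As an illustration of what "possible emission states" can mean when states are tag bigrams, the following is a minimal sketch, not FreeLing's implementation. It assumes that the candidate states for a word are all pairs (tag of previous word, tag of current word). The names Bigram, candidate_states and the start_tag marker are hypothetical.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a tag bigram (previous tag, current tag); hypothetical type.
using Bigram = std::pair<std::wstring, std::wstring>;

// For each word, build every (t1, t2) pair where t1 is a candidate tag of the
// previous word and t2 a candidate tag of the current word. The first word is
// paired with start_tag, an assumed sentence-start marker.
std::vector<std::set<Bigram>> candidate_states(
    const std::vector<std::vector<std::wstring>>& tags_per_word,
    const std::wstring& start_tag) {
  std::vector<std::set<Bigram>> result;
  std::vector<std::wstring> prev{start_tag};
  for (const auto& tags : tags_per_word) {
    std::set<Bigram> states;
    for (const auto& t1 : prev)
      for (const auto& t2 : tags)
        states.insert(Bigram(t1, t2));
    result.push_back(std::move(states));
    prev = tags;  // the current word's tags become the next word's t1 options
  }
  return result;
}
```

With this layout the number of states per word is bounded by the product of the ambiguity of two adjacent words, which keeps the search space small for mostly unambiguous text.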

References freeling::tagset::get_short_tag(), Tags, and TRACE.

Referenced by annotate().

bool freeling::hmm_tagger::is_forbidden ( const std::wstring &  ,
sentence::const_iterator   
) const [private]

check if a trigram is in the forbidden list.

References Forbidden, freeling::tagset::get_short_tag(), Tags, TRACE, vector2wstring, and wstring2vector.

Referenced by ProbA_log().

double freeling::hmm_tagger::ProbA_log ( const bigram &  state_i,
const bigram &  state_j,
sentence::const_iterator  w 
) const [private]

Compute transition log_probability from state_i to state_j, returning appropriate smoothed values if no evidence is available.

If the trigram is in the "forbidden" list, result is probability zero.
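
The smoothing mentioned above can be pictured as a linear interpolation of unigram, bigram and trigram estimates, weighted by three coefficients such as c[3]. The sketch below is illustrative only: the function name, the argument layout, and the assignment of each coefficient to an estimate are assumptions, not taken from the FreeLing sources.

```cpp
#include <array>
#include <cmath>
#include <limits>

// Hypothetical helper: interpolate three maximum-likelihood estimates
//   p_uni = P(t3), p_bi = P(t3 | t2), p_tri = P(t3 | t1, t2)
// and return the result as a log-probability.
// Which coefficient weights which estimate is an assumption.
double interpolated_transition_log(const std::array<double, 3>& c,
                                   double p_uni, double p_bi, double p_tri,
                                   bool forbidden) {
  // A forbidden transition has probability zero; -infinity stands in
  // for log(0) in this sketch.
  if (forbidden) return -std::numeric_limits<double>::infinity();
  const double p = c[0] * p_uni + c[1] * p_bi + c[2] * p_tri;
  return std::log(p);
}
```

Because the unigram and bigram terms still contribute when the trigram was never observed, an unseen trigram receives a small non-zero probability instead of blocking the whole path.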

References c, double2wstring, safe_map< T1, T2 >::find_safe(), safe_map< T1, T2 >::insert_safe(), is_forbidden(), pA_cache, PBg, PTag, PTrg, and TRACE.

Referenced by annotate(), and SequenceProb_log().

double freeling::hmm_tagger::ProbB_log ( const bigram &  state_i,
const word &  obs 
) const [private]

Compute emission log_probability for observation obs from state_i.

Pb = P(word|state) = P(state|word) * P(word) / P(state)

Since states are bigrams, s = t1.t2:

  • we approximate P(s) ~= P(t2)
  • we approximate P(s|w) ~= P(t2|w)

Thus: Pb ~= P(t2|w) * P(w) / P(t2)
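
The approximation above can be written directly in log space. The following is a minimal stand-alone sketch under stated assumptions: emission_log and its parameters are hypothetical, P(t2|w) is supplied by the caller, and falling back to a fixed p_unobserved value for words missing from the table is an assumption about how a value like probUnobserved might be used.

```cpp
#include <cmath>
#include <map>
#include <string>

// Hypothetical helper computing log( P(t2|w) * P(w) / P(t2) ).
// p_tag maps a tag to P(tag); p_word maps a lowercased form to P(word).
double emission_log(const std::map<std::wstring, double>& p_tag,
                    const std::map<std::wstring, double>& p_word,
                    double p_unobserved,
                    const std::wstring& word_lc,
                    const std::wstring& t2,
                    double p_tag_given_word) {
  // Assumed fallback: words absent from the table get p_unobserved.
  const auto w = p_word.find(word_lc);
  const double pw = (w != p_word.end()) ? w->second : p_unobserved;
  // at() throws std::out_of_range if the tag is not in the table.
  const double pt = p_tag.at(t2);
  return std::log(p_tag_given_word) + std::log(pw) - std::log(pt);
}
```

Working with sums of logarithms rather than products of probabilities avoids underflow when many small factors are combined along a sentence.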

References double2wstring, freeling::word::get_lc_form(), freeling::tagset::get_short_tag(), probUnobserved, PTag, PWord, Tags, and TRACE.

Referenced by annotate(), and SequenceProb_log().

double freeling::hmm_tagger::ProbPi_log ( const bigram &  state_i ) const [private]

Compute initial log_probability for state_i.

References PInitial, probInitial, and freeling::trellis::ZERO_logprob.

Referenced by annotate(), and SequenceProb_log().

double freeling::hmm_tagger::SequenceProb_log ( const sentence &  se,
int  k = 0 
) const

Given an *annotated* sentence, compute (log) probability of k-th best sequence according to HMM parameters.

References freeling::tagset::get_short_tag(), ProbA_log(), ProbB_log(), ProbPi_log(), and Tags.


Member Data Documentation

double freeling::hmm_tagger::c[3] [private]

coefficients to compute linear interpolation

Referenced by hmm_tagger(), and ProbA_log().

std::multimap<std::wstring, std::wstring> freeling::hmm_tagger::Forbidden [private]

set of hand-specified forbidden bigram and trigram transitions

Referenced by hmm_tagger(), and is_forbidden().

unsigned int freeling::hmm_tagger::kbest [private]

number of best paths to compute

Referenced by annotate(), and hmm_tagger().

safe_map<std::wstring,double>* freeling::hmm_tagger::pA_cache [private]

thread-safe probability cache, to speed up computations
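
A thread-safe cache of this kind is commonly built as a map guarded by a mutex. The sketch below shows the general pattern only. The class locked_map and the helper cached are hypothetical; the method names echo find_safe() and insert_safe(), but their signatures here are assumptions and may differ from safe_map.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical mutex-guarded map, for illustration only.
template <class K, class V>
class locked_map {
 public:
  // Copies the cached value into out and returns true if key is present.
  bool find_safe(const K& key, V& out) const {
    std::lock_guard<std::mutex> lock(mtx_);
    const auto it = data_.find(key);
    if (it == data_.end()) return false;
    out = it->second;
    return true;
  }
  // Stores value under key; an existing entry is left unchanged.
  void insert_safe(const K& key, const V& value) {
    std::lock_guard<std::mutex> lock(mtx_);
    data_.insert({key, value});
  }

 private:
  mutable std::mutex mtx_;
  std::map<K, V> data_;
};

// Look up key in the cache; on a miss, compute the value and remember it.
template <class F>
double cached(locked_map<std::wstring, double>& cache,
              const std::wstring& key, F compute) {
  double v = 0.0;
  if (cache.find_safe(key, v)) return v;
  v = compute();
  cache.insert_safe(key, v);
  return v;
}
```

Two threads that miss on the same key may both compute the value, which is harmless when the computation is deterministic, as a transition probability is.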

Referenced by hmm_tagger(), ProbA_log(), and ~hmm_tagger().

std::map<bigram, double> freeling::hmm_tagger::PBg [private]

Referenced by hmm_tagger(), and ProbA_log().

std::map<bigram, double> freeling::hmm_tagger::PInitial [private]

Referenced by hmm_tagger(), and ProbPi_log().

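double freeling::hmm_tagger::probInitial [private]

Referenced by hmm_tagger(), and ProbPi_log().

double freeling::hmm_tagger::probUnobserved [private]

Referenced by hmm_tagger(), and ProbB_log().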

std::map<std::wstring, double> freeling::hmm_tagger::PTag [private]

maps to store the probabilities

Referenced by hmm_tagger(), ProbA_log(), and ProbB_log().

std::map<std::wstring, double> freeling::hmm_tagger::PTrg [private]

Referenced by hmm_tagger(), and ProbA_log().

std::map<std::wstring, double> freeling::hmm_tagger::PWord [private]

Referenced by hmm_tagger(), and ProbB_log().


The documentation for this class was generated from the following files: