FreeLing  4.0
freeling::probabilities Class Reference

Class probabilities sets lexical probabilities for each PoS tag of each word in a sentence.

#include <probabilities.h>



Public Member Functions

 probabilities (const std::wstring &, double)
 Constructor.
 ~probabilities ()
 Destructor.
void annotate_word (word &) const
 Assign probabilities for each analysis of given word.
void set_activate_guesser (bool)
 Turn guesser on/off.
void analyze (sentence &) const
 Assign probabilities to tags for each word in sentence.

Private Member Functions

void smoothing (word &) const
 Smooth probabilities for the analysis of given word.
double compute_probability (const std::wstring &, double, const std::wstring &) const
 Compute p(tag|suffix) using recursively shorter suffixes.
double guesser (word &, double) const
 Guess possible tags, keeping some mass for previously assigned tags.
bool less (const analysis &a1, const analysis &a2) const
 Compare two analyses to set the right order of preference.
void sort_list (std::list< analysis > &ls) const
 Sort given analysis list using lemma and PoS preferences.

Private Attributes

freeling::regexp RE_PunctNum
 Auxiliary regexps.
double ProbabilityThreshold
 Probability threshold for unknown words tags.
const tagset * Tags
 Tagset description, to compute short versions of tags.
double BiassSuffixes
 Interpolation factor to favor suffix probabilities versus ambiguity-class probabilities when smoothing known but unobserved words.
double LidstoneLambdaLexical
 lambda parameter for smoothing via Lidstone's Law
double LidstoneLambdaClass
bool activate_guesser
 whether to use guesser for unknown words.
std::map< std::wstring, double > single_tags
 unigram probabilities
std::map< std::wstring, std::map< std::wstring, double > > class_tags
 probabilities for usual ambiguity classes
std::map< std::wstring, std::map< std::wstring, double > > lexical_tags
 lexical probabilities for frequent words
std::map< std::wstring, double > unk_tags
 list of tags and probabilities to assign to unknown words
std::map< std::wstring, std::map< std::wstring, double > > unk_suffs
 list of tag frequencies for unknown word suffixes
double theeta
 Smoothing parameter for unknown-word suffixes.
std::wstring::size_type long_suff
 length of longest suffix
std::map< std::wstring, std::wstring > lemma_prefs
std::map< std::wstring, std::wstring > pos_prefs

Detailed Description

Class probabilities sets lexical probabilities for each PoS tag of each word in a sentence.


Constructor & Destructor Documentation

freeling::probabilities::probabilities ( const std::wstring &  probFile,
double  Threshold 
)

Constructor.

freeling::probabilities::~probabilities ( )

Destructor.

References Tags.


Member Function Documentation

void freeling::probabilities::analyze ( sentence &  se) const [virtual]

Assign probabilities to tags for each word in sentence.

Annotate probabilities for each analysis of each word in given sentence, using given options.

Implements freeling::processor.

References annotate_word(), and TRACE_SENTENCE.

double freeling::probabilities::compute_probability ( const std::wstring &  tag,
double  prob,
const std::wstring &  s 
) const [private]

Compute p(tag|suffix) using recursively shorter suffixes.

Compute probability of a tag given a word suffix.

References double2wstring, theeta, TRACE, and unk_suffs.

Referenced by guesser(), and smoothing().

double freeling::probabilities::guesser ( word &  w,
double  mass 
) const [private]

Guess possible tags, keeping some mass for previously assigned tags.

bool freeling::probabilities::less ( const analysis &  a1,
const analysis &  a2 
) const [private]

Compare two analyses to set the right order of preference.

References freeling::analysis::get_lemma(), freeling::analysis::get_prob(), freeling::analysis::get_tag(), lemma_prefs, and pos_prefs.

Referenced by sort_list().

void freeling::probabilities::set_activate_guesser ( bool )

Turn guesser on/off.

References activate_guesser.

void freeling::probabilities::smoothing ( word &  w) const [private]

Smooth probabilities for the analysis of given word.

If using backoff, combine with suffix information to get a better estimation.

References BiassSuffixes, class_tags, compute_probability(), freeling::word::get_form(), freeling::word::get_lc_form(), freeling::word::get_n_analysis(), freeling::tagset::get_short_tag(), lexical_tags, LidstoneLambdaClass, LidstoneLambdaLexical, single_tags, Tags, and TRACE.

Referenced by annotate_word().

void freeling::probabilities::sort_list ( std::list< analysis > &  ls) const [private]

Sort given analysis list using lemma and PoS preferences.

Bubble sort the given analysis list using the given preferences.

References less().

Referenced by annotate_word().
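
To illustrate the bubble sort mentioned above, here is a minimal, self-contained sketch that sorts a std::list with a caller-supplied ordering. The Analysis struct is a hypothetical stand-in for freeling::analysis, and the comparator is passed in explicitly; the real class uses its private less() member with lemma_prefs and pos_prefs, whose exact logic is not reproduced here.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-in for freeling::analysis.
struct Analysis {
  std::wstring lemma;
  std::wstring tag;
  double prob;
};

// Bubble sort: sweep the list, swapping adjacent elements that are out of
// order, and repeat until a full pass performs no swap.
// `before(a, b)` must return true when a should precede b.
void bubble_sort(
    std::list<Analysis> &ls,
    const std::function<bool(const Analysis &, const Analysis &)> &before) {
  if (ls.size() < 2) return;
  bool swapped = true;
  while (swapped) {
    swapped = false;
    auto cur = ls.begin();
    auto next = std::next(cur);
    while (next != ls.end()) {
      if (before(*next, *cur)) {
        std::swap(*cur, *next);  // swap the stored values, not the nodes
        swapped = true;
      }
      ++cur;
      ++next;
    }
  }
}
```

For example, calling bubble_sort with a comparator that returns `x.prob > y.prob` orders the analyses from most to least probable. Analysis lists are typically short, so the quadratic cost of bubble sort is unlikely to matter in this setting.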


Member Data Documentation

bool freeling::probabilities::activate_guesser [private]

whether to use guesser for unknown words.

Referenced by annotate_word(), probabilities(), and set_activate_guesser().

double freeling::probabilities::BiassSuffixes [private]

Interpolation factor to favor suffix probabilities versus ambiguity-class probabilities when smoothing known but unobserved words.

Referenced by probabilities(), and smoothing().

std::map<std::wstring,std::map<std::wstring,double> > freeling::probabilities::class_tags [private]

probabilities for usual ambiguity classes

Referenced by probabilities(), and smoothing().

std::map<std::wstring,std::wstring> freeling::probabilities::lemma_prefs [private]

Referenced by less(), and probabilities().

std::map<std::wstring,std::map<std::wstring,double> > freeling::probabilities::lexical_tags [private]

lexical probabilities for frequent words

Referenced by probabilities(), and smoothing().

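double freeling::probabilities::LidstoneLambdaClass [private]

Referenced by probabilities(), and smoothing().

double freeling::probabilities::LidstoneLambdaLexical [private]

lambda parameter for smoothing via Lidstone's Law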

Referenced by probabilities(), and smoothing().

std::wstring::size_type freeling::probabilities::long_suff [private]

length of longest suffix

Referenced by probabilities().

std::map<std::wstring,std::wstring> freeling::probabilities::pos_prefs [private]

Referenced by less(), and probabilities().

double freeling::probabilities::ProbabilityThreshold [private]

Probability threshold for unknown words tags.

Referenced by guesser(), and probabilities().

freeling::regexp freeling::probabilities::RE_PunctNum [private]

Auxiliary regexps.

Referenced by annotate_word().

std::map<std::wstring,double> freeling::probabilities::single_tags [private]

unigram probabilities

Referenced by probabilities(), and smoothing().

const tagset* freeling::probabilities::Tags [private]

Tagset description, to compute short versions of tags.

Referenced by guesser(), probabilities(), smoothing(), and ~probabilities().

double freeling::probabilities::theeta [private]

Smoothing parameter for unknown-word suffixes.

Referenced by compute_probability(), and probabilities().

std::map<std::wstring,std::map<std::wstring,double> > freeling::probabilities::unk_suffs [private]

list of tag frequencies for unknown word suffixes

Referenced by compute_probability(), and probabilities().

std::map<std::wstring,double> freeling::probabilities::unk_tags [private]

list of tags and probabilities to assign to unknown words

Referenced by guesser(), and probabilities().


The documentation for this class was generated from the following files: