FreeLing  4.0
freeling::splitter Class Reference

Class splitter implements a sentence splitter, which accumulates lists of words until a sentence is completed, and then returns a list of sentence objects.

#include <splitter.h>

Classes

class  splitter_status

Public Types

typedef splitter_status * session_id

Public Member Functions

 splitter (const std::wstring &splitfile)
 Constructor, given option file.
 ~splitter ()
 Destructor.
session_id open_session () const
 Open a splitting session and get its session id.
void close_session (session_id ses) const
 Close a splitting session.
void split (session_id ses, const std::list< word > &lw, bool flush, std::list< sentence > &ls) const
 Add given list of words to the buffer, and put the complete sentences that can be built into ls.
std::list< sentence > split (session_id ses, const std::list< word > &ls, bool flush) const
 Same as the previous method, but the resulting sentences are returned.
void split (const std::list< word > &lw, std::list< sentence > &ls) const
 Sessionless splitting.
std::list< sentence > split (const std::list< word > &ls) const
 Sessionless splitting, return a copy of the sentences.

Private Member Functions

bool end_of_sentence (std::list< word >::const_iterator, const std::list< word > &) const
 check for sentence markers

Private Attributes

bool SPLIT_AllowBetweenMarkers
 configuration options
int SPLIT_MaxWords
std::set< std::wstring > starters
 Sentence delimiters.
std::map< std::wstring, bool > enders
std::map< std::wstring, int > markers
 Open-close marker pairs (parenthesis, etc)

Detailed Description

Class splitter implements a sentence splitter, which accumulates lists of words until a sentence is completed, and then returns a list of sentence objects.


Member Typedef Documentation


Constructor & Destructor Documentation

freeling::splitter::splitter ( const std::wstring &  splitfile)

Constructor, given option file.

freeling::splitter::~splitter ( )

Destructor.


Member Function Documentation

void freeling::splitter::close_session ( session_id  ses) const

Close a splitting session.

Close a session.

References int2wstring, and TRACE.

bool freeling::splitter::end_of_sentence ( std::list< word >::const_iterator  ,
const std::list< word > &   
) const [private]

check for sentence markers

Check whether a word is a sentence end (e.g., a dot followed by a capitalized word).
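The check described above can be illustrated with a minimal sketch. It is not the FreeLing implementation: Word is a hypothetical stand-in for freeling::word, toy_end_of_sentence is an invented name, and the hard-coded set of enders is an assumption (the real splitter is configured from its option file).

```cpp
#include <cassert>
#include <cwctype>
#include <iterator>
#include <list>
#include <set>
#include <string>

// Hypothetical stand-in for freeling::word: a word is just its form.
using Word = std::wstring;

// Toy check: a word ends a sentence when it is a known ender and the word
// after it, if any, starts with an uppercase letter.
bool toy_end_of_sentence(std::list<Word>::const_iterator w,
                         const std::list<Word> &words) {
  assert(w != words.end());  // precondition: w points at a word in `words`
  static const std::set<Word> enders = {L".", L"?", L"!"};  // assumed set
  if (enders.count(*w) == 0) return false;

  auto next = std::next(w);
  if (next == words.end()) return true;  // ender is the last word of the list
  return !next->empty() && std::iswupper(next->front()) != 0;
}
```

In this sketch a dot inside "e . g" is not a sentence end, because the word that follows it is lowercase.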

session_id freeling::splitter::open_session ( ) const

Open a splitting session and get its session id.

Open a session, and create a copy of the internal status for it.

Sessions are needed in case the same splitter is used to split different files simultaneously (either by the same thread or by different threads).

References freeling::splitter::splitter_status::betweenMrk, int2wstring, freeling::splitter::splitter_status::no_split_count, freeling::splitter::splitter_status::nsentence, and TRACE.
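The session idea can be sketched as follows. ToySplitter is a hypothetical stand-in that only mirrors the open_session / close_session shape documented here. The status field names are taken from the References line above, but their types, the buffer member, and add_word are assumptions of this sketch.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in, NOT the FreeLing class.
class ToySplitter {
 public:
  // Per-session state. Field types and the buffer are assumed.
  struct Status {
    bool betweenMrk = false;
    int no_split_count = 0;
    int nsentence = 0;
    std::list<std::wstring> buffer;
  };
  typedef Status *session_id;

  // Each caller gets its own freshly initialised status object.
  session_id open_session() const { return new Status(); }

  // Releases the status created by open_session().
  void close_session(session_id ses) const { delete ses; }

  // const method: all mutable state lives in the session, not the splitter.
  void add_word(session_id ses, const std::wstring &w) const {
    assert(ses != nullptr);
    ses->buffer.push_back(w);
  }
};
```

Because every method is const and the mutable state lives in the session object, two sessions opened on the same ToySplitter can be fed interleaved input without mixing their buffers.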

void freeling::splitter::split ( session_id  ses,
const std::list< word > &  lw,
bool  flush,
std::list< sentence > &  ls 
) const

Add given list of words to the buffer, and put the complete sentences that can be built into ls.

The flush parameter states whether a buffer flush is forced (true) or some words may remain in the buffer (false) when the splitter needs to wait to see what comes next. Each thread using the same splitter needs to open a new session.
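The effect of the flush argument can be shown with a toy buffer. This is a sketch under stated assumptions, not FreeLing code: Word and Sentence are hypothetical stand-ins, toy_split is an invented name, the buffer is passed explicitly instead of living in a session, and the only rule is that "." closes a sentence when the next word is capitalized.

```cpp
#include <cassert>
#include <cwctype>
#include <iterator>
#include <list>
#include <string>

using Word = std::wstring;         // stand-in for freeling::word
using Sentence = std::list<Word>;  // stand-in for freeling::sentence

// Appends lw to the buffer and moves every sentence that can be decided
// into ls. A trailing "." cannot be decided until the next word is seen,
// so it stays buffered unless flush is true.
void toy_split(std::list<Word> &buffer, const std::list<Word> &lw, bool flush,
               std::list<Sentence> &ls) {
  buffer.insert(buffer.end(), lw.begin(), lw.end());

  auto start = buffer.begin();  // first word of the sentence being built
  for (auto it = buffer.begin(); it != buffer.end(); ++it) {
    if (*it != L".") continue;
    auto next = std::next(it);
    if (next == buffer.end()) break;  // must wait to see what comes next
    if (!next->empty() && std::iswupper(next->front()) != 0) {
      ls.emplace_back(start, next);   // words in [start, next)
      start = next;
    }
  }
  buffer.erase(buffer.begin(), start);  // emitted words leave the buffer

  if (flush && !buffer.empty()) {       // forced flush: emit the remainder
    ls.emplace_back(buffer.begin(), buffer.end());
    buffer.clear();
  }
  assert(!flush || buffer.empty());     // postcondition of a flush
}
```

Calling toy_split with flush set to false on a chunk that ends in "." leaves those words in the buffer. A later call with flush set to true emits them as a final sentence.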

std::list<sentence> freeling::splitter::split ( session_id  ses,
const std::list< word > &  ls,
bool  flush 
) const

Same as the previous method, but the resulting sentences are returned.

void freeling::splitter::split ( const std::list< word > &  lw,
std::list< sentence > &  ls 
) const

Sessionless splitting.

Fills the given list<sentence>.

std::list< sentence > freeling::splitter::split ( const std::list< word > &  ls) const

Sessionless splitting, return a copy of the sentences.
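One way to read the sessionless variants is as a one-shot session with a forced flush. The sketch below assumes that reading; this page does not confirm it. FakeSplitter and split_sessionless are hypothetical names, and the fake treats every "." as a sentence end.

```cpp
#include <cassert>
#include <list>
#include <string>

// Minimal fake exposing the session API shape from this page. NOT FreeLing
// code: the session is just the list of pending words.
struct FakeSplitter {
  typedef std::list<std::wstring> *session_id;
  session_id open_session() const { return new std::list<std::wstring>(); }
  void close_session(session_id ses) const { delete ses; }
  void split(session_id ses, const std::list<std::wstring> &lw, bool flush,
             std::list<std::list<std::wstring>> &ls) const {
    for (const auto &w : lw) {
      ses->push_back(w);
      if (w == L".") { ls.push_back(*ses); ses->clear(); }
    }
    if (flush && !ses->empty()) { ls.push_back(*ses); ses->clear(); }
  }
};

// Sessionless call expressed through the session API (assumed equivalence):
// open a private session, split with flush=true, close the session.
template <class Splitter, class Words, class Sentences>
void split_sessionless(const Splitter &sp, const Words &lw, Sentences &ls) {
  typename Splitter::session_id ses = sp.open_session();
  assert(ses != nullptr);
  sp.split(ses, lw, true, ls);  // flush: nothing may stay buffered
  sp.close_session(ses);
}
```

Under this reading, a sessionless call suits input that is already complete, since no words can be carried over to a later call.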


Member Data Documentation

std::map<std::wstring,bool> freeling::splitter::enders [private]

std::map<std::wstring,int> freeling::splitter::markers [private]

Open-close marker pairs (parenthesis, etc)

bool freeling::splitter::SPLIT_AllowBetweenMarkers [private]

configuration options

int freeling::splitter::SPLIT_MaxWords [private]

std::set<std::wstring> freeling::splitter::starters [private]

Sentence delimiters.


The documentation for this class was generated from the following files: