Right way to iterate through multiple analyses with java API

Submitted by Alexander G on Tue, 10/16/2018 - 13:55

Good morning,

I'm using Freeling to parse certain languages with the Java API (as a test, to compare it with other POS taggers). I don't have final results for my tests yet, but Freeling is a great library for sure :)

I've run into a problem while iterating over the analyses of ambiguous words. E.g., if we look at this code: is it OK to call w.getLemma(n) + " " + w.getTag(n) for a word with n analyses? In practice, I get std::bad_alloc after calling w.getLemma(1), even if (w.getNAnalysis() > 0).

The only way I've found is pretty ugly:

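ListAnalysis anaList = word.getAnalysis();
while (anaList.size() > 0) {
    // Take the last remaining analysis and read its PoS tag.
    Analysis analysis = anaList.getLast();
    String posMarkup = analysis.getTag();
    // parse markup here
    // ...then remove that last element so the loop terminates
}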

So: while(size()) + getLast/First + clearLast/First, really? No, no :)

I wouldn't have this "problem" in C++, of course, but I have to use Java for now. Is the Java API missing ListAnalysisIterator? Or am I missing something?

Thanks in advance for the answer.

To traverse the list of possible analyses, you need to use an iterator.
ListAnalysisIterator should be available in the API.
You can inspect the generated API in APIs/java/edu/upc/Jfreeling in your build directory.

There is no direct access to the n-th analysis of a word (I can't think of a use case for that).

The analyses are stored in a std::list. If you want the last element, you can use the method list::back(). SWIG takes care of including this method in the generated Java API. In principle, SWIG wraps all the methods provided by std::list.
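
As a rough illustration of the difference between draining the list and traversing it with an iterator, here is a minimal sketch in plain Java. It rests on assumptions: java.util.LinkedList stands in for the SWIG-wrapped std::list, and the AnalysisTraversal and Analysis classes and the sample lemma/tag pairs are hypothetical stand-ins, not Freeling's generated classes. Whether ListAnalysisIterator offers the same hasNext()/next() shape depends on the API you actually built.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class AnalysisTraversal {
    // Hypothetical stand-in for one word analysis; NOT Freeling's Analysis class.
    public static final class Analysis {
        public final String lemma;
        public final String tag;

        public Analysis(String lemma, String tag) {
            this.lemma = lemma;
            this.tag = tag;
        }
    }

    // Read-only traversal: visits every analysis and leaves the list intact.
    public static String describe(List<Analysis> analyses) {
        StringBuilder out = new StringBuilder();
        Iterator<Analysis> it = analyses.iterator();
        while (it.hasNext()) {
            Analysis a = it.next();
            if (out.length() > 0) {
                out.append(", ");
            }
            out.append(a.lemma).append('/').append(a.tag);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative data only: two readings of an ambiguous word form.
        LinkedList<Analysis> analyses = new LinkedList<>();
        analyses.add(new Analysis("casa", "NCFS000"));
        analyses.add(new Analysis("casar", "VMIP3S0"));

        System.out.println(describe(analyses));
        // The list still holds both analyses after the traversal.
        System.out.println(analyses.size());
        // getLast() plays the role of list::back() here; it does not remove anything.
        System.out.println(analyses.getLast().tag);
    }
}
```

Unlike the while(size()) loop, nothing is removed, so the word's analyses can still be read again afterwards.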

Important:  get_lemma(n) and get_tag(n) do NOT return the n-th lemma (or tag) for this word.
They return the lemma (or tag) for the word analysis selected by the tagger in the n-th best sequence. 
It only makes sense to use those methods after running the HMM PoS tagger, and only if it has been instantiated with the option to produce the n most likely tag sequences.
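
To make that distinction concrete, here is a toy model in plain Java. It is only a sketch of the semantics described above, not Freeling's implementation: the NBestSelection, ToyWord and Analysis classes, the method names and the sample data are all hypothetical. The idea is that each ranked tag sequence k picks one of the word's analyses, and the argument k indexes the sequence, not the analysis list.

```java
import java.util.Arrays;
import java.util.List;

public class NBestSelection {
    // Hypothetical stand-in for one word analysis; NOT Freeling's Analysis class.
    public static final class Analysis {
        public final String lemma;
        public final String tag;

        public Analysis(String lemma, String tag) {
            this.lemma = lemma;
            this.tag = tag;
        }
    }

    // Toy word: its candidate analyses plus, for each ranked tag sequence k,
    // the index of the analysis chosen in that sequence.
    public static final class ToyWord {
        private final List<Analysis> analyses;
        private final int[] selectedInSequence;

        public ToyWord(List<Analysis> analyses, int[] selectedInSequence) {
            this.analyses = analyses;
            this.selectedInSequence = selectedInSequence.clone();
        }

        // Lemma of the analysis chosen in the k-th best sequence.
        public String lemmaInSequence(int k) {
            return analyses.get(selectedIndex(k)).lemma;
        }

        // Tag of the analysis chosen in the k-th best sequence.
        public String tagInSequence(int k) {
            return analyses.get(selectedIndex(k)).tag;
        }

        // Rejects a k for which no sequence was produced.
        private int selectedIndex(int k) {
            if (k < 0 || k >= selectedInSequence.length) {
                throw new IllegalArgumentException("no tag sequence ranked " + k);
            }
            return selectedInSequence[k];
        }
    }

    public static void main(String[] args) {
        List<Analysis> analyses = Arrays.asList(
                new Analysis("casa", "NCFS000"),
                new Analysis("casar", "VMIP3S0"));
        // Assumption for the example: the best sequence picks the verb reading,
        // the second best picks the noun reading.
        ToyWord w = new ToyWord(analyses, new int[] {1, 0});

        System.out.println(w.lemmaInSequence(0) + " " + w.tagInSequence(0));
        System.out.println(w.lemmaInSequence(1) + " " + w.tagInSequence(1));
    }
}
```

In this toy, asking for sequence 0 returns the second analysis in the list, and asking for a sequence that was never produced is rejected outright. That matches the point above: a value of n is only meaningful if the tagger actually produced that many sequences.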

Thanks a lot for the answer!

Currently there's no ListAnalysisIterator in my APIs/java/edu/upc/Jfreeling folder:

$ grep -r ListAnalysisIterator APIs/java/edu/upc/freeling/|wc -l

Maybe I should check other versions of Freeling... or add the iterator myself :)

I assumed you were using the latest version (4.1) and that you built the API with the appropriate cmake option. If you have an older one, it may be that the API is less complete.

See the user manual for more details: