Differences between FreeLing 3 and 4?

Submitted by koichi on Mon, 05/23/2016 - 11:41

We can see the new features of FreeLing 4 here:
http://nlp.lsi.upc.edu/freeling/node/27

But are there any other significant changes between the specifications of FreeLing 3 and 4? Can we use FreeLing 4 as a drop-in replacement for FreeLing 3?

Many Thanks,

There are significant changes, though the impact will depend on how you use FreeLing and/or its output.

If you call the library from your own program, the API has important changes. You will need to adapt some of your code and recompile.

Regarding the output produced by FreeLing, several tagsets have been changed, so some words may receive different tags than in 3.1 (e.g. possessive adjectives for most Latin languages, prepositions in some languages, etc.).
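One rough way to gauge the impact on your own data is to run the same text through both versions and count which tags changed. The Python sketch below is only an illustration under assumptions: it assumes each output line has the form, lemma and tag as its first three whitespace-separated columns, and that both versions tokenize the text identically. The function names are made up for this example and are not part of FreeLing.

```python
from collections import Counter

def read_tags(lines):
    """Return (form, tag) pairs from lines assumed to look like 'form lemma tag ...'."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:  # skips blank lines and anything with fewer than 3 columns
            pairs.append((fields[0], fields[2]))
    return pairs

def tag_changes(old_lines, new_lines):
    """Count (old_tag, new_tag) pairs for aligned tokens whose tag differs."""
    changes = Counter()
    # zip() pairs tokens by position, so this is only meaningful
    # when both outputs contain the same tokens in the same order.
    for (old_form, old_tag), (new_form, new_tag) in zip(
        read_tags(old_lines), read_tags(new_lines)
    ):
        if old_form == new_form and old_tag != new_tag:
            changes[(old_tag, new_tag)] += 1
    return changes
```

Calling tag_changes(open("out31.txt"), open("out40.txt")).most_common(20) (both file names are placeholders) would list the most frequent tag substitutions. If tokenization differs between the two versions, the positional alignment breaks and a proper sequence alignment would be needed instead.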

I mainly use “analyzer” for tokenization, morphological analysis and PoS tagging. Target languages are CA, EN, ES, FR, IT, PT, RU and DE.

So I have to closely check the “FreeLing Tagset Description” section of the manual.
https://talp-upc.gitbooks.io/freeling-user-manual/content/tagsets.html

If you have any more advice for people switching from FreeLing 3 to 4, please let us know.

Thanks!

Yes, it is a good idea to check the tagsets.

You can compare the file "tagset.dat" in each language data directory with the same file in 3.1 to get a fine-grained view of the changes.
File format is described in https://talp-upc.gitbooks.io/freeling-user-manual/content/modules/tagse…
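As a minimal sketch of that comparison, the following Python snippet produces a plain line-by-line diff of two "tagset.dat" files. It does not interpret the file format at all, and it assumes the files are UTF-8 text. The function name and the paths in the usage note are hypothetical; adjust them to wherever your two installations keep their data.

```python
import difflib
from pathlib import Path

def diff_tagsets(old_path, new_path):
    """Return unified-diff lines between two tagset.dat files (plain text diff)."""
    # Assumption: both files are UTF-8 encoded text.
    old = Path(old_path).read_text(encoding="utf-8").splitlines()
    new = Path(new_path).read_text(encoding="utf-8").splitlines()
    return list(
        difflib.unified_diff(
            old, new, fromfile=str(old_path), tofile=str(new_path), lineterm=""
        )
    )
```

For example, print("\n".join(diff_tagsets("freeling-3.1/data/es/tagset.dat", "freeling-4.0/data/es/tagset.dat"))) would show removed lines prefixed with "-" and added lines with "+". Running it once per language directory gives a per-language summary of what changed.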

If you use parsing, you may be interested in checking the results of the new statistical parsers. They are slower, but provide more robust analysis, especially for long sentences.