Freeling tagset, some values are not documented

Submitted by David on Fri, 10/13/2017 - 13:37

Hello,
We have found that for some French documents we get the tag PPSCNN0, which is not documented,
since the value 'S' does not exist for 'Person' in the category 'Pronoun' (P) according to
https://talp-upc.gitbooks.io/freeling-user-manual/content/tagsets/tagse…

The same happens for the tag DF0CN0: 'F' is not among the possible type values for a
determiner.

Aside from those problems, where can I see the complete set of
possible tags for a language?
Is this documented somewhere?
Regards.

Those are errors in the dictionary. I fixed them in the repository.

Regarding the list of possible tags, it is not documented because it is huge (about 300 tags).

However, the documentation provides the information needed to generate them all: it is itself generated from the tagset description files (e.g., https://github.com/TALP-UPC/FreeLing/blob/master/data/fr/tagset.dat). Details on these files can be found in the documentation: https://talp-upc.gitbooks.io/freeling-user-manual/content/modules/tagset.html

Alternatively, you can extract all tags present in the dictionary with a simple shell command:
cat data/fr/dictionary/entries/* | cut -d' ' -f3 | sort | uniq
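
Note that the command above takes only the third field of each line. If an entry line can carry several analyses (assuming the layout "form lemma1 tag1 lemma2 tag2 ...", which you should check against your own dictionary files), any tag that only ever appears as a second or later analysis would be missed. A minimal Python sketch that collects every tag under that assumption follows; the function names and the sample lines are made up for illustration and are not part of FreeLing.

```python
import glob
from typing import Iterable, Set


def collect_tags(lines: Iterable[str]) -> Set[str]:
    """Collect all tags from dictionary lines.

    Assumption: each line is 'form lemma1 tag1 lemma2 tag2 ...'.
    """
    tags: Set[str] = set()
    for line in lines:
        fields = line.split()
        # fields[0] is the word form; the remaining fields are
        # (lemma, tag) pairs, so tags sit at indices 2, 4, 6, ...
        tags.update(fields[2::2])
    return tags


def collect_tags_from_files(pattern: str) -> Set[str]:
    """Apply collect_tags to every file matching a glob pattern."""
    tags: Set[str] = set()
    for path in glob.glob(pattern):
        with open(path, encoding="utf-8") as handle:
            tags |= collect_tags(handle)
    return tags


# Hypothetical sample lines, only to show the parsing.
sample = [
    "form1 lemmaA TAG1 lemmaB TAG2",
    "form2 lemmaC TAG1",
    "form3 lemmaD TAG3",
]
print(sorted(collect_tags(sample)))  # ['TAG1', 'TAG2', 'TAG3']
```

On real data you would call something like collect_tags_from_files("data/fr/dictionary/entries/*"). In the sample, TAG2 only occurs as a second analysis, so taking just the third field would not list it.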

Yes, the demo is not updated automatically with every change to the master sources.

In the next update, the demo will be fixed too.

Meanwhile, you can get the fixed files from GitHub.