Usage

Description

Questions and answers related to FreeLing usage

How to analyze an already tokenized file

Submitted by andres on Tue, 11/07/2017 - 20:40
Forums

I have already tried:

analyze --inplv token -f ca.cfg < orig.txt > target.txt
analyze --inplv splitted -f ca.cfg < orig.txt > target.txt

I also tried changing the original "InputLevel=text" to "InputLevel=token" in the ca.cfg file.
But it always says:
Error - 'text' input format only accepts input analysis level 'text'.

Freeling tagset, some values are not documented

Submitted by David on Fri, 10/13/2017 - 13:37

Hello,
For some French documents we get the tag PPSCNN0, which is not documented:
the value 'S' does not exist for 'Person' in the category 'Pronoun' (P), according to
https://talp-upc.gitbooks.io/freeling-user-manual/content/tagsets/tagse…

The same happens for the tag DF0CN0: 'F' is not among the possible type values for a
determiner.
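
To make the positions under discussion explicit, here is a minimal shell sketch that pulls a single character out of a positional tag. The helper name `tag_char` is hypothetical, and the slot numbers used (3 for the pronoun 'Person' field, 2 for the determiner type) are assumptions inferred from the post above, not taken from the manual.

```shell
# tag_char TAG N: print the N-th character (1-based) of a positional tag.
# Hypothetical helper, for illustration only.
tag_char() {
    printf '%s' "$1" | cut -c"$2"
}

# Assumed slot 3 = 'Person' for pronouns (P), as implied by the post.
tag_char PPSCNN0 3   # prints S

# Assumed slot 2 = type for determiners (D), as implied by the post.
tag_char DF0CN0 2    # prints F
```

Comparing these characters against the value tables in the tagset documentation shows which slot holds the undocumented value.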

Language detection

Submitted by andres on Fri, 07/21/2017 - 10:17

Hi there. On the old forum you answered a question on this topic with the code below. Where can I get the analyzer.php file? Or could you share the code of your demo version?
I have already tried this code, without success.

Thanks!

<code>
include("analyzer.php");

// Adjust this path to your local FreeLing installation
$FL_DIR = "/usr/local";

train-nerc

Submitted by tmyapple on Tue, 04/25/2017 - 06:52

Hi,
I'm trying to train a NER model using the train-nerc directory and the demo data that exists in train-nerc/corpus.
I'm following the scripts and have encountered several problems:
1. corpus/bin/extract-gaz.sh (called from prepare-corpus) gets stuck inside the second loop. It looks like the achieved ratio never reaches the goal ratio; I suspect this is due to the small size of the demo set. Should it work fine, meaning something else is wrong on my side?

pt NER extraction

Submitted by break on Wed, 04/12/2017 - 16:57

Hello Freeling friends, I have a small problem running NER / NEC.
I'm trying to detect persons, organizations, places, dates, etc. in a given plain text. As far as I know, the text must be sentence-split (not sure) and tokenized (one word per line). However, I'm trying to run analyze on a plain text that contains nouns, places, organizations, etc. to see what the output is, but I get an error on the np.dat file; maybe I'm calling it incorrectly.
The config file has NER and NEC enabled:

bash vs. java socket client

Submitted by flopezbello on Sat, 04/08/2017 - 16:41

Hello Lluis. I'm working on an integration between KNIME and Freeling. I believe this can be a powerful combination, and I look forward to making them talk as smoothly as possible.

As a first stage, I've been trying a couple of approaches that I would like to comment on. I would really appreciate your feedback:

Basic NER Ignore list

Submitted by lgarcias on Thu, 02/16/2017 - 09:31

I am using the basic NER module in Spanish and I have added some words to the Ignore list, each followed by 0/1 as specified in the user manual. Still, when I run it, the list is ignored by the program and the items in it appear tagged as NEs.
I have specified in the .cfg that I am using the basic NER. I have also tried putting both a space and a tab between the words and the 0/1 in case that was the problem (it is not specified in the manual), but it is still not working.
Has anybody else had this problem? What could I be doing wrong?

Thank you in advance.