I am using the basic NER module for Spanish, and I have added some words to the Ignore list, each followed by 0/1 as specified in the user manual. Still, when I run the analyzer the list is ignored and the words in it are tagged as NEs.
I have specified in the .cfg that I am using the basic NER. I have also tried both a space and a tab between each word and the 0/1 in case that was the problem (the manual does not specify the separator), but it still does not work.
Has anybody else had this problem? What could I be doing wrong?
Thank you in advance.
Please provide details
It would be easier to determine the problem if you provided more information:
- Which FreeLing version you are using.
- An example of the words you added to the ignore list.
- A sentence that you expected to work after adding them, but did not.
Some details
Sorry for the lack of information.
I am using FreeLing 4.0.
The words I added include "usb 1, dns 1, vpn 1".
An example sentence on which I expected it to work is "Me trajo Iñaki el USB que había pedido Pedro". Instead of ignoring "USB", it identifies "Iñaki el USB" as a person-type entity.
Thank you.
It works for me
I added this to "np.dat":
<Ignore>
usb 1
dvd 1
</Ignore>
And then:
~$ analyze -f es.cfg
Me trajo Iñaki el USB que había pedido Pedro.
Me me PP1CS00 0.755196
trajo traer VMIS3S0 1
Iñaki iñaki NP00000 1
el el DA0MS0 1
USB usb NCMS000 0.869744
que que PR0CN00 0.550139
había haber VAII3S0 0.499853
pedido pedir VMP00SM 0.963576
Pedro pedro NP00000 1
. . Fp 1
Me trajo Iñaki el DVD que había pedido Pedro.
Me me PP1CS00 0.755196
trajo traer VMIS3S0 1
Iñaki iñaki NP00000 1
el el DA0MS0 1
DVD dvd NCMS000 1
que que PR0CN00 0.550139
había haber VAII3S0 0.499853
pedido pedir VMP00SM 0.963576
Pedro pedro NP00000 1
. . Fp 1
Me trajo Iñaki el DNS que había pedido Pedro.
Me me PP1CS00 0.755196
trajo traer VMIS3S0 1
Iñaki_el_DNS iñaki_el_dns NP00000 1
que que PR0CN00 0.550139
había haber VAII3S0 0.499853
pedido pedir VMP00SM 0.963576
Pedro pedro NP00000 1
. . Fp 1
So, it works with USB and DVD, which were the words added to np.dat.
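For longer outputs, a small Python sketch like the one below could list the tokens that were glued into a multiword proper noun, such as "Iñaki_el_DNS" above. It assumes the plain "form lemma tag probability" output format shown in this post; the function name `merged_entities` is just a hypothetical helper, not part of FreeLing.

```python
def merged_entities(output):
    """Return forms tagged NP* whose form joins several words with '_'.

    Assumes each token line looks like: form lemma tag probability
    (the default output shown in this thread).
    """
    found = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or something that is not a token line
        form, tag = fields[0], fields[2]
        if tag.startswith("NP") and "_" in form:
            found.append(form)
    return found


sample = """Me me PP1CS00 0.755196
trajo traer VMIS3S0 1
Iñaki_el_DNS iñaki_el_dns NP00000 1
que que PR0CN00 0.550139
"""
print(merged_entities(sample))
```

An empty list for a sentence means that no capitalized word was merged into a neighbouring entity.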
Things you may be doing wrong:
1.- The format of the "Ignore" section in the np.dat file:
- "Ignore" needs to be written exactly like this (all lowercase except the initial "I")
- There must be one word per line, with no commas
2.- You need to change the "np.dat" that is actually being used.
E.g. by default, the files in the installation directory are used (e.g. /usr/local/share/freeling/es/np.dat). If you modify the file in the source tree (e.g. mysources/freeling/data/es/np.dat), that will have no effect (unless you use the option --fnp to specify which file should be used).
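As a quick sanity check for point 1, a Python sketch along these lines could flag the usual formatting mistakes. It only encodes what is said in this thread (exact "<Ignore>" tag, one entry per line, no commas, a 0/1 after each word); accepting either a space or a tab as separator is an assumption, and `check_ignore_section` is a hypothetical helper, not a FreeLing tool.

```python
import re

# Assumed entry shape: one token, whitespace (space or tab), then 0 or 1.
ENTRY = re.compile(r"^\S+\s+[01]$")


def check_ignore_section(text):
    """Return a list of (line_number, message) problems in the <Ignore> section.

    Line number 0 means the problem is not tied to a specific line.
    """
    lines = [line.strip() for line in text.splitlines()]

    if "<Ignore>" not in lines:
        # Look for the tag written with the wrong capitalisation.
        for number, line in enumerate(lines, 1):
            if line.lower() == "<ignore>":
                return [(number, 'tag must be written exactly "<Ignore>"')]
        return [(0, "no <Ignore> section found")]

    start = lines.index("<Ignore>")
    if "</Ignore>" not in lines[start:]:
        return [(start + 1, 'missing closing "</Ignore>"')]
    end = lines.index("</Ignore>", start)

    problems = []
    for index in range(start + 1, end):
        line = lines[index]
        if not line:
            continue  # skip empty lines
        if "," in line:
            problems.append((index + 1, "no commas: put one entry per line"))
        elif not ENTRY.match(line):
            problems.append((index + 1, 'expected "word 0" or "word 1"'))
    return problems


good = "<Ignore>\nusb 1\ndvd 1\n</Ignore>\n"
bad = "<Ignore>\nusb 1, dns 1, vpn 1\n</Ignore>\n"
print(check_ignore_section(good))
print(check_ignore_section(bad))
```

For point 2, the check is only meaningful if it is run on the text of the file the analyzer actually loads, e.g. the installed /usr/local/share/freeling/es/np.dat or whatever file is passed with --fnp.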
Solved
You are right, I was not using the option --fnp. It works perfectly now, thanks a lot!