Differences in dependency parses across output levels

Submitted by maniol on Mon, 05/31/2021 - 11:46

Hello, I've been observing some inconsistencies in the dependency parser. With FreeLing 4.2 and the lstm dependency parser, I obtain different dependency parses at the "dep" output level and at the "semgraph" output level.
Sentence: "atm withdrawal five hundred dollars on june seventeenth"
--outlvl dep

"dependencies" : [
{"token" : "t1.2", "function" : "ROOT", "word" :"withdrawal", "children" : [
{"token" : "t1.1", "function" : "NMOD", "word" : "atm"},
{"token" : "t1.3", "function" : "ADV", "word" : "five_hundred_dollars"},
{"token" : "t1.4", "function" : "TMP", "word" : "on", "children" : [
{"token" : "t1.5", "function" : "PMOD", "word" : "june_seventeenth"}
]}

--outlvl semgraph

"dependencies" : [
{"token" : "t1.3", "function" : "ROOT", "word" : "five_hundred_dollars", "children" : [
{"token" : "t1.2", "function" : "NMOD", "word" : "withdrawal", "children" : [
{"token" : "t1.1", "function" : "NMOD", "word" : "atm"}
]},
{"token" : "t1.4", "function" : "LOC", "word" : "on", "children" : [
{"token" : "t1.5", "function" : "PMOD", "word" : "june_seventeenth"}
]}
]}]
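To make the difference concrete, the two trees above can be flattened into (head, function) arcs per token and compared. This is only a minimal Python sketch: the trees are transcribed by hand from the two outputs above, and the helper names `flatten` and `diff_trees` are made up for illustration, not part of FreeLing.

```python
# Trees transcribed by hand from the "dep" and "semgraph" outputs above.
dep_tree = {
    "token": "t1.2", "function": "ROOT", "word": "withdrawal", "children": [
        {"token": "t1.1", "function": "NMOD", "word": "atm"},
        {"token": "t1.3", "function": "ADV", "word": "five_hundred_dollars"},
        {"token": "t1.4", "function": "TMP", "word": "on", "children": [
            {"token": "t1.5", "function": "PMOD", "word": "june_seventeenth"}]},
    ]}
semgraph_tree = {
    "token": "t1.3", "function": "ROOT", "word": "five_hundred_dollars", "children": [
        {"token": "t1.2", "function": "NMOD", "word": "withdrawal", "children": [
            {"token": "t1.1", "function": "NMOD", "word": "atm"}]},
        {"token": "t1.4", "function": "LOC", "word": "on", "children": [
            {"token": "t1.5", "function": "PMOD", "word": "june_seventeenth"}]},
    ]}


def flatten(node, head=None):
    """Return {token: (head_token, function)} for every node in the tree."""
    arcs = {node["token"]: (head, node["function"])}
    for child in node.get("children", []):
        arcs.update(flatten(child, node["token"]))
    return arcs


def diff_trees(a, b):
    """Return the tokens whose head or function differs, with both arcs."""
    fa, fb = flatten(a), flatten(b)
    return {tok: (fa.get(tok), fb.get(tok))
            for tok in sorted(fa.keys() | fb.keys())
            if fa.get(tok) != fb.get(tok)}


differences = diff_trees(dep_tree, semgraph_tree)
for tok, (in_dep, in_semgraph) in differences.items():
    print(tok, "dep:", in_dep, "semgraph:", in_semgraph)
```

For this sentence the tokens t1.1 and t1.5 keep the same head and function in both outputs, so the disagreement is confined to t1.2, t1.3 and t1.4.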

Also, this same sentence yields yet another dependency parse in the online demo (with the semgraph output level):

dependencies:
depnode function="ROOT" token="t1.2" word="withdrawal"
depnode function="NMOD" token="t1.1" word="atm"/
depnode function="NMOD" token="t1.3" word="five_hundred_dollars"
depnode function="TMP" token="t1.4" word="on"
depnode function="PMOD" token="t1.5" word="june_seventeenth"

The demo parse is the most correct parse.
Could you please explain how I can reproduce the online demo pipeline locally? What parameters should I set to what values so that I can obtain the same parse locally?
Thank you.

The FreeLing version running in the demo is not necessarily 4.2; it may be a later development version, which would explain one of the differences.

But if you set --dep lstm at both output levels, the tree should be the same.
This may be a bug in 4.2 that has since been fixed in the development version.
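As a rough illustration of that setup, the sketch below builds the two command lines so that they differ only in the output level. The `--outlvl` and `--dep lstm` options are the ones discussed in this thread; the `analyze` launcher name, the `-f en.cfg` configuration file and `--output json` are assumptions about a typical local installation and may need adjusting. The function `analyze_command` is a made-up helper.

```python
import shlex


def analyze_command(outlvl, config="en.cfg", parser="lstm"):
    """Build an analyzer command line for one output level.

    Assumed, not taken from this thread: the "analyze" launcher,
    the "-f <config>" option and "--output json".
    """
    return ["analyze", "-f", config,
            "--outlvl", outlvl,
            "--dep", parser,
            "--output", "json"]


# Same parser selection for both levels; only --outlvl changes.
for level in ("dep", "semgraph"):
    print(shlex.join(analyze_command(level)))
```

Each printed command would then be fed the same input sentence, and the "dependencies" sections of the two outputs compared.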

I'll check if I can reproduce the problem in master branch and if so, try to find out the cause.

I found the problem. There was a bug in the "analyzer" program that affected the interaction between the selected output level and the parser to be used. For "semgraph", it ended up using the treeler parser instead of the lstm one.

I fixed the option management, so it will work now. You can replace your main program in:

src/main/sample_analyzer/main.cc

with the one in master:
https://github.com/TALP-UPC/FreeLing/blob/master/src/main/sample_analyzer/main.cc

then run "make install" again, and it should work. Let me know if it doesn't.