[solved] Can't compile on OpenBSD -CURRENT

Submitted by enricm on Tue, 03/19/2019 - 16:27
I've been trying to compile FreeLing from git master branch using OpenBSD. It uses clang by default and the make process runs fine until it chokes with part-dep1.cc. It would be nice if I could use FreeLing on my laptop so I can work on my project with the help of my tutor.

I am using boost-1.66.0p4, and clang++ --version outputs

OpenBSD clang version 7.0.1 (tags/RELEASE_701/final) (based on LLVM 7.0.1)

The first issue I hit was that constants like _L, _U and _N are defined as macros in ctype.h on BSD but not on Linux. The build would thus fail because the preprocessor would substitute them with values such as 0x000????. I am listing issues below as I find them.

I solved this by adding an #undef for the _L, _N and _U macros in every offending file. There were about a dozen of them. The build now gets a little further, but then it stops with

/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/parameters.h:426:12: error:
cannot initialize a variable of type 'gzFile *' (aka 'void **') with an rvalue of type
'gzFile' (aka 'void *')

This is the full output:

cd /home/kiike/downloads/FreeLing/FreeLing/build/src/libfreeling && /usr/bin/c++ -DBOOST_ALL_DYN_LINK -DBOOST_ALL_NO_LIB -DPACKAGE_STRING="\"FreeLing 4.1\"" -DVERSION=4.1 -Dfreeling_EXPORTS -I/home/kiike/downloads/FreeLing/FreeLing/src/include -I/home/kiike/downloads/FreeLing/FreeLing/src/libdynet -I/usr/local/include -I/home/kiike/downloads/FreeLing/FreeLing/src/libfoma/. -I/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/. -I/home/kiike/downloads/FreeLing/FreeLing/src/crfsuite/. -I/home/kiike/downloads/FreeLing/FreeLing/src/crfsuite/crfsuite -I/home/kiike/downloads/FreeLing/FreeLing/src/crfsuite/crfsuite/crf -I/home/kiike/downloads/FreeLing/FreeLing/src/crfsuite/crfsuite/cqdb -I/home/kiike/downloads/FreeLing/FreeLing/src/crfsuite/crfsuite/lbfgs -std=c++03 -I/usr/local/include -fPIC -std=gnu++11 -o CMakeFiles/freeling.dir/analyzer.cc.o -c /home/kiike/downloads/FreeLing/FreeLing/src/libfreeling/analyzer.cc
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libfreeling/analyzer.cc:32:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling/morfo/analyzer.h:35:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling.h:51:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling/morfo/dep_treeler.h:49:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/dep/dependency_parser.h:42:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/control/models.h:50:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/tag/tag.h:43:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/tag/fgen-tag.h:9:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/feature-vector.h:42:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/fidx.h:41:
/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/feature-idx-v0.h:71:7: warning:
'register' storage class specifier is deprecated and incompatible with C++17
[-Wdeprecated-register]
register uint32_t a = (uint32_t)(t & 0xffffffff);
^~~~~~~~~
/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/feature-idx-v0.h:73:7: warning:
'register' storage class specifier is deprecated and incompatible with C++17
[-Wdeprecated-register]
register uint32_t b = (uint32_t)((t >> 32) & 0xffffffff);
^~~~~~~~~
/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/feature-idx-v0.h:75:7: warning:
'register' storage class specifier is deprecated and incompatible with C++17
[-Wdeprecated-register]
register uint32_t c = 0;
^~~~~~~~~
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libfreeling/analyzer.cc:32:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling/morfo/analyzer.h:35:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling.h:51:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling/morfo/dep_treeler.h:49:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/dep/dependency_parser.h:42:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/control/models.h:50:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/tag/tag.h:44:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/scores.h:39:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/parameters.h:43:
/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/base-parameters.h:352:12: error:
cannot initialize a variable of type 'gzFile *' (aka 'void **') with an rvalue of type
'gzFile' (aka 'void *')
gzFile out = gzopen(fname, "w");
^ ~~~~~~~~~~~~~~~~~~
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libfreeling/analyzer.cc:32:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling/morfo/analyzer.h:35:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling.h:51:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/include/freeling/morfo/dep_treeler.h:49:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/dep/dependency_parser.h:42:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/control/models.h:50:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/tag/tag.h:44:
In file included from /home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/scores.h:39:
/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/parameters.h:425:12: error:
cannot initialize a variable of type 'gzFile *' (aka 'void **') with an rvalue of type
'gzFile' (aka 'void *')
gzFile out = gzopen(fname, "w");
^ ~~~~~~~~~~~~~~~~~~
/home/kiike/downloads/FreeLing/FreeLing/src/libtreeler/./treeler/base/parameters.h:480:12: error:
cannot initialize a variable of type 'gzFile *' (aka 'void **') with an rvalue of type
'gzFile' (aka 'void *')
gzFile in = gzopen(fname, "r");
^ ~~~~~~~~~~~~~~~~~~
3 warnings and 3 errors generated.
*** Error 1 in . (src/libfreeling/CMakeFiles/freeling.dir/build.make:207 'src/libfreeling/CMakeFiles/freeling.dir/analyzer.cc.o')
*** Error 1 in . (CMakeFiles/Makefile2:459 'src/libfreeling/CMakeFiles/freeling.dir/all')
*** Error 1 in . (CMakeFiles/Makefile2:612 'src/main/CMakeFiles/analyzer.dir/rule')
*** Error 1 in /home/kiike/downloads/FreeLing/FreeLing/build (Makefile:281 'analyzer')

Casting that gzopen call and similar ones to (void**) did the trick. Whether that's a good idea, I really don't know; all the C I know comes from programming my Arduino. Anyway, it compiles without errors, so it's now time to see if the tools actually work on OpenBSD (I would be surprised if they didn't, actually).

The cast may be different if you are using a different version of libz (or if the library has a different API in BSD), so that might not be a problem. In any case, those changes affect only the dep_txala parser and srl_parser. If you get crashes when instantiating those modules, that may be the reason.

Regarding the #undefs, I think that should work too...

Let us know if you have further problems.

Hi Lluis! Thanks for the reply. I haven't used dep_txala or srl_parser yet. Just for completeness I should actually try them, I guess.

No other problems so far. I have made a FreeLing context manager for Python, and as far as my usage of FreeLing goes, it just works. One issue I have found is that the output of the analyzer program has the wrong encoding. The Python interface doesn't produce the wrongly encoded output.

For instance, given the sentence "L'àbac és una màquina útil pel càlcul manual d'operacions aritmètiques.", the output I get using analyzer is
L' el DA0CS0 0.983045
� � NP00000 1
� � Fz 1
bac bac NCMS000 1
� � NP00000 1
� � Fz 1
s segon NCMN000 1
una un DI0FS0 0.931109
m� m� NCFS000 0.168974
� � Fz 1
quina quin DT0FS0 0.601796
útil útil NP00000 1
per per SP 1
el el DA0MS0 1
c� c� NCFS000 0.168974
� � Fz 1
lcul lcul NCMS000 1
manual manual AQ0CS00 0.403846
d' de SP 1
operacions operaci� NCFP000 1
aritm� aritm� AQ0CS00 0.0373171
� � Fz 1
tiques tiques AQ0FP00 0.555113
. . Fp 1

This is the escaped representation, so you can see which bytes the wrong characters are (I've also pasted the hex dump to http://sprunge.us/8Ritf4)
L' el DA0CS0 0.983045
\xc3 \xc3 NP00000 1
\xa0 \xa0 Fz 1
bac bac NCMS000 1
\xc3 \xc3 NP00000 1
\xa9 \xa9 Fz 1
s segon NCMN000 1
una un DI0FS0 0.931109
m\xc3 m\xc3 NCFS000 0.168974
\xa0 \xa0 Fz 1
quina quin DT0FS0 0.601796
\xc3\xbatil \xc3\xbatil NP00000 1
per per SP 1
el el DA0MS0 1
c\xc3 c\xc3 NCFS000 0.168974
\xa0 \xa0 Fz 1
lcul lcul NCMS000 1
manual manual AQ0CS00 0.403846
d' de SP 1
operacions operaci\xf3 NCFP000 1
aritm\xc3 aritm\xc3 AQ0CS00 0.0373171
\xa8 \xa8 Fz 1
tiques tiques AQ0FP00 0.555113
. . Fp 1

EDIT: According to chardet, that might be ISO8859-1 encoding. OpenBSD dropped support for non-UTF8 locales ages ago so I can't comprehend why analyzer gives such output.

FreeLing supports only UTF8, both in input and output.

From the visualization you show, it seems you are trying to view UTF8 as if it were iso8859-1.

For instance, the "à" character in "àbac" is encoded in UTF8 as the two bytes 0xC3 0xA0, which are the bytes you are getting; it is just that your terminal or editor interprets them as two separate characters.
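
To make the point about "à" concrete, here is a small Python 3 sketch (illustration only, it does not use FreeLing) that encodes "àbac" as UTF8 and then decodes the same bytes as ISO8859-1, which is roughly what a viewer with the wrong encoding would do. The variable names are made up for the example.

```python
# Illustration only: the same bytes seen through two different encodings.
text = "àbac"

# UTF-8 stores "à" (U+00E0) as the two bytes 0xC3 0xA0.
raw = text.encode("utf-8")
print(raw)

# A viewer that assumes ISO-8859-1 turns every byte into one character,
# so the single "à" shows up as two unrelated characters.
as_latin1 = raw.decode("iso-8859-1")
print(len(text), len(as_latin1))

# No information is lost: re-encoding and decoding correctly restores the text.
roundtrip = as_latin1.encode("iso-8859-1").decode("utf-8")
print(roundtrip == text)
```

If the bytes on disk or in the pipe are 0xC3 0xA0, the data itself is valid UTF8 and only the interpretation is off.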

If you use python2, UTF8 encoding and decoding can be tricky: you may need to decode the string after reading it and encode it before writing it.
In python3 it is much simpler.
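
As a rough sketch of the python3 side, the snippet below decodes raw bytes explicitly as UTF8 and splits each line into its fields. The helper name parse_analyzer_output is hypothetical, and the four-column layout (form, lemma, tag, probability) is only assumed from the output pasted earlier in this thread; the sample lines are taken from that output.

```python
def parse_analyzer_output(raw):
    """Decode UTF-8 bytes and split each line into (form, lemma, tag, prob).

    Hypothetical helper: assumes one token per line with four
    whitespace-separated fields; other lines are skipped.
    """
    tokens = []
    for line in raw.decode("utf-8").splitlines():
        fields = line.split()
        if len(fields) == 4:
            form, lemma, tag, prob = fields
            tokens.append((form, lemma, tag, float(prob)))
    return tokens


# Sample bytes standing in for what would be read from a pipe or a file.
sample = "una un DI0FS0 0.931109\nútil útil NP00000 1\n".encode("utf-8")

for token in parse_analyzer_output(sample):
    print(token)
```

The same idea applies when reading from a file: opening it with open(path, encoding="utf-8") avoids depending on whatever default encoding the environment reports.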