Segfault when parsing sentence with a quantity in the form of <number>M from Wikipedia text.

Submitted by enricm on Wed, 03/13/2019 - 16:38

This segfault happens with FreeLing v4.1 compiled with gcc (GCC) 8.2.1 20181127.

Steps to reproduce:

1. Create a file named test-1 containing this sentence:

Si la massa de l'estrella se situa entre 0,5 i 8M, en esgotar tot l'hidrogen, el seu nucli posseïx una temperatura alta.

2. Run analyzer in gdb with the arguments: run -f ~/.../ca-vanilla.cfg < ~/.../test-1
3. The program crashes with:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff798ae75 in __memmove_avx_unaligned_erms () from /usr/lib/libc.so.6

gdb backtrace:

#0 0x00007ffff798ae75 in __memmove_avx_unaligned_erms () from /usr/lib/libc.so.6
#1 0x00007ffff7b46b91 in std::char_traits::copy (__n=81, __s2=, __s1=0x55555dbe91f0 L"")
at /build/gcc/src/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:462
#2 std::__cxx11::basic_string, std::allocator >::_S_copy (__n=81, __s=, __d=0x55555dbe91f0 L"")
at /build/gcc/src/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:340
#3 std::__cxx11::basic_string, std::allocator >::_M_mutate (this=this@entry=0x7fffffffd588, __pos=0,
__len1=__len1@entry=0, __s=, __len2=81) at /build/gcc/src/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:322
#4 0x00007ffff7b47a9c in std::__cxx11::basic_string, std::allocator >::_M_replace (this=0x7fffffffd588,
__pos=, __len1=0, __s=, __len2=)
at /build/gcc/src/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:462
#5 0x00007ffff7da7d66 in int freeling::util::wstring_to(std::__cxx11::basic_string, std::allocator > const&) ()
from /home/kiike/downloads/freeling/src/FreeLing-4.1/build/src/libfreeling/libfreeling.so
#6 0x00007ffff7da1873 in freeling::dates_ca::StateActions(int, int, int, std::_List_const_iterator, freeling::dates_status*) const ()
from /home/kiike/downloads/freeling/src/FreeLing-4.1/build/src/libfreeling/libfreeling.so
#7 0x00007ffff7e4b6b6 in freeling::maco::analyze(freeling::sentence&) const ()
from /home/kiike/downloads/freeling/src/FreeLing-4.1/build/src/libfreeling/libfreeling.so
#8 0x00007ffff7d43afe in void freeling::analyzer::do_analysis > >(std::__cxx11::list >&) const () from /home/kiike/downloads/freeling/src/FreeLing-4.1/build/src/libfreeling/libfreeling.so
#9 0x00005555555663ee in process_text_incremental(freeling::analyzer&, ServerStats&, freeling::io::output_handler const&, bool) ()
#10 0x0000555555563833 in main ()

Adding a space between the 8 and the M gets rid of the error. This bug doesn't occur in the master branch.
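
For anyone who has to stay on v4.1, the space workaround can be applied to the input text before it reaches the analyzer. The following is only a minimal sketch of that idea, not the fix that went into master: the function name separate_m_suffix is made up for illustration, and it assumes that only an 'M' directly after a digit and at the end of a token (as in "8M,") triggers the crash, which is all that was tested here. Bytes outside ASCII are treated as token boundaries, so UTF-8 input passes through unchanged apart from the inserted space.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of FreeLing): insert a space between a digit
// and an immediately following 'M' when that 'M' is not followed by another
// ASCII letter or digit, so "8M," becomes "8 M," while "8Mb" is left alone.
std::string separate_m_suffix(const std::string& text) {
    std::string out;
    out.reserve(text.size() + 8);
    for (std::size_t i = 0; i < text.size(); ++i) {
        out.push_back(text[i]);
        const bool digit_here =
            std::isdigit(static_cast<unsigned char>(text[i])) != 0;
        const bool m_next = i + 1 < text.size() && text[i + 1] == 'M';
        // The 'M' ends the token if it is the last character or if the
        // character after it is not an ASCII letter or digit.
        const bool m_ends_token =
            i + 2 >= text.size() ||
            std::isalnum(static_cast<unsigned char>(text[i + 2])) == 0;
        if (digit_here && m_next && m_ends_token) {
            out.push_back(' ');
        }
    }
    return out;
}
```

Calling separate_m_suffix on each input line and feeding the result to the analyzer would turn "entre 0,5 i 8M, en esgotar" into "entre 0,5 i 8 M, en esgotar". Upgrading to master is still the cleaner option, since this changes the text that gets analyzed.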

Yes, that bug was fixed in master some months ago (I also detected it when processing Wikipedia ;)

https://github.com/TALP-UPC/FreeLing/commit/557b39d34b200f1f0a162f722c22363105bc888e

thank you!