Dear all,
Do you know how to get, from a dicc.dat/dicc.txt file, a file with all the lemmas and all their token forms, in order to load it into a concordance program such as AntConc? For example:
casa -> casa, cases
menjar -> menjo, menjava, menjaré, mengin...
Could you recommend some kind of command for this, such as awk?
Actually, the proper format should be:
casa TAB -> TAB casa TAB cases
menjar TAB -> TAB menjo TAB menjava TAB menjaré TAB mengin
(etcetera...)
Once you install FreeLing, a file is created in /usr/local/share/freeling/ca/dicc.src, which contains something very close to what you ask for. It should be easy to adapt with a simple awk command, a small Python program, or even by loading the file into a spreadsheet.
Yes, an AWK command would be great. But I have no idea which one it would be.
AWK is an option for what you want to do, not a requirement.
If you don't know AWK, you can use a small Python script. Or Perl. Or any other programming language.
If you are not a programmer, you can probably achieve the same by loading the data into a spreadsheet such as Excel or OpenOffice.
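For the Python route, a minimal sketch could look like the one below. It assumes that each entry line in dicc.src is whitespace-separated and laid out as "form lemma tag [lemma tag ...]", and that lines starting with "<" are section markers; please check this against your own copy of the file before relying on it. The function names, the output file name, and the UTF-8 encoding are my own choices for illustration, not anything defined by FreeLing.

```python
from collections import defaultdict


def group_forms_by_lemma(lines):
    """Collect the word forms listed for each lemma.

    ASSUMPTION: every entry line looks like
        form lemma1 tag1 [lemma2 tag2 ...]
    Adjust the parsing if your dicc.src is laid out differently.
    """
    forms_by_lemma = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines, section markers such as <Entries>, and short lines.
        if len(fields) < 3 or fields[0].startswith("<"):
            continue
        form = fields[0]
        # Fields 1, 3, 5, ... are assumed to be lemmas; the tags in between are ignored.
        for lemma in fields[1::2]:
            if form not in forms_by_lemma[lemma]:
                forms_by_lemma[lemma].append(form)
    return forms_by_lemma


def format_for_antconc(forms_by_lemma):
    """Return lines of the form: lemma TAB -> TAB form1 TAB form2 ..."""
    return [
        "\t".join([lemma, "->", *forms])
        for lemma, forms in sorted(forms_by_lemma.items())
    ]


def convert(src_path, out_path, encoding="utf-8"):
    """Read a dictionary file and write the lemma list (encoding is an assumption)."""
    with open(src_path, encoding=encoding) as src:
        grouped = group_forms_by_lemma(src)
    with open(out_path, "w", encoding=encoding) as out:
        out.write("\n".join(format_for_antconc(grouped)) + "\n")
```

You would then call something like convert("/usr/local/share/freeling/ca/dicc.src", "lemma_list.txt") and load the resulting file in AntConc. If AntConc expects a different separator, only the join in format_for_antconc needs to change.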