Aligning MWTs and named entities with simply tokenized text
I'm working on an application (for Spanish, though I imagine this applies in general) in which I need to associate a lemma with each token of a text that has been tokenized "naturalistically", let's say, i.e. keeping the surface tokens as they appear in the original. So far I've worked out how to get FreeLing to produce "naturalistic" tokenization, but only at the cost of the rest of the analysis, at least for the affected token. The CoNLL-U spec explicitly allows multiword tokens (MWTs) to be represented in a way that preserves the original tokenization, but I haven't been able to find a way to get FreeLing to output something like this.
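To make the target format concrete, here is a minimal Python sketch of the CoNLL-U convention for MWTs: the surface token gets a range line (e.g. `2-3 al`) with the remaining fields left as `_`, followed by one line per syntactic word carrying the lemma. The input structure (a list of surface tokens, each paired with its analyzed words) and the function name `to_conllu` are assumptions made up for illustration; this is not FreeLing's actual output format or API, and named entities are not handled.

```python
def to_conllu(tokens):
    """Render tokens as CoNLL-U lines, keeping surface tokens via range lines.

    tokens: list of (surface_form, [(word_form, lemma), ...]) pairs.
    This input shape is hypothetical, chosen only for the illustration.
    """
    lines = []
    next_id = 1
    for surface, words in tokens:
        if len(words) > 1:
            # Multiword token: range ID plus surface form, other 8 columns empty.
            first, last = next_id, next_id + len(words) - 1
            lines.append("\t".join([f"{first}-{last}", surface] + ["_"] * 8))
        for form, lemma in words:
            # Syntactic word: ID, FORM, LEMMA filled; the other 7 columns empty.
            lines.append("\t".join([str(next_id), form, lemma] + ["_"] * 7))
            next_id += 1
    return "\n".join(lines)


# "Vamos al mar": the contraction "al" splits into "a" + "el".
sentence = [
    ("Vamos", [("Vamos", "ir")]),
    ("al", [("a", "a"), ("el", "el")]),
    ("mar", [("mar", "mar")]),
]
print(to_conllu(sentence))
```

With output in this shape, the lemmas for a surface token can be recovered by reading the word lines whose IDs fall inside its range, so the original tokenization and the full analysis are both preserved.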