This paper presents the semantic-level annotation of a Slovene corpus. Manual annotation was performed in two cycles using a semantic lexicon that was generated automatically following the wordnet model. The analysis of the results shows that nearly all polysemous words in the corpus can be assigned a sense from our wordnet, but also that the task was quite challenging: in many cases, wordnet sense distinctions are too fine-grained even for human annotators to tell apart. Annotation with more coarse-grained senses could therefore prove more successful.
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10 within the strategic scientific research and experimental development program SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.