The paper presents a system for automatic content extraction from mammogram reports written in Polish. The system combines general information extraction (IE) techniques with external post-processing that structures the results. The paper characterizes this specific type of text, describes the results obtained, and briefly analyzes the advantages and disadvantages of shallow text processing.
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10, within the strategic scientific research and experimental development program:
SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.