An approach for detecting decorative elements, such as initials and headlines, and text regions in ancient manuscripts is presented. Due to their age, ancient manuscripts suffer from degradation and staining, as well as ink that has faded over time. Identifying decorative elements and text regions allows indexing a manuscript and serves as input for Optical Character Recognition...
This paper proposes a new technique for binarizing multicolored characters subject to heavy degradations. The key ideas are threefold. The first is the generation of tentatively binarized images via every dichotomization of k clusters obtained by k-means clustering in the HSI color space. The total number of tentatively binarized images equals 2^k - 2. The second is the use of support vector machines (SVM) to...
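The count of 2^k - 2 follows from assigning each of the k clusters to either foreground or background and discarding the two trivial assignments (all foreground, all background). The following is a minimal sketch of that enumeration only, not the paper's implementation: it assumes the k-means step has already produced a per-pixel cluster label map, and the function name `tentative_binarizations` and the tiny `labels` array are hypothetical.

```python
from itertools import product


def tentative_binarizations(labels, k):
    """Yield (foreground_clusters, binary_image) for every non-trivial
    split of the k cluster indices into foreground and background.

    `labels` is a list of rows, each pixel holding a cluster index in
    range(k). Assumption: clustering was done beforehand.
    """
    for mask in product((0, 1), repeat=k):
        # Skip the two trivial splits: all background or all foreground.
        if sum(mask) in (0, k):
            continue
        foreground = {c for c in range(k) if mask[c]}
        image = [[1 if c in foreground else 0 for c in row] for row in labels]
        yield foreground, image


# Hypothetical 2x3 label map with k = 3 clusters.
labels = [[0, 1, 2],
          [2, 1, 0]]
candidates = list(tentative_binarizations(labels, 3))
print(len(candidates))  # 2**3 - 2 = 6
```

Each candidate and its complement both appear in the enumeration, which is consistent with the 2^k - 2 count given in the abstract.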
This paper presents a text localization approach for binarized printed document images. Emphasis is given to the feature extraction and feature selection stages. In the former, several document structure elements and spatial features, likely to convey useful information, are extracted. In the latter, evolutionary multi-objective feature selection is employed to identify combinations of features with...
The paper describes a new approach that uses conditional random fields (CRFs) to extract physical and logical layouts in unconstrained handwritten letters, such as those sent by individuals to companies. In this approach, layout extraction is treated as a labeling task that assigns a label to each pixel of the document image. This label is chosen among a set of labels depicting...
The paper presents a clutter detection and removal algorithm for complex document images. The distance-transform-based approach is independent of the clutter's position, size, shape and connectivity with text. Features are based on a residual image obtained by analysis of the distance transform, and clutter elements, if present, are identified with an SVM classifier. Removal is restrictive, so text attached...
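For readers unfamiliar with the distance transform mentioned above, here is a minimal sketch of a two-pass city-block distance transform on a binary image. It illustrates only the transform itself; the residual image, the features, and the SVM classifier are not specified in the abstract and are not reproduced here. The function name `distance_transform`, the choice of the city-block metric, and the toy image are assumptions made for illustration.

```python
def distance_transform(binary):
    """City-block distance from each foreground pixel (1) to the nearest
    background pixel (0), computed with a forward and a backward scan."""
    h, w = len(binary), len(binary[0])
    inf = h + w  # exceeds any city-block distance possible in the image
    dist = [[0 if binary[y][x] == 0 else inf for x in range(w)]
            for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                if y > 0:
                    dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
                if x > 0:
                    dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate from the bottom and right neighbours.
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            if dist[y][x]:
                if y < h - 1:
                    dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
                if x < w - 1:
                    dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


# Toy image: a thick 3x3 blob (left) and a one-pixel-wide stroke (right).
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
dist = distance_transform(image)
print(dist[2][2], dist[2][5])  # blob centre vs. stroke pixel
```

In this toy case the blob centre reaches a larger distance value than any pixel of the thin stroke. That difference in peak values is one plausible reason a distance transform can help separate thick clutter from text strokes, though how the paper actually exploits it is not stated in the abstract.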