This article presents our recent study on multi-colored text binarization. In the output image, we represent foreground content as black and background as white, regardless of the polarity of foreground and background in the original image. We apply a connected-component-analysis-based approach to group the words or characters within a bounding or edge box. The main novelty of this reported work includes...
Text line detection and localisation is a crucial step for full page document analysis, but still suffers from heterogeneity of real life documents. In this paper, we present a novel approach for text line localisation based on Convolutional Neural Networks and Multidimensional Long Short-Term Memory cells as a regressor in order to predict the coordinates of the text line bounding boxes directly...
Semi-automatic text analysis involves manual inspection of text. Often, different text annotations (like part-of-speech or named entities) are indicated by using distinctive text highlighting techniques. In typesetting there exist well-known formatting conventions, such as bold typeface, italics, or background coloring, that are useful for highlighting certain parts of a given text. Also, many advanced...
In this paper, we present a hybrid method consisting of three main stages for detecting tables in document images. Based on table structure, our system separates tables into two main categories: ruling-line tables and non-ruling-line tables. In the first stage, the text and non-text elements in the document are classified by a heuristic filter. Then, white space analysis is used to group the text elements...
Text plays a vital role in visual content analysis and understanding. Videos contain text with diverse patterns and complex backgrounds. In this paper, we propose an approach based on a compass operator for detecting edges. We obtain the edge maps by convolving the Kirsch Directional Masks along eight different directions with the preprocessed video frame. The resultant images are...
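The compass-operator step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard 3x3 Kirsch masks (a border of three 5s and five -3s rotated in 45-degree steps) and a grayscale frame given as a list of lists; the function names `kirsch_masks`, `apply_mask`, and `kirsch_edge_maps` are hypothetical and are not taken from the paper, whose preprocessing and later stages are not shown here.

```python
# Clockwise order of the eight border cells of a 3x3 mask, starting top-left.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def kirsch_masks():
    """Build the eight Kirsch masks by rotating the border values in 45-degree steps."""
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    masks = []
    for shift in range(8):
        mask = [[0] * 3 for _ in range(3)]  # centre stays 0
        for i, (r, c) in enumerate(RING):
            mask[r][c] = base[(i - shift) % 8]
        masks.append(mask)
    return masks


def apply_mask(image, mask):
    """Slide one 3x3 mask over the image (valid region only, no padding).

    This is a correlation; because the Kirsch set is closed under 180-degree
    rotation, a true convolution yields the same eight maps in another order.
    """
    h, w = len(image), len(image[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(h - 2):
        for x in range(w - 2):
            out[y][x] = sum(
                mask[dy][dx] * image[y + dy][x + dx]
                for dy in range(3)
                for dx in range(3)
            )
    return out


def kirsch_edge_maps(image):
    """Return one edge map per compass direction (eight in total)."""
    return [apply_mask(image, mask) for mask in kirsch_masks()]
```

Each mask sums to zero, so a flat region produces no response in any direction, while an intensity step responds most strongly in the map whose mask is aligned with it.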
Various fashion theories have been proposed to explain how fashion works and why it works that way. However, there is little research empirically examining fashion designers' influences even though the benefit of understanding this field is significant. Unlike many other innovation domains such as patents where citations are explicit, a fashion designer hardly claims that s/he is influenced by others...
In this paper we investigate the importance of individual features for the task of document layout analysis, in particular for the classification of the document pixels. The feature set consists of numerous state-of-the-art features, including color, gradient, and local binary patterns (LBP). To deal with the high dimensionality of the feature set, we propose a cascade of an adapted forward selection...
In this paper we present a physical structure detection method for historical handwritten document images. We considered layout analysis as a pixel labeling problem. By classifying each pixel as either periphery, background, text block, or decoration, we achieve high quality segmentation without any assumption of specific topologies and shapes. Various color and texture features such as color variance,...
From a single low-resolution image, a real-time document image super-resolution algorithm is proposed to obtain a high-resolution document image with sharp text boundaries. First, a highly efficient document image matting algorithm based on local linear modeling is designed to decompose the input image into text, foreground and background layers, which contain the text edge information, the color information...
In this paper we present a two-level method to detect text in natural scene images. In the first level, connected components (referred to as CCs) are extracted from the images. Then candidate text lines are extracted, and groups of connected components that align in the horizontal or vertical direction are obtained. We assume that CCs in these groups have a high probability of being text. To validate which CC is text, an SVM is...
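The first level described in this abstract (extracting connected components and grouping those that align horizontally) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the input is assumed to be an already binarized image given as a list of lists, 8-connectivity is assumed, and the grouping rule (vertical overlap of bounding boxes) is a simple stand-in for whatever alignment criterion the paper uses. The names `connected_components` and `group_horizontal` are hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected foreground regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_horizontal(boxes):
    """Greedily group boxes whose vertical extents overlap (assumed alignment rule)."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):  # left to right
        for line in lines:
            last = line[-1]
            if box[0] <= last[2] and last[0] <= box[2]:
                line.append(box)
                break
        else:
            lines.append([box])
    return lines
```

Each resulting group is a candidate text line; a classifier such as the SVM mentioned in the abstract would then decide which components are actually text.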
Human detection is an active field of research in computer vision. Extending it to human-like drawings such as the main characters in comic book stories is not trivial. Comics analysis is a very recent field of research at the intersection of graphics, text, object and people recognition. The detection of the main comic characters is an essential step towards a fully automatic comic book...
Form classification has received little attention over the last decade. Unfortunately, the algorithms published mainly in the 80s and 90s do not meet the requirements of our present commercial document analysis projects. There we are confronted with conditions and requirements unanticipated by that research, such as fax distortions and, even worse, form variations. In this work we introduce a new color-coded...
The number of images on the Internet increases day by day. Retrieving relevant images from a large database has become an important research topic. This paper focuses on the reranking of images by utilizing both visual and textual features. Given a textual query in traditional image retrieval, relevant images are to be re-ranked using visual features after the initial text-based search...
We present a novel method for reducing the effects of ink-bleed in handwritten documents. We go beyond the existing works on ink-bleed detection and removal. We consider each pixel in a document as the result of a combination of foreground, ink-bleed and background. We carry out a decomposition of the document image into separate foreground ink, ink-bleed, and background layers. We propose an efficient...
Human ground-truthing is the manual labelling of samples (pixels for example) to generate reference data without any automatic algorithm help. Although a manual ground-truth is more accurate than a machine ground-truth, it still suffers from mislabeling and/or judgement errors. In this paper we propose a new method of ground-truth estimation using multispectral (MS) imaging representation space for...
Correct segmentation of a web table into its component regions is the essential first step to understanding tabular data. Our algorithmic solution to the segmentation problem relies on the property that strings defining row and column header paths uniquely index each data cell in the table. We segment the table using only "logical layout analysis" without resorting to any appearance features...
Urdu script uses a superset of the Arabic alphabet, but is written in the Nastaliq style. Nastaliq script is highly cursive and context sensitive, and is written diagonally from top right to bottom left with stacking of characters, which makes it very hard to process for OCR. In addition, line and word segmentation are non-trivial tasks because lines frequently merge and words and ligatures overlap vertically...
In this paper, we propose an efficient scene text localization method using gradient local correlation, which can characterize the density of pairwise edges and stroke width consistency to get a text confidence map. Gradient local correlation is insensitive to the gradient direction and robust to noise, small character size and shadow. Based on the text confidence map, the regions with high confidence...
Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In the case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show-through phenomenon. Many methods are...
We employ Eigenfaces to discriminate between handwritten and machine-printed text at the connected component (CC) level. Normalized images of machine print CCs are treated as points in a high-dimensional space. PCA yields a reduced-dimensional character space. Representative machine print CCs are projected into character space and a local distance threshold for each representative is automatically...