Approximate nearest neighbor (ANN) search provides a computationally viable option for retrieval from large document collections. Hashing-based techniques are widely regarded as the most efficient methods for ANN-based retrieval. It has been established that combining multiple features in a multiple kernel learning setup can significantly improve the effectiveness of hash codes. The paper presents...
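To make the idea of hashing-based ANN retrieval concrete, the following is a minimal sketch of a generic random-hyperplane hash, in which vectors with similar directions tend to receive binary codes that are close in Hamming distance. It is not the multiple kernel learning method the abstract refers to; the function names `make_hyperplanes`, `hash_code`, and `hamming` are hypothetical and chosen only for illustration.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions (illustrative helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))
```

Retrieval then compares short binary codes rather than full feature vectors, which is where the computational saving comes from; learned hash functions replace the random hyperplanes used in this sketch.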
This paper presents a novel learning-based framework to extract articles from newspaper images using a Fixed-Point Model. The input to the system comprises blocks of text and graphics, obtained using standard image processing techniques. The Fixed-Point Model uses contextual information and the features of each block to learn the layout of newspaper images and attains a contraction mapping to assign a...
Writer recognition based on the peculiarities of handwriting is an important aspect of forensic analysis. We present an approach for selecting the most discriminative primitives for writer recognition. After selecting the primitives, we also propose a hybrid system that combines writer recognition and handwriting recognition for improved accuracy. We have also validated the performance of the selected primitives...
Active learning and crowdsourcing are becoming increasingly popular in the machine learning community for fast and cost-effective generation of labels for large volumes of data. However, such labels may be noisy, so it becomes important to ignore the noisy labels when building a good classifier. We propose a framework for finding the best possible augmentation of a classifier for the character...
The paper presents a novel script-independent, CRF-based inference framework for character recognition. In this framework, we consider a word as a sequence of connected components. The connected components are obtained using different binarization schemes, and different possible sequences are considered using a tree structure. The CRF uses contextual information to learn perfect primitive sequences and...
In this paper we present an approach for correcting the character recognition errors of an OCR system that recognises Indic scripts. A suffix tree is used to index the lexicon in lexicographical order to facilitate the probabilistic search. To obtain the most probable match for a mis-recognised string, it is compared with the sub-strings (edges of the suffix tree) using as the similarity measure a weighted Levenshtein...
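As an illustration of how a weighted Levenshtein measure can rank lexicon candidates for a mis-recognised string, here is a minimal sketch. It scans a small word list by brute force rather than searching a suffix tree, and the confusion pairs and their 0.5 cost are assumptions made up for the example, not values from the paper. The names `weighted_levenshtein`, `ocr_sub_cost`, `best_match`, and `CONFUSABLE` are hypothetical.

```python
def weighted_levenshtein(source, target, sub_cost=None, ins_cost=1.0, del_cost=1.0):
    """Edit distance where substitution cost depends on the character pair."""
    if sub_cost is None:
        # Default: classic unit-cost Levenshtein distance.
        sub_cost = lambda a, b: 0.0 if a == b else 1.0
    # prev[j] holds the cost of turning the processed prefix of source into target[:j].
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        curr = [i * del_cost]
        for j, t in enumerate(target, 1):
            curr.append(min(
                prev[j] + del_cost,            # delete s
                curr[j - 1] + ins_cost,        # insert t
                prev[j - 1] + sub_cost(s, t),  # substitute s with t (or match)
            ))
        prev = curr
    return prev[-1]


# Assumed, illustrative confusion pairs: visually similar characters cost less to swap.
CONFUSABLE = {("l", "1"), ("O", "0")}


def ocr_sub_cost(a, b):
    """Cheaper substitution for character pairs an OCR engine plausibly confuses."""
    if a == b:
        return 0.0
    if (a, b) in CONFUSABLE or (b, a) in CONFUSABLE:
        return 0.5
    return 1.0


def best_match(word, lexicon, sub_cost=None):
    """Return (candidate, distance) for the lexicon entry closest to word."""
    return min(
        ((entry, weighted_levenshtein(word, entry, sub_cost)) for entry in lexicon),
        key=lambda pair: pair[1],
    )
```

For example, `best_match("he1lo", ["help", "hello", "hollow"], ocr_sub_cost)` selects `"hello"`, because swapping `1` for `l` is charged only the reduced confusion cost. In practice the weights would be estimated from the OCR engine's observed confusions rather than set by hand.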