facility. Automating the transcription of these documents using Optical Character Recognition (OCR) systems is also challenging due to the very complex cursive nature of Urdu text. To overcome these limitations, a keyword spotting based information retrieval system for document images is introduced in this study. The proposed
In this work, we propose a new descriptor that is called Gradient Local Binary Patterns (GLBP) for automatic keyword spotting in handwritten documents. GLBP is a gradient feature that improves the Histogram of Oriented Gradients (HOG) by calculating the gradient information at transitions of the Local Binary Pattern
The task of zero resource query-by-example keyword search has received much attention in recent years as the speech technology needs of the developing world grow. These systems traditionally rely upon dynamic time warping (DTW) based retrieval algorithms with runtimes that are linear in the size of the search
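For readers unfamiliar with the DTW retrieval step mentioned in this abstract, here is a minimal sketch of the classic dynamic-programming DTW distance between two 1-D sequences. It is an illustration only, not the method of the paper: the function name `dtw_distance` and the absolute-difference local cost are assumptions, and real query-by-example systems compare frame-level feature vectors rather than scalars.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Illustrative sketch: uses |x - y| as the local cost (an assumption)
    and runs in O(len(a) * len(b)) time and memory.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the best of: insertion, deletion, or match
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # A sequence and a time-stretched copy of it align with zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because one such table is filled for every candidate segment, a search built directly on this routine grows with the size of the collection being searched, which is the scaling concern the abstract raises.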
Feature weighting is a technique used to approximate the optimal degree of influence of individual features. This paper presents a feature weighting method for Document Image Retrieval System (DIRS) based on keyword spotting. In this method, we weight the features using Weighted Principal Component Analysis (PCA). The
We propose a fully automatic method for summarizing and indexing unstructured presentation videos based on text extracted from the projected slides. We use changes of text in the slides as a means to segment the video into semantic shots. Unlike previous approaches, our method does not depend on the availability of the electronic source of the slides, but rather extracts and recognizes the text directly...
In this paper we propose a novel and efficient technique for finding keywords typed by the user in digitised machine-printed historical documents using the dynamic time warping (DTW) algorithm. The method uses word portions located at the beginning and end of each segmented word of the processed documents and tries to
This paper presents a code-search method, which includes an algorithm of keyword code-search and a prototype implementation. In this paper, a query is a set of keywords and a search result is a set of execution paths fulfilling the query, that is, each of the execution paths includes all of the keywords. Here, an
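The query semantics stated in this abstract (a result is every execution path that includes all of the query keywords) can be sketched as a simple filter. This is a hedged illustration, not the paper's algorithm: the name `paths_matching` and the representation of an execution path as a list of identifier strings are assumptions.

```python
def paths_matching(paths, keywords):
    """Return the execution paths that contain every query keyword.

    Hypothetical representation: each path is a list of identifier
    strings; the query is any iterable of keyword strings.
    """
    wanted = set(keywords)
    # A path fulfils the query when all keywords occur somewhere in it.
    return [p for p in paths if wanted.issubset(p)]


if __name__ == "__main__":
    paths = [
        ["open", "read", "close"],
        ["open", "write"],
    ]
    print(paths_matching(paths, {"open", "read"}))
```

A brute-force filter like this requires the execution paths to be enumerated first; it only makes the acceptance condition concrete.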
direction-aware spatial keyword search method which inherently supports direction-aware search. We devise novel direction-aware indexing structures to prune unnecessary directions. We develop effective pruning techniques and search algorithms to efficiently answer a direction-aware query. As users may dynamically change their
This paper presents a new approach to keyword spotting in degraded document images. Two prevalent word indexing methods, OCR and word shape coding, are combined compactly based on recognition confidence evaluation. The basic procedure is as follows. First, OCR candidates are used for OCR indexing. Second, a new stroke
issued to the databases also contain spatial and textual components, for example, "Find shelters with emergency medical facilities in Orange County," or "Find earthquake-prone zones in Southern California." We refer to such queries as spatial-keyword queries or SK queries for short. In recent times, a lot of interest has
Document indexing is an essential task performed by archivists or automatic indexing tools. To retrieve documents relevant to a query, the keywords describing each document have to be carefully chosen. Archivists have to find out the right topic of a document before starting to extract the keywords. For an archivist
Keyword spotting in video document images is challenging due to low resolution and complex background of video images. We propose the combination of Texture-Spatial-Features (TSF) for keyword spotting in video images without recognizing them. First, a segmentation method extracts words from text lines in each video
The total information available on the WWW (World Wide Web) is huge and is increasing at lightning speed. The existing web is dominated by search engines that run on keyword-based search, which in turn wastes the end user's precious time if they do not know the key terms that are used to index
A document surrogate is usually represented as a list of words. Because not all words in a document reflect its content, it is necessary to select important words from the document that relate to its content. Such important words are called keywords and are selected with a particular equation based on Term Frequency
With large databases of document images available, a method for users to find keywords in documents will be useful. One approach is to perform Optical Character Recognition (OCR) on each document followed by indexing of the resulting text. However, if the quality of the document is poor or time is critical, complete OCR
to the difficulty of quickly accessing the content of interest in a long video lecture. In this work, we present “video indexing” and “keyword search” that facilitate access to video content and enhance user experience. Video indexing divides a video lecture into segments indicating
platform, N-gram and word co-occurrence statistical analysis are combined to carry out a Chinese keyword extraction experiment. Firstly, candidate keywords are extracted with a bi-gram model. Then, a set of co-occurrences between every word in the bi-grams and frequent words is generated. Co-occurrence distribution shows importance
In pursuing the development of Yanii, a novel keyword-based search system on graph structures, we present in this paper a computational complexity study of the approach, highlighting a comparative study with current PTIME state-of-the-art solutions. The comparative study focuses on a theoretical analysis of different
In this study, we present a pre-filtering method for dynamic time warping (DTW) to improve the efficiency of a posteriorgram based keyword search (KWS) system. The ultimate aim is to improve the performance of a large vocabulary continuous speech recognition (LVCSR) based KWS system using the posteriorgram based KWS
images automatically. Cluster IDs are adopted to index the characters. A Dream of Red Mansions, a famous work of classical Chinese literature containing nearly one million characters, is used to evaluate the performance of Chinese keyword spotting. Experimental results confirm the effectiveness of knowledge-based clustering and