This paper presents a new method for keyword spotting in degraded document images. Two prevalent word indexing methods, OCR and word shape coding, are combined compactly based on recognition confidence evaluation. The basic procedure is as follows. First, OCR candidates are used for OCR indexing. Second, a new stroke
Two approaches to keyword extraction are commonly used: one relies only on information about individual words, such as word frequency and TF.IDF; the other is based on the relationships between words. The relationship is usually described as word similarity, which is derived from a corpus (WordNet, HowNet) or man-made thesaurus
A common strategy to assign keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for a word to be selected as a keyword is its relevance to the text. The tf.idf score of a term is a widely used relevance measure. While easy to compute and giving quite
The Internet is becoming an increasingly important platform for everyday life and work. It is expected that keyword extraction can help people quickly find hot spots on the web, since keywords in a document provide important information about the content of the document. In this paper, we propose to use text clustering
Keyword spotting is the task of identifying the occurrences of certain desired keywords in an arbitrary speech signal. Keyword spotting has many applications, one of which is telephone routing. In particular, we consider a big company which receives thousands of telephone calls daily. We are interested in the
In this paper, a method of automatic Chinese keyword extraction based on KNN is proposed. First, the document is preprocessed using the vector space model. Second, a set of candidate keywords is constructed based on the KNN method and the labeled dataset. Finally, the candidate keywords are post-processed by the character of
This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give a high-level description of its contents to readers. Identifying keywords from a large amount of on-line news data is very useful in that it can
The problem of automatically extracting the most interesting and relevant keyword phrases in a document has been studied extensively as it is crucial for a number of applications. These applications include contextual advertising, automatic text summarization, and user-centric entity detection systems. All these
In classical image classification approaches, low-level features have been used. But the high dimensionality of feature spaces poses a challenge in terms of feature selection and distance measurement during the clustering process. In this paper, we propose an approach to generate visual keywords and combine both visual
In this paper, an algorithm for extracting keywords without a corpus is described. We use word co-occurrence information and distribution biases to extract the more important words based on the most frequently appearing words, the so-called reference words. First, the most frequent terms are chosen
This paper focuses on setting up a question-answering system oriented to the biomedical domain, and it applies several different approaches to the different processing phases. First, it uses a shallow parser to identify the types of questions and extract the keywords, and the keywords are expanded with UMLS for the purpose of
the user's acoustic signal from a singing voice and retrieves the music information using both lyrics and melody information. The lyrics recognition module uses a keyword spotting system based on text-content of the lyrics by an HMM comparison engine. The melody recognition module extracts pitch and MFCC features
Handwritten word spotting aims at making document images amenable to browsing and searching by keyword retrieval. In this paper, we present a word spotting system based on Hidden Markov Models (HMM) that uses trained subword models to spot keywords. With the proposed method, arbitrary keywords can be spotted that do
With the advent of Web 2.0, RESTful web services are becoming increasingly popular to emphasize the web as a platform. There are already many RESTful web services and the number of services is increasing rapidly. Thus, it can be difficult to find specific services using keyword-based retrieval. To solve this problem, a
FCA, a session interest concept is defined as a pair of extent and intent, where the extent covers a set of documents selected by the user among the search results and the intent covers a set of keyword features extracted from the selected documents. In order to make a concept network grow, we need to calculate the
In this paper, a new information extraction system by statistical shallow parsing in unconstrained handwritten documents is introduced. Unlike classical approaches found in the literature, such as keyword spotting or full document recognition, our approach relies on a strong and powerful global handwriting model. An entire
where our approach is tested on images retrieved from Google's keyword-based image search engine. The results show that a combination of our approach as a local image descriptor with another global descriptor outperforms other approaches.
Current search engines have two problems: losing useful information and including useless information. These two problems are caused by the keyword matching retrieval model, which is adopted by almost all search engines. We introduce the concept of the category attribute of a word. According to the category attribute
provide simple message analysis features such as browsing and simple keyword-based searching of the recorded messages. In this paper, we propose a system, called IMAnalysis, that supports intelligent chat message analysis using text mining techniques. The IMAnalysis system provides functions on chat message retrieval, social
classification/clustering as features. This approach can also be applied in keyword recommendation systems for advertising, serving different kinds of advertisers, because of its expansibility and versatility.