Two approaches to keyword extraction are in common use: one relies only on information about individual words, such as word frequency and TF-IDF; the other is based on the relationships between words. The relationship is usually described as word similarity, which is derived from a corpus (WordNet, HowNet) or a man-made thesaurus
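As an illustration of the first, single-word approach mentioned above, the following is a minimal sketch of TF-IDF scoring. The function name `tf_idf`, the toy documents, and the particular weighting (relative term frequency times the natural log of inverse document frequency) are assumptions chosen for illustration; the snippet does not say which variant the paper uses.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score terms per document. `docs` is a list of token lists.

    Returns one {term: score} dict per document, using
    tf = count / document length and idf = ln(N / document frequency).
    This is one common variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus, already tokenized.
docs = [
    "keyword extraction from text".split(),
    "text classification with keyword features".split(),
    "image annotation".split(),
]
scores = tf_idf(docs)
```

In this toy corpus, "extraction" occurs in only one document and therefore scores higher in the first document than "keyword", which occurs in two.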
This paper proposes a new keyword extraction method that uses bag-of-concept to extract keywords from Arabic text. The proposed algorithm uses a semantic vector space model instead of the traditional vector space model to group words into classes. The new method builds a word-context matrix in which synonymous words will be
Classical keyword extraction algorithms can hardly achieve both low computational complexity and high accuracy. An algorithm based on association rule mining is proposed in this paper. It adopts an improved FP-Growth algorithm to extract word co-occurrence information, and uses a similarity algorithm to eliminate
Unmanned Aerial Vehicles (UAVs) are increasingly popular. Because of the tremendous number of UAVs, especially recreational ones, regulating them has become a challenge. Protocol reverse engineering offers a way to understand and regulate drones effectively. Extracting keywords is an
appearance characteristics, so-called visual features. This paper proposes a method for clustering scientific documents based on visual features, called the VF-Clustering algorithm. Five kinds of visual features of documents are defined: body, abstract, subtitle, keyword and title. The thought of crossover and
were used as case studies. The textual contents of the marking schemes were transcribed into electronic documents using the same file format as the students' answers. The documents were pre-processed for stopword removal, and each keyword was stemmed to address morphological variations. N-gram terms (N=2, 3) were then
fragmented time during screen-based reading. The central idea is to use semantic analysis programs to extract an extensive set of information that describes keyword spotting. The auxiliary knowledge can then be used for deep reading. We discuss the strengths of our semantic analysis programs, namely, text extraction, name
This paper surveys Audio Information Retrieval (AIR) using a literature review and classification of articles from 1994 to 2010 with a keyword index and article abstract in order to explore how AIR methodologies and applications have developed during this period. Based on the scope of many papers and journals of AIR
In particular, as part of the interest router table, each k-bucket stores information about a certain number of peers that have high interest similarity. The query can be executed in the appropriate k-bucket by calculating interest similarity and interest keyword. Through mining the latent interest, we found that two peers having
There is no previous research that compares the results of k-means, CLOPE clustering and Latent Dirichlet Allocation (LDA) topic modeling algorithms for detecting trending topics on tweets. Since not all tweets contain hashtags, we considered three training data feature sets: hashtags, keywords and keywords + hashtags
keywords from a text and use the trained model to generate similarities among these keywords. Since the word2vec model maps the relations of terms into a semantic space, the similarity of the terms is given by cosine similarity of the vectors. We construct the graph of these terms and its adjacency matrix. Finally, spectral
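The similarity and graph-construction steps described in this snippet can be sketched as follows. This is only a sketch under stated assumptions: the hand-written two-dimensional `toy_vectors` stand in for embeddings from a trained word2vec model, and the `threshold` used to decide which edges to keep is hypothetical, since the snippet does not say how the graph is built. The spectral step that follows in the paper is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def adjacency(vectors, threshold=0.5):
    """Build a weighted adjacency matrix over the terms in `vectors`.

    An edge between two distinct terms carries their cosine similarity
    when it reaches `threshold` (an assumed cut-off); otherwise it is 0.
    """
    terms = list(vectors)
    matrix = [[0.0] * len(terms) for _ in terms]
    for i, a in enumerate(terms):
        for j, b in enumerate(terms):
            if i != j:
                sim = cosine(vectors[a], vectors[b])
                if sim >= threshold:
                    matrix[i][j] = sim
    return terms, matrix


# Hypothetical embeddings; a real run would look these up in a trained model.
toy_vectors = {
    "cluster": [1.0, 0.0],
    "group": [0.8, 0.6],
    "image": [0.0, 1.0],
}
terms, matrix = adjacency(toy_vectors)
```

With these toy vectors, "cluster" and "group" are connected, as are "group" and "image", while "cluster" and "image" are orthogonal and get no edge.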
Text classification is an important research topic for managing numerous electronic documents. Feature reduction is the key issue for text classification with high dimensional keywords. A document analysis method called discriminant coefficient was proposed to reduce features and achieve high precision text
-independent approach to extracting news stories from web pages is proposed, which is based on anchor text and is applicable to most websites. Experiments show that our approach performs well and is better than another approach we have found. Second, a domain-based method of representing events is proposed in which hundreds of keywords
vocabulary. A group-LASSO regularizer is used to drive as many feature weights to zero as possible. We evaluate the quality of the pruned vocabulary by clustering the data using the resulting feature subset. Experiments on the PASCAL VOC 2007 dataset using 5000 visual keywords resulted in around 80% reduction in the number of
obtained by our method are technical phrase frames, i.e., a word sequence that forms a complete technical phrase only after a technical word (or words) is put before and/or after it. We claim that our method is a useful tool for discovering important phrase logical patterns, which can expand query keywords for improving
Automatic image annotation is the process of assigning relevant keywords to images. It is considered a promising research area at present. An annotation of an image can be defined as information that describes the image in three ways, i.e., when these images were taken, what are the
a kernel-selected algorithm based on the lowest similarity; afterwards, we obtain the appropriate keywords to label the topic of each cluster. Finally, experiments on the 20 Newsgroups email dataset show the validity of our approach, and the experimental results also closely match the human-labeled clustering result.
Most of the algorithms proposed in the literature deal with the problem of digital image retrieval. To interpret the semantics of an image, many researchers use keywords as textual annotations. Concept recognition is a key problem in semantic information searching. In order to be effective and efficient, we propose a parallel
recognition of the query and the analysis of the structure of Web pages. Mutual information (MI) between keywords is used to recognize query phrases; useful information for the summary is mined from the structure of Web pages. Experimental results show that the improved algorithm performs better in terms of the speed and accuracy of
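The snippet does not state which form of mutual information is used to recognize query phrases. One common choice for this kind of phrase detection is pointwise mutual information (PMI) over co-occurrence counts, sketched below as an assumption rather than as the paper's method. The function name `pmi` and the toy query log are hypothetical.

```python
import math
from collections import Counter


def pmi(queries, w1, w2):
    """Pointwise mutual information (base 2) of two words.

    Probabilities are estimated as the fraction of queries containing
    each word, and the fraction containing both. Returns -inf when the
    two words never co-occur.
    """
    n = len(queries)
    word_counts = Counter()
    pair_count = 0
    for query in queries:
        tokens = set(query)
        word_counts.update(tokens)
        if w1 in tokens and w2 in tokens:
            pair_count += 1
    if pair_count == 0:
        return float("-inf")
    p_pair = pair_count / n
    p_w1 = word_counts[w1] / n
    p_w2 = word_counts[w2] / n
    return math.log2(p_pair / (p_w1 * p_w2))


# Hypothetical query log, already tokenized.
queries = [
    ["new", "york", "hotel"],
    ["new", "york", "weather"],
    ["cheap", "hotel"],
    ["weather", "today"],
]
phrase_score = pmi(queries, "new", "york")
loose_score = pmi(queries, "new", "hotel")
```

In this toy log, "new" and "york" always appear together and get a positive score, whereas "new" and "hotel" co-occur no more often than chance would predict, which is the kind of contrast a phrase recognizer could threshold on.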