Simple keyword-based searches are ubiquitous in today's internet age. It is hard to imagine an information system today that does not permit a simple keyword-based search. This method of information retrieval has the obvious benefits of being highly interpretable and widely used. However, a general perception
Keyword (Feature) selection enhances and improves many Information Retrieval (IR) tasks such as document categorization, automatic topic discovery, etc. The problem of keyword selection is usually solved using supervised algorithms. In this paper, we propose an unsupervised approach that combines keyword selection and
sense discovery problem. Given a query and a list of result pages, our unsupervised method detects word sense communities in the extracted keyword network. The documents are assigned to several refined word sense communities to form clusters. We use the modularity score of the discovered keyword community structure to
appearance characteristics, so-called visual features. This paper proposes a method to cluster scientific documents based on visual features, the so-called VF-Clustering algorithm. Five kinds of visual features of documents are defined, including body, abstract, subtitle, keyword and title. The thought of crossover and
In document categorization methods that use similarity measures based on word vectors, it is important to determine the key words that characterize each document. However, conventional methods select the key words based on their frequency and/or a particular importance index such as tf-idf. In this paper, we propose a method to characterize each document by using temporal clusters of technical term usages...
Document clustering groups documents according to certain semantic features defined on the document set for measuring the similarity between two documents. Keyword models such as the TFIDF model of a document have been widely used as features for document clustering. But it lacks semantic structure
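The TFIDF keyword model mentioned in the abstracts above can be illustrated with a minimal sketch. It assumes documents are already tokenized, uses relative term frequency with an unsmoothed logarithmic inverse document frequency, and compares documents by cosine similarity. The function names and the toy corpus are hypothetical and are not taken from any of the listed papers.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document.

    Assumption: weight = (term count / document length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical), already split into tokens.
docs = [
    "keyword search retrieval".split(),
    "keyword selection retrieval".split(),
    "visual features clustering".split(),
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # documents sharing terms
print(cosine(vecs[0], vecs[2]))  # documents with no shared terms
```

A term that occurs in every document receives weight zero under this weighting, which is one reason practical implementations often smooth the inverse document frequency.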
from these data collections. KeyGraph is a word co-occurrence based algorithm for topic modeling. We provide an extension of the KeyGraph algorithm by incorporating WordNet hypernyms for keywords in the data collection. Our results show that incorporating hypernyms into the KeyGraph algorithm would result in improved topic and
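As a rough illustration of what a word co-occurrence based approach starts from, the sketch below counts how often two words appear in the same document and keeps only the pairs that reach a threshold as graph edges. It is not the KeyGraph algorithm itself, and it does not include the WordNet hypernym extension. The function names, the `min_count` parameter, and the toy corpus are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered word pair, the documents containing both."""
    counts = Counter()
    for doc in docs:
        # Sort the distinct words so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(doc)), 2):
            counts[(a, b)] += 1
    return counts


def build_graph(counts, min_count=2):
    """Build an undirected adjacency map from pairs seen at least min_count times."""
    graph = {}
    for (a, b), count in counts.items():
        if count >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


# Toy corpus (hypothetical), already split into tokens.
docs = [
    ["topic", "model", "graph"],
    ["topic", "model"],
    ["graph", "word"],
]
counts = cooccurrence_counts(docs)
graph = build_graph(counts, min_count=2)
print(graph)
```

Connected groups of words in such a graph are the kind of structure that co-occurrence methods then interpret as candidate topics.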