database, cannot be applied to Web search. We propose a new method to support Web query refinement. Our method is based on local analysis, which clusters the search results. Unlike other clustering-based approaches, we take into consideration the distance between keywords and guarantee no information loss. A Web search
Traditional keyword-based document clustering techniques have limitations due to simple treatment of words and hard separation of clusters. In this paper, we introduce named entities as objectives into fuzzy document clustering, which are the key elements defining document semantics and in many cases are of user
Keyword (Feature) selection enhances and improves many Information Retrieval (IR) tasks such as document categorization, automatic topic discovery, etc. The problem of keyword selection is usually solved using supervised algorithms. In this paper, we propose an unsupervised approach that combines keyword selection and
This paper presents the comparison of the text document space dimension reduction and the text document clustering and also the keyword space dimension reduction and keyword clustering by the latent semantic analysis and by the Hebbian neural network with Oja learning rule. Results of this neural network are compared
Traditional Web search engines mostly adopt a keyword-based approach. When the keyword submitted by the user is ambiguous, the search result usually consists of documents related to various meanings of the keyword, while the user is probably interested in only one of them. In this paper we attempt to provide a solution to
With the increasing number of Web documents on the Internet, the most popular keyword-matching-based search engines, such as Google, often return a long list of search results ranked by their relevance and importance to the query. Clustering the search engine results can help users find the results in several
Since keyword-based search engines usually return a large number of results, among which are many unrelated documents and many documents with the same content, automatic clustering technology is used to classify the retrieval results. While there are a large number of Web retrieval results, the clustering process usually
We introduce a new method for discovering latent topics in sets of objects, such as documents. Our method, which we call PARIS (for Principal Atoms Recognition In Sets), aims to detect principal sets of elements, representing latent topics in the data, that tend to appear frequently together. These latent topics, which we refer to as 'atoms', are used as the basis for clustering, classification, collaborative...
A simple search keyword usually returns millions of search results. The result count may appear impressive, but at the same time it confuses users. Users usually do not wish to browse through millions of entries. This paper proposes a query refinement method by iterative clustering of information from the Web page
This correspondence presents a novel hierarchical clustering approach for knowledge document self-organization, particularly for patent analysis. Current keyword-based methodologies for document content management tend to be inconsistent and ineffective when partial meanings of the technical content are used for
metadata, our approach is able to handle a wide range of forms, including content-rich forms that contain multiple attributes, as well as simple keyword-based search interfaces. An experimental evaluation over real Web data shows that our strategy generates high-quality clusters - measured both in terms of entropy and F
With the increasing amount of unstructured content available electronically on the web, content categorization becomes very important for efficient information retrieval. The basic approaches for information retrieval in text documents are searching using keywords, categorization of the documents and filtering out the