A document surrogate is usually represented as a list of words. Because not all words in a document reflect its content, it is necessary to select the important words that relate to its content. Such important words are called keywords and are selected with a particular equation based on Term Frequency
Consider an information repository whose content is categorized. A data item (in the repository) can belong to multiple categories and new data is continuously added to the system. In this paper, we describe a system, CS*, which takes a keyword query and returns the relevant top-K categories. In contrast, traditional
The Internet is becoming an increasingly important platform for everyday life and work. Keyword extraction is expected to help people quickly find hot spots on the web, since the keywords in a document provide important information about its content. In this paper, we propose to use text clustering
Keywords can be considered condensed versions of documents and can play an important role in text processing tasks such as text indexing, summarization and categorization. However, many digital documents, especially on the Internet, do not have a list of assigned keywords. Assigning keywords to
Relevance feedback techniques have been studied in the field of document retrieval, aiming to generate appropriate queries for users' information needs. Conventional relevance feedback techniques are performed on document space, while the resultant queries should be represented in keyword space. In this paper
methods for Indonesian corpus is rather small. Bracewell's algorithm has been proven effective in identifying topics in English and Japanese corpora with high accuracy. This paper implements a method for TID based on Bracewell's keywords similarity algorithm and the top-n keywords selection for Indonesian news documents
classification/clustering as features. Also, this approach can be applied in keyword recommendation system in advertisement for different kinds of advertisers because of its expansibility and versatility.
for patent queries because the inherent search systems come from traditional keyword-based models, which inevitably lead to too many unrelated items in the search results. Consequently, these systems cost the patent experts lots of time to iteratively refine search results manually. In this paper, we propose a
prioritized automatically by combining keyword-based information retrieval and descriptive statistics. Moreover, we show in an evaluation that the obtained results are reasonable.
topic and their attributes have to be shared to have a more accurate estimation of the global classifier at every node. The network traffic should be kept at a minimum to reduce costs. We propose a probabilistic model for a keyword selection method, which makes a more thorough analysis possible and can be considered as a
Topic tracking follows the trend of a news topic that people are interested in. It is a very pragmatic method in information retrieval. Compared with keyword retrieval, topic tracking excels in dynamic tracking based on text model and its content understanding, so it is mostly involved in text expressing and semantic
Traditional information retrieval (IR) systems evaluate user queries and retrieve/rank documents based on matching keywords in user queries with words in documents. These exact word-matching and ranking approaches ignore too many relevant documents that do not contain the exact keywords as specified in a user query
propose n keywords, in order to optimise the information gain expectation. Its implementation, CFAsT, endeavours to keep the best from both worlds: the universality and automatic generation from search engines, and the usability, the assistance and the self optimisation provided by the dialogue systems. Thus, a beta dialogue