Scientific documents are unstructured natural-language data that are hard for scientists to read and manage. Keywords are very helpful for scientists to search for related documents and quickly grasp their contents. In this paper we investigate a kind of data preprocessing technique used in SVM
Due to the huge number of research articles in the biomedical domain, it is becoming increasingly important to develop methods for finding articles relevant to our specific research interests. Keyword extraction is a useful method to find important topics from documents and summarize their major information. Unfortunately
needs. In this paper, we present the design, architecture and implementation of an open-source keyword-based paradigm for the search of software resources in Grid infrastructures, called Minersoft. A key goal of Minersoft is to automatically annotate all software resources with keyword-rich metadata. Using advanced
be easily extracted, building respective data banks. Keywords are important terms, sometimes called index terms, that contain some kind of valuable information about the document. Automatic keyword extraction is the task of identifying a small set of words that can be designated as keywords for a document, and
interest areas coinciding with the related book categories. This paper suggests that bloggers' interests can be known through extracting keywords from blog entry titles and using book classification schemes. Because there were instances in which the keywords alone did not provide adequate information, the Naver (Korean
clustering genes is done in two steps: First, keywords corresponding to all genes of interest from a subset of MEDLINE database were extracted automatically using TF-IDF and Z-scores. In the second step, the classic K-means algorithm was used to group genes into clusters of genes based on the keyword features.
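The first step of the pipeline above, TF-IDF keyword scoring, can be sketched in pure Python. The toy documents and the `top_k` parameter are illustrative assumptions, not the study's actual MEDLINE data, and the Z-score refinement is omitted:

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=3):
    """Score each term by TF-IDF and return the top_k keywords per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    results = []
    for toks in tokenized:
        tf = Counter(toks)
        # TF-IDF: term frequency times log of inverse document frequency.
        scores = {t: (tf[t] / len(toks)) * math.log(n / df[t]) for t in tf}
        results.append([t for t, _ in sorted(scores.items(), key=lambda x: -x[1])[:top_k]])
    return results

docs = [
    "gene expression regulates cell cycle",
    "protein folding misfolding disease",
    "gene mutation causes disease risk",
]
print(tfidf_keywords(docs, top_k=2))
```

Terms that occur in many documents (here "gene" and "disease") receive low IDF and are pushed out of the top ranks; the resulting keyword lists can then feed a clusterer such as K-means.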
The World Wide Web has become a huge repository of data of interest for a variety of application domains. However, the same features that have made the Web so useful and popular also impose important restrictions on the way the data it contains can be manipulated. Particularly, in the traditional Web scenario, there is an inherent difficulty in gaining access to data that is implicitly present in...
Assigning keywords to articles can be extremely costly. In this paper we propose a new approach to biomedical concept extraction using semantic features of concept graphs to help in automatic labeling of scientific publications. The proposed system extracts key concepts similar to author-provided keywords. We
With the development of the Internet, web information is growing rapidly, and filtering the information users want quickly and accurately has become a major problem. However, the recall rate and precision of traditional keyword-based search systems still need improvement. Kam-so, the user-interest collaborative filtering model
By combining data-driven and keyword-driven technologies and using XML format to store testing data, this paper shows how to design and implement a GUI automated testing framework with strong reusability, expandability and robustness. The separation of scripts, data and business logic divides personnel into framework
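The separation the snippet above describes, keyword-driven test steps stored as XML data, dispatched by a reusable framework, can be sketched minimally; the keyword names, XML schema, and stub actions here are invented for illustration, not the paper's framework:

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword -> action mapping; real frameworks would drive a GUI here.
ACTIONS = {
    "open": lambda target, value: f"opened {target}",
    "type": lambda target, value: f"typed '{value}' into {target}",
    "click": lambda target, value: f"clicked {target}",
}

def run_steps(xml_text):
    """Read keyword-driven test steps from XML and dispatch each keyword
    to its action, keeping test data separate from script logic."""
    root = ET.fromstring(xml_text)
    log = []
    for step in root.findall("step"):
        keyword = step.get("keyword")
        target = step.get("target", "")
        value = step.get("value", "")
        log.append(ACTIONS[keyword](target, value))
    return log

steps = """<testcase>
  <step keyword="open" target="login_page"/>
  <step keyword="type" target="username" value="alice"/>
  <step keyword="click" target="submit"/>
</testcase>"""
print(run_steps(steps))
```

Because the XML carries only data and keywords, testers can add or reorder steps without touching framework code, which is the reusability the abstract claims.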
Library search systems that use "keywords" are very useful for people who know what they are looking for; however, this system is not helpful for those who do not have specific knowledge of what they are looking for. The purpose of this research is to propose a Kansei-library search system. Based on observing how
Google Scholar is one of the major academic search engines but its ranking algorithm for academic articles is unknown. In recent studies we partly reverse-engineered the algorithm. This paper presents the results of our third study. While the first study provided a broad overview and the second study focused on researching the impact of citation counts, the current study focused on analyzing the correlation...
This paper starts by addressing common automatic methods of ontology construction. Then, from the viewpoint of military intelligent processing, a two-level domain ontology architecture is designed. One level is the keyword ontology; the other is the instance ontology. Each level has a different
, such as removing noise words and stop words for word analysis in the next stage. Next, we extract entity words, marking nouns that may be topic entities. Then we mine keywords, identifying verbs or adjectives that may be the topic's keywords. Besides, we mine popular topics according to the entity words and keywords to
visualization. Testing shows an 89% success rate for keyword extraction using the RIDF term-weighting method and collecting messages by category. A general topic about the governor election and 13 subtopics were successfully extracted from a dataset about flooding in Jakarta.
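RIDF (residual IDF), the term-weighting method mentioned above, scores a term by how much its observed IDF exceeds the IDF a Poisson model would predict from its collection frequency, so bursty topical terms outrank evenly spread function words. A minimal sketch, with made-up messages standing in for the paper's actual data:

```python
import math
from collections import Counter

def ridf_scores(docs):
    """Residual IDF: observed IDF minus the IDF expected under a Poisson
    model. Terms that clump in a few documents (topical keywords) score
    higher than terms spread evenly across documents (function words)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter()  # document frequency
    cf = Counter()  # collection frequency (total occurrences)
    for toks in tokenized:
        df.update(set(toks))
        cf.update(toks)
    scores = {}
    for t in df:
        idf = math.log2(n / df[t])
        # IDF a Poisson model predicts from the term's average rate cf/n.
        expected_idf = -math.log2(1 - math.exp(-cf[t] / n))
        scores[t] = idf - expected_idf
    return scores

docs = [
    "flood flood flood hits jakarta the river",
    "the election results are in",
    "the weather is calm",
    "the market opened",
]
scores = ridf_scores(docs)
print(max(scores, key=scores.get))
```

Here "flood" occurs three times but only in one message, so its observed IDF far exceeds the Poisson expectation, while "the" appears in every message and scores below zero.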
Keyword-based search engines are becoming increasingly sophisticated, and yet navigating the ever-increasing collection of academic knowledge remains an arduous task. Keeping abreast of relevant scientific literature is often a fragmented process that breaks the workflow of academic writing.
Current grid information retrieval commonly uses keyword-based methods, which match keywords by their surface character form rather than by the concepts they express, neglecting the semantic information inherent in the words. This brings about
digital library based on topic or concept features. Firstly, documents in a specific domain are automatically obtained by a document classification approach, which integrates rule-based and statistical methods to classify documents in the large-scale collection. Then, the keywords of each document are extracted using the
This paper introduces a new approach to building a topic digital library using concept extraction and document clustering. Firstly, documents in a specific domain are automatically obtained by a document classification approach. Then, the keywords of each document are extracted using a machine learning approach. The
to search and retrieve components. The proposed technique helps re-users identify and retrieve software components. In its first step, it matches keywords, their synonyms, and their interrelationships. It then uses ant colony optimization, a probabilistic approach, to generate rules for matching the component against