Document indexing is an essential task performed by archivists or automatic indexing tools. To retrieve documents relevant to a query, the keywords describing each document have to be chosen carefully. Archivists have to identify the right topic of a document before starting to extract the keywords. For an archivist
A document surrogate is usually represented as a list of words. Because not all words in a document reflect its content, it is necessary to select the important words that relate to its content. Such important words are called keywords and are selected with a particular equation based on Term Frequency
With large databases of document images available, a method for users to find keywords in documents will be useful. One approach is to perform Optical Character Recognition (OCR) on each document followed by indexing of the resulting text. However, if the quality of the document is poor or time is critical, complete OCR
This paper proposes an extended vector space model (VSM) called M2VSM (meta keyword-based modified VSM). When the conventional VSM is applied to document clustering, it is difficult to adjust the granularity of clusters in terms of topic. To solve this problem, M2VSM considers meta keywords such as
needs. In this paper, we present the design, architecture and implementation of an open-source keyword-based paradigm for the search of software resources in Grid infrastructures, called Minersoft. A key goal of Minersoft is to annotate automatically all the software resources with keyword-rich metadata. Using advanced
, indexing is the only way of answering queries on a best-effort basis. To the best of our knowledge, no work has been done on providing relevant results to a query beyond those provided by indexing without manual efforts. The approach proposed in this paper stores relationships between keywords extracted from the user queries
This paper describes experiments on comparing audio clips based on spoken context. The spoken content is obtained using automatic speech recognition. The social tags that are available for most of the audio clips are used as keywords. These keywords are mapped to the spoken transcription representing the audio clips
Text chance discovery is the process of extracting an author's potential hidden issues from a large number of texts. To extract the main question keyword (i.e. the chance), we propose in this paper a framework for a text chance discovery system based on immune and multi-agent. By immunization and agent self-learning, this
ordinary users to use. In this paper, we propose EasyUI, a novel keyword-based user interface system that achieves web-scale data integration and is easy for ordinary users to use. Dealing with heterogeneity on the web scale presents many new challenges. We propose new methods to address these challenges, i.e., indexing schemata
In this paper, we present an ontology-based information extraction and retrieval system and its application to the soccer domain. In general, we deal with three issues in semantic search, namely usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of
Remote Electronic Document (CReED) provided access control for all documents, granting different privileges to each user of the system. It also utilized a keyword analyzer and result matcher to make searching and retrieving documents faster and easier. CReED used a scanner device and file importing tool to
generally have problems with keyword search. In this paper, we propose an initial model to solve the problem by using Case-Based Reasoning (CBR) and Formal Concept Analysis (FCA). For the proposed model, a case base is created to represent design patterns. FCA is used for case organization, analyzing the case base for
use of the system. The soccer videos are very suitable for our framework, since it is easy to find Web-cast match reports for soccer games. The annotated videos are stored in MPEG-7 format in an object-oriented database. The keyword-based indexing allows fast retrieval of video segments. The system accepts match reports
In a wide array of disciplines, data can be modeled as an interconnected network of entities, where various attributes could be associated with both the entities and the relations among them. Knowledge is often hidden in the complex structure and attributes inside these networks. While querying and mining these linked datasets are essential for various applications, traditional graph queries may not...
automatic transcription of a spoken document using a speech recognizer. The difficulty of this task is that the automatic transcription contains many recognition errors, so we cannot trust keywords extracted from the automatic transcription using a conventional method such as tf·idf. To solve this problem, we
With the rapid development of Internet technology, information resources on the Internet have become more abundant, but they also bring problems such as diversity, heterogeneity, disorder, and redundancy. Given only a brief expression such as search keywords, users' needs are ambiguous. Therefore, current technologies of search
in both Thai and English is built for helping users from a lot of keywords of the same term and (3) a set of keywords from herbal usages can be combined with the name keyword. From the results, information collected from KUIHerb is useful for searching.
Supporting complex and efficient lookup queries in peer-to-peer networks is challenging, though simple keyword-based lookup queries are well supported by most deployed systems. This paper presents a two-level indexing structure built on a distributed hash table (DHT), aiming to support range queries on high-dimensional
Reusing software components (e.g. classes or modules) improves software quality and developer productivity. Unfortunately, developers may miss many reuse opportunities since current keyword-based component search systems cannot provide reusable components if the developers do not use them. This paper proposes a