We propose a fully automatic method for summarizing and indexing unstructured presentation videos based on text extracted from the projected slides. We use changes of text in the slides as a means to segment the video into semantic shots. Unlike previous approaches, our method does not depend on the availability of the electronic source of the slides, but rather extracts and recognizes the text directly...
platform, N-gram and word co-occurrence statistical analysis are combined to carry out a Chinese keyword extraction experiment. First, candidate keywords are extracted with a bi-gram model. Then, a set of co-occurrences between every word in the bi-grams and the frequent words is generated. Co-occurrence distribution shows importance
This paper proposes an extended vector space model (VSM) called M2VSM (meta keyword-based modified VSM). When a conventional VSM is applied to document clustering, it is difficult to adjust the granularity of clusters in terms of topic. To solve this problem, M2VSM considers meta keywords such as
Keyword indexing is widely used in natural language processing. This paper proposes an unsupervised keyword indexing method based on PageRank and HowNet. In this method, free text is first represented as a sememe graph, with sememes as vertices and the relatedness of sememes, derived from HowNet, as weighted edges. Then UW
huge irrelevant search hits. In this paper, we propose an improved method for ranking search results to reduce the human effort of locating interesting hits. The search results are re-ranked using adaptive user interest hierarchies (AUIH), which consider both investigator-defined keywords and user interest learnt from
-specific keywords. An automated profiling algorithm is proposed for this purpose, which starts from generic/noisy reviewer profiles extracted using Google Scholar and derives custom conference-centric reviewer and paper profiles. Each reviewer is an expert on a few sub-topics, whereas the pool of reviewers and the conference may
Text chance discovery is the process of extracting an author's potential hidden issues from a large number of texts. To extract the main question keyword (i.e. the Chance), we propose in this paper a framework for a text chance discovery system based on immune and multi-agent techniques. By immunization and agent self-learning, this
events in soccer video using on-screen texts. The proposed approach is completely automatic and language-independent, since it invites users to query events through keywords in image form, which represent clusters of stationary on-screen text boxes that are localized and extracted by a novel mechanism
mechanisms with a traditional indexing method. The goal is to identify higher semantic content and more meaningful keyword combinations, considering both supervised and unsupervised techniques. Within a specific implementation, both Bayesian learning and clustering are integrated to support a boost parameter towards
Enterprise-scale search engines are generally designed for linear text. Linear text is suboptimal for audio search, where accuracy can be significantly improved if the search includes alternate recognition candidates, commonly represented as word lattices. We propose two methods to enable text indexers to approximately index lattices with little or no code change: "TMI" (Time-based Merging...
in both Thai and English is built for helping users from a lot of keywords of the same term and (3) a set of keywords from herbal usages can be combined with the name keyword. From the results, information collected from KUIHerb is useful for searching.
taken into account when indexing documents and when searching. With this approach, user queries can be expressed in natural language. In many cases, this is a more natural way for users to describe their information needs than keyword-style queries. The factoid question answering task is one
We present a simple yet novel technique for prefix search in P2P networks. The idea is to extract the characters of a keyword, together with their position information, to index objects. Our index scheme can achieve quite balanced loads, avoid hot spots and single points of failure, reduce storage and maintenance costs, and offer some
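The idea of indexing by characters and their positions can be illustrated with a minimal single-process sketch. This is not the authors' scheme: the class `CharPositionIndex` and its methods are hypothetical names, and a plain in-memory dictionary stands in for the P2P network, where each (position, character) key would presumably be assigned to a peer. A prefix query intersects the keyword sets stored under each (position, character) pair of the prefix.

```python
from collections import defaultdict


class CharPositionIndex:
    """Hypothetical in-memory sketch of a (position, character) keyword index."""

    def __init__(self):
        # (position, character) -> set of keywords having that character there
        self._postings = defaultdict(set)
        # keyword -> set of object ids published under that keyword
        self._objects = defaultdict(set)

    def add(self, keyword, obj_id):
        """Register obj_id under keyword, one posting per character position."""
        for pos, ch in enumerate(keyword):
            self._postings[(pos, ch)].add(keyword)
        self._objects[keyword].add(obj_id)

    def prefix_search(self, prefix):
        """Return ids of objects whose keyword starts with prefix."""
        if not prefix:
            return set()
        matching = None
        for pos, ch in enumerate(prefix):
            keywords = self._postings.get((pos, ch), set())
            matching = set(keywords) if matching is None else matching & keywords
            if not matching:
                return set()
        result = set()
        for keyword in matching:
            result |= self._objects[keyword]
        return result


index = CharPositionIndex()
index.add("cat", "o1")
index.add("car", "o2")
index.add("dog", "o3")
index.add("scar", "o4")
found = index.prefix_search("ca")
```

Because every key covers only one character at one position, the postings are spread over many small keys rather than a few popular keywords, which is one plausible reading of the load-balancing claim in the abstract.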
In this paper, we present a method we implemented to help a user index documents (and, in particular, learning objects) according to a given set of concepts (terms referring to domains or topics). The user first associates keywords with the concepts. Our method uses such associations to suggest simple rules for indexing
Topic tracking follows the trend of news topics that people are interested in. It is a very pragmatic method in information retrieval. Compared with keyword retrieval, topic tracking excels in dynamic tracking based on a text model and an understanding of its content, so it is mostly involved in text expressing and semantic
Web page classification plays an essential role in facilitating more efficient information retrieval and information processing. Conventionally, web text documents are represented by a term-frequency matrix for classification purposes. However, considering the limitations of representing documents using terms or keywords
index texts. The traditional BOW matrix is replaced by a "Bag of Concepts" (BOC). For this purpose, we developed fully automated methods for mapping keywords to their corresponding ontology concepts. A support vector machine, a successful machine learning technique, is used for classification. Experimental results show that
term-by-document matrix, it inevitably loses the information about relations between query terms in the document in the first place. This paper presents a modified vector space model for measuring similarity between the query and the document when responding to a multi-term query. More weight is assigned to the keywords