facility. Automating the transcription of these documents using Optical Character Recognition (OCR) systems is also challenging due to the highly complex cursive nature of Urdu text. To overcome these limitations, a keyword-spotting-based information retrieval system for document images is introduced in this study. The proposed
Text analysis of a web page is more difficult than analysis of the text of a normal document due to the presence of additional information, such as HTML structure, styling codes, irrelevant text, and hyperlinks. In this paper, we propose an unsupervised method to extract keywords from a web page. The
of NIJC (NI Japan Center) are being developed, with one platform, called Visiome, already operating and publicly accessible at “http://www.platform.visiome.org”. Each of these platforms requires its own set of keywords that represent important terms covering its respective field of study. One important function of
results of keyword search on social photos. Social photos are photos that are posted on social media sites; they usually include the posting time and a text message as well as the photo itself. It is difficult to identify hot topics in the results of keyword search on social photos because a huge number of results are returned and they
In this paper, we propose a novel image search scheme: contextual image search with keyword input. It differs from conventional image search schemes. It consists of a three-step process: the first step is context extraction, to distinguish image entities of the same name; the second step is conceptualization, to convert
For each treatment plan, patient adherence can be managed, audited, and improved by the Patient Adherence Management System applying Intelligent Keyword (PAMSIK) featuring the use of intelligent keywords to navigate users to the target in-time knowledge and also leverage the collective power - peer learning to
This paper describes a system that conducts search result clustering for several thousands of Web pages, and elaborates cluster labels through keyword distillation. Keyword distillation is a method that properly handles spelling variations, transliterations, synonyms, inclusion relations and word ambiguity, using
This paper proposes an extended vector space model (VSM) called M2VSM (meta keyword-based modified VSM). When conventional VSM is applied to document clustering, it is difficult to adjust the granularity of clusters in terms of topic. To solve this problem, M2VSM considers meta keywords such as
Unstructured data are data that have no prearranged form or structure and consist largely of text. Typical sources of unstructured data include emails, reports, and telephone or messaging conversations. The main goal of this work is to extract keywords from a conversation using particle swarm optimization
Text classification is a useful task in text mining. Most researchers employ a single type of word weight in text classification. Here, we propose to build a keyword list by combining several word weights for rule-based multi-label text classification. In this research, we conducted experiments on the term
XML Keyword Search is a user-friendly information discovery technique, which is well-suited to schema-free XML documents. We propose a novel scheme for XML keyword search called XKLUSTER, in which a novel semantic-distance model is proposed to specify the set of nodes contained in a result. Based on this model, we
The Internet is becoming an increasingly important platform for everyday life and work. Keyword extraction is expected to help people quickly find hot spots on the web, since the keywords in a document provide important information about its content. In this paper, we propose to use text clustering
In this paper, we address the issue of how to overview the knowledge of a given query keyword. We especially focus on concerns of those who search for Web pages with a given query keyword, and study how to efficiently overview the whole list of Web search information needs of a given query keyword. First, we collect
Query-recommendation systems based on inputted queries have become widespread. These services are effective if users cannot input relevant queries. However, the conventional systems do not take into consideration the relevance between recommended queries. This paper proposes a method of obtaining related queries and clustering them by using the history of query frequencies in query logs. We define...
We present a new fitness measure B_W(D) for a text document D against a set of keywords W. The fitness evaluation forms a basic operation in information retrieval. The measure B_W(D) differs from other measures in that it accounts for both the frequency of the keywords and their
Keywords are indexed automatically for large-scale categorization corpora. Indexed keywords of more than 20 documents are selected as seed words, thus overcoming subjectivity of selecting seed words in clustering; at the same time, clustering is limited to particular category corpora and keywords indexed feature
We consider topic detection without any prior knowledge of category structure or possible categories. Keywords are extracted and clustered based on different similarity measures using the induced k-bisecting clustering algorithm. Evaluation on Wikipedia articles shows that clusters of keywords correlate strongly with
The behaviors of atomic species and nano-particles in the laser-ablated plume were investigated by two-dimensional laser-induced fluorescence (2D-LIF) and UV Rayleigh scattering (2D-RS) imaging techniques. The images observed in the ablation of silicon (Si) in a He ambient are presented for various processing conditions. The temporal behavior of Si2 molecules, which were generated by the...
In this paper, we address the issue of how to overview the knowledge of a given query keyword. We especially focus on concerns of those who search for Web pages with a given query keyword, and study how to efficiently overview the whole list of Web search information needs of a given query keyword. First, we collect Web
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10 by the strategic scientific research and experimental development program:
SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.