In this paper, we propose two methods to measure the semantic similarity for multi-lingual and short texts by using Wikipedia. In recent years, people around the world have been continuously generating information about their local area in their own languages on social networking services. Measuring the similarity between the texts is challenging because they are often short and written in various...
Sentence similarity methods are used to assess the degree of similarity between phrases. Many natural language applications such as text summarization, information retrieval, text categorization, and machine translation employ measures of sentence similarity. The existing approaches for this problem represent sentences as bag-of-words vectors or by the syntactic information of the words in the phrase...
Correctly interpreting human instructions is the first step to human-robot interaction. Previous approaches to semantically parsing the instructions relied on large numbers of annotated training examples to cover all words in a domain. Annotating a large enough set of instructions with semantic forms requires exhaustive engineering effort. Hence, we propose propagating the semantic lexicon to learn...
In this paper, we present an approach that extracts attributes of open-domain named entities for the Chinese language. The approach contains two steps. The first step consists in an unsupervised technique which captures high frequency attributes from online encyclopedias. The second step discovers uncommon attributes with low frequency. Lastly, an integrated framework is proposed to obtain attributes...
With the popularization and development of network technology worldwide, the Web is acting as a platform for the diffusion and evolution of social events. However, faced with huge, disordered, and continuous web resources, it is impossible for people to efficiently recognize, collect, and organize the events. Therefore, it has become a hot research field to automatically collect and organize the information...
The quest for information in the contemporary world ends at search engines that crawl millions of web pages on the World Wide Web and it is clearly essential that the results should be ranked in an order that would best fit the user interests. This paper proposes a method of re-ranking the search results that have been primarily ranked using either conventional algorithms that use link structure and...
In this paper, a refined template selection algorithm is proposed for NR-based image feature extraction. The proposed method aims to reduce computational complexity and to improve recognition ability. Experimental results on several image databases demonstrate the efficiency and effectiveness of the proposed method.
Internet usage has seen tremendous growth. As English is the primary language, documents are mostly available in English. In India, Hindi is the prevalent language, and users want to access data in Hindi. Language processing requires determining the exact sense of a polysemous word by interpreting its meaning in a particular context. To disambiguate the meaning of the polysemous...
This paper addresses the problem of learning semantic compact binary codes for efficient retrieval in large-scale image collections. Our contributions are three-fold. Firstly, we introduce semantic codes, of which each bit corresponds to an attribute that describes a property of an object (e.g. dogs are furry). Secondly, we propose to use matrix factorization (MF) to learn the semantic codes by encoding...
Semantic relation acquisition is a crucial task in the field of knowledge acquisition. This paper presents a method that acquires semantic relation patterns from large-scale microblog text. It first analyzes the characteristics of microblog text and gives an algorithm for acquiring patterns semi-automatically. Semantic relations are extracted from a microblog corpus based on concept recognition and pattern...
As cloud computing becomes popular, sensitive data need to be encrypted before outsourcing, which renders traditional data utilization ineffective. Searching over these encrypted documents is therefore a very challenging task. This paper first describes the framework of searchable encryption, and then presents the existing keyword-based search technologies. Finally, we propose a novel color-based keyword...
Recently, the problem of information leakage from important paper-based documents, caused by scanning, copying, and photographing, has become more and more serious. To deal with it, great progress has been made in text watermarking algorithms based on special font libraries. However, all the current font libraries are created manually with professional software with very low efficiency...
There are two main strategies to tackle scene classification: holistic and semantic. The former characterizes a scene using its global features, while the latter represents a scene by modeling its internal object configuration. The holistic strategy is good at representing scenes with simple contents, but it does not represent complex scenes that consist of multiple objects well. By contrast, semantic...
The Semantic Web services (SWs) paradigm is considered the most dominant technology of Service-Oriented Computing (SOC). SWs have emerged as a major technology for deploying automated interactions between distributed and heterogeneous applications. This computing technology can be used to discover new distributed and heterogeneous collaborative applications of large-scale distributed systems in...
We present JolokiaC++, an annotation-based compiler framework which generates high-quality CUDA (Compute Unified Device Architecture) code for GPUs. Our contributions include: (1) developing explicit and implicit annotations with illustrations of their use in C++, (2) showing the utility of these annotations by providing comparison code snippets, which demonstrate the ease of programming and performance...
In this paper, we present an accelerated knowledge-driven content-based information mining system for Big Earth Observation data fusion. The tool combines, at pixel level, the unsupervised clustering results of different numbers of features. The features, extracted from different EO raster image types and from existing GIS vector maps, are combined, in the form of a BoW, with user-given semantic concepts...
Short text semantic similarity (STSS) measures are algorithms designed to compare short texts and return a level of similarity between them. However, until recently such measures have ignored perception-based or fuzzy words (e.g. very hot, cold, less cold) in calculations of both word and sentence similarity. Evaluation of such measures is usually achieved through the use of benchmark data sets comprising...
Ontologies have proven their utility in the area of Information Retrieval. However, building and updating ontologies manually is a long and tedious task. Moreover, crisp ontologies are not capable of supporting uncertain information. One interesting solution is to integrate fuzzy logic into ontologies to handle vague and imprecise information. This paper presents a method for individual fuzzy ontology...
This paper proposes a music auto-tagging system based on probabilistic annotation of semantically meaningful tags with variable feature sets. The perception-related long-term features are extracted. The original features are selected by a combination algorithm of ReliefF and principal component analysis (PCA) to form a variable unique feature subset for each tag. The Gaussian mixture models (GMMs)...
In the past, several automatic video summarization systems have been proposed to generate video summaries. However, a generic video summary that is generated based only on audio, visual, and textual saliencies will not satisfy every user. This paper proposes a novel system for generating semantically meaningful personalized video summaries, which are tailored to the individual user's preferences over...