This paper provides a brief survey of semantic similarity, covering both semantic similarity between concepts and semantic textual similarity. We classify methods for semantic similarity between concepts into four categories based on the background information resource used, and likewise classify methods for semantic textual similarity into four categories. As a basic methodology of text-related research and applications, semantic...
Wikipedia infoboxes serve as an important structured information source on the web. To author an infobox for a particular article, volunteers require a considerable amount of manual effort to identify the appropriate infobox template. Thus, an automatic process to select an infobox template could be useful and beneficial for Wikipedia contributors. In this paper, we present a Natural Language Processing...
Text summarization is a way to condense a large amount of information into a concise form by selecting important information and discarding unimportant and redundant information. The need for text summarization has increased greatly due to the abundance of documents on the Internet. Even though many text summarization systems have been developed for summarizing documents in various languages,...
The paper addresses the phenomenon of direct object defaults in text as part of exploring the meaning of the unsaid and making it accessible to computer understanding. It describes a large but reasonably simple computer experiment based on one hypothesis about defaults, namely, that a true default may appear as a direct object only if modified, e.g., Bob ate fresh food, but ?Bob ate food. The...
Object recognition and clustering are major techniques in Pattern Recognition, Computer Vision, Artificial Intelligence and Robotics. Conventionally, these techniques are implemented with visual-feature-based methods or the cosine similarity (vector space) method, which uses semantic similarity among objects to solve these kinds of problems; however, this method suffers from two problems, synonymy and polysemy...
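To illustrate the synonymy problem the abstract raises: a bag-of-words cosine similarity (the classic vector space measure) assigns zero similarity to sentences that share meaning but no surface terms. The function name and example sentences below are illustrative, not drawn from the paper:

```python
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words term vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

# Synonymy: the two sentences mean nearly the same thing,
# but share no tokens, so the vector space score is 0.
print(cosine_similarity("the car sped away", "that automobile raced off"))
```

Polysemy causes the opposite failure: two sentences using the same word in different senses score artificially high.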
The recent growth in the variety and volume of information circulating on the Internet has prompted the emergence of a new paradigm in information extraction, namely Open Information Extraction (Open IE). An evaluation of several existing Open IE systems shows good performance on precision. However, improvement is still needed to boost recall. A relation between an entity pair in a simple sentence...
A troll is a user intent on sowing discord on the internet. We propose an approach to detect such users from the sentiment of the textual content in online forums. Since trolls typically express negative sentiments in their posts, we derive features from sentiment analysis, and use SVMrank to do binary and ordinal classification of trolls. With a small labeled training set of 20 users, we achieved...
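The excerpt does not give the paper's exact feature set, but a minimal sketch of sentiment-derived per-user features might look like the following. The tiny lexicon and feature choices here are hypothetical; the real system would use a proper sentiment analyzer and feed the features to SVMrank:

```python
# Hypothetical sentiment lexicons (a real system would use a full lexicon
# or a trained sentiment classifier).
NEGATIVE = {"hate", "stupid", "idiot", "awful", "trash"}
POSITIVE = {"great", "thanks", "helpful", "nice", "agree"}

def sentiment_features(posts: list[str]) -> list[float]:
    """Per-user features: fraction of negative posts, mean negative-word rate."""
    neg_posts = 0
    neg_rate = 0.0
    for p in posts:
        toks = p.lower().split()
        neg = sum(t in NEGATIVE for t in toks)
        pos = sum(t in POSITIVE for t in toks)
        if neg > pos:
            neg_posts += 1
        neg_rate += neg / len(toks) if toks else 0.0
    n = len(posts)
    return [neg_posts / n, neg_rate / n]
```

Feature vectors of this kind, one per user, would then be ranked or classified (binary troll/non-troll, or ordinal degrees of trolling) by the learner.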
As a means to share knowledge, the community question answering (CQA) service provides users a chance to obtain or provide help by raising or answering questions. After a question is posted, the system must find an appropriate individual to answer this question. Several approaches have recently been proposed to find experts in CQA. In this paper, a new method to find experts in CQA is proposed by...
Wikipedia pages associate key words and phrases in the text with other Wikipedia pages by linking them together, enabling the user to reach information more easily. In this study, we attempt to automate this linking system using Natural Language Processing techniques. Initially, the approach was designed for Turkish Wikipedia; then in the...
In recent years, many researchers have recognized Wikipedia as a huge, dynamic knowledge base. This paper provides a new approach for measuring the semantic relatedness of terms, which maps terms to relevant Wikipedia articles as background information for analysis. The proposed algorithm, WLA, focuses on the hyperlink structure and the summary paragraph extracted from the topic pages...
Evaluating how similar a pair of entities or documents is constitutes a common problem for current applications. Most approaches to this problem are based on co-occurrence. However, different terms or words may represent the same entity or similar semantics in the real world, since a concept often has more than one form of expression. Existing works usually focus on computing the semantic relatedness of...
With the fast growth of information available through the World Wide Web, search engines' ranking has become limited in dealing with such an enormous amount of information. Web search engines should be enriched with methodologies that enable them to understand the content of Web pages and then align pages to the query category that best matches their content. In this paper, a proposed system is...
Recently, cloud systems have spread widely across the Internet, so that we can reach huge amounts of complex information (mainly text) easily and quickly. However, we can hardly keep up with the changes inside, since we must go through the data to see quickly what is going on. This is why automatic labelling is indispensable. In this work, assuming a concept hierarchy, we extract suitable labels for...
In this investigation, we introduce new kinds of sentence similarity, called Euclid similarity and Levenshtein similarity, to capture both word sequences and semantic aspects. This is especially useful for Semantic Textual Similarity (STS), allowing us to retrieve SNS texts, short sentences, and texts containing collocations. We show the usefulness of our approach with experimental results.
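The excerpt does not define the paper's Levenshtein similarity precisely; a sketch under the usual assumptions (edit distance normalized by the longer string's length) is:

```python
def levenshtein(s: str, t: str) -> int:
    """Edit distance via dynamic programming (insert/delete/substitute)."""
    prev = list(range(len(t) + 1))  # distances from "" to each prefix of t
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]

def levenshtein_similarity(s: str, t: str) -> float:
    """Normalize the distance into a [0, 1] similarity score."""
    m = max(len(s), len(t))
    return 1.0 - levenshtein(s, t) / m if m else 1.0
```

Applied at the word level rather than the character level, the same recurrence captures word-sequence overlap between sentences.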
Microblog services have attracted much attention in big data analysis. Twitter statistics indicate that the average number of tweets per day exceeds 100 million, with many thousands posted every minute. Coping with such a large volume of collected tweets is an urgent challenge. This work defines an online query-focused Twitter summarization framework. It crawls and semantically indexes...
In this paper, a soft+hard data fusion model is proposed that is capable of combining data generated from human-based sources with data generated by physical sensors. The basis of this model is our previously introduced fuzzy extension to the Multi-Entity Bayesian Network (MEBN) language, which is a High-Level Information Fusion (HLIF) framework capable of expressing the semantic and causal relationships...
Computing technology consists of four elements: user devices, computer networks, servers and software. A change in any of these elements can cause computing technology to move from one phase to another. So far, computing technology has undergone five phases, starting from centralized computing to ubiquitous computing. We are moving into the sixth phase of computing technology, namely advanced ubiquitous...
It is a challenge to effectively organize web learning resources and to discover such resources to facilitate automatic collaboration and problem solving in a web-based learning context. We describe how to construct an intelligent learning network based on the semantic data model Learning Semantic Link Network (L-SLN) and propose a methodology to support emerging semantic learning in a semantic...
In this paper, we study user classification and service optimization for file sharing applications in P2P networks. A user classification model based on user interest similarity and time feature similarity is proposed, and a method for determining user similarity is given. By analyzing the component features of the user classification model, the results show that the model...
Cross-modal retrieval over multimedia containing text and images has attracted more and more attention from scholars. The difficulty of cross-modal retrieval lies in how to effectively construct correlations between multi-modal heterogeneous data. Based on canonical correlation analysis, most existing cross-modal methods embed the heterogeneous data into a joint abstraction space by linear projections...