The PASCAL VOC Segmentation Challenge [10] is currently considered one of the datasets that reflect the difficulties of image segmentation in real-world scenarios [29]. However, current evaluation is based solely on a single Intersection over Union (IOU) score. In this paper, we try to uncover the error factors behind the IOU, which makes the results more informative and easier to understand, rather than a black...
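For readers unfamiliar with the metric named in the abstract above, the following is a minimal sketch of a per-class Intersection over Union computation on flattened label maps. The function name `iou` and the list-based inputs are assumptions made for illustration; this is not the paper's evaluation code and not the official PASCAL VOC evaluation script.

```python
def iou(pred, truth, label):
    """Intersection over Union for one class label.

    pred and truth are equal-length sequences of per-pixel class labels
    (a hypothetical flattened representation of two segmentation maps).
    """
    intersection = 0
    union = 0
    for p, t in zip(pred, truth):
        in_pred = (p == label)
        in_truth = (t == label)
        if in_pred and in_truth:
            intersection += 1  # pixel carries the label in both maps
        if in_pred or in_truth:
            union += 1         # pixel carries the label in at least one map
    # If the label appears in neither map, the score is undefined.
    return intersection / union if union else float("nan")


pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(iou(pred, truth, 1))  # 1 shared pixel out of 2 labelled pixels: 0.5
```

A single number like this hides whether the error comes from missed pixels or from spurious ones, which is the kind of limitation the abstract points to.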
As the possibilities of Internet usage expand, the number of its users grows as well. People increasingly use it to communicate with one another. This kind of communication plays a significant role in the decision-making process. Based on this finding, a need arose to analyze the content of the ample web discussions (so-called conversational content) by computer. Therefore,...
Social media data consists of feedback, critiques and other comments that are posted online by internet users. Collectively, these comments may reflect sentiments that are sometimes not captured in traditional data collection methods such as administering a survey questionnaire. Thus, social media data offers a rich source of information, which can be adequately analyzed and understood. In this paper,...
With advances in communication and technology, the World Wide Web is becoming an important and rich source of information. The amount and variety of information available make customization and personalized recommendations of utmost importance. In this paper, we present a framework for next-page prediction that exploits a user's access history combined with their semantic interests to generate...
In the past few years, instant messaging (IM) has been widely used in daily communication. However, due to the dispersion of topics and meaningless chatting, online IM groups are filled with useless messages. In order to help IM users capture what the IM group is talking about without reading all the messages, topic discovery in instant messages becomes a significant but challenging research task...
In this paper, we address the problem of automatic speech summarization on open-domain TED talks. The large vocabulary and the diversity of topics from speaker to speaker present significant difficulties. The challenges include not only how to handle disfluencies and fillers, but also how to extract topic-related meaningful messages within the free talks. Here, we propose to incorporate semantic and...
This paper presents an efficient image exploration scheme for the unshaped object using semantic modelling. The local regions of an image have been classified with respect to the frequency of occurrences. The semantic concept is evaluated using RGB histogram dissimilarity factor, overall dissimilarity factor and regional dissimilarity factor. The dissimilarities determine the local concept with accuracy...
We introduce a novel method, named S-CRP (Segmentation based on distance dependent Chinese Restaurant Process), to segment broadcast sports videos into semantic shots. S-CRP employs distance dependent Chinese Restaurant Process (DCRP) using two segmentation criteria, namely appearance and time distances. It takes advantage of the customer (frame) assignments in DCRP and is able to reduce the negative...
SenticNet is a popular resource for concept-level sentiment analysis. However, because SenticNet was created specifically for opinion mining in English, its localization can be very laborious. In this work, a toolkit for creating non-English versions of SenticNet in a time- and cost-effective way is proposed. This is achieved by exploiting online facilities such as Web dictionaries and...
With the growth of the Internet community, textual data has proven to be the main tool of communication in human-machine and human-human interaction. This communication is constantly evolving towards the goal of making it as human and real as possible. One way of humanizing such interaction is to provide a framework that can recognize the emotions present in the communication or the emotions of the...
Cross-domain learning is a very promising technique to improve classification in the target (testing) domain whose data distributions are very different from the source (training) domain. Many cross-domain text classification methods are built on topic modeling approaches. However, topic model methods are unsupervised in nature without fully utilizing the label information of the source domain. In...
Humans interact with each other using different communication modalities, including speech, gestures and written documents. When one modality is absent or noisy, the other modalities can improve the precision of systems. HCI systems can also benefit from these multimodal communication models for different machine learning tasks. The provision of multiple modalities is motivated by...
Work on training semantic slot labellers for use in Natural Language Processing applications has typically either relied on large amounts of labelled input data, or has assumed entirely unlabelled inputs. The former technique tends to be costly to apply, while the latter is often not as accurate as its supervised counterpart. Here, we present a semi-supervised learning approach that automatically...
Computer-aided diagnosis systems can provide additional opinions that serve as an aid to radiologists in the early detection of lung nodules. Previous CAD models have relied on radiologist-delineated contours to extract image features and classify lung nodules into semantic ratings. Manually creating these contours can be time-consuming and expensive. This paper proposes a different CAD system based...
The traditional support vector machine treats all samples with the same weight and is therefore very sensitive to noisy data. The fuzzy support vector machine, by contrast, assigns lower weights to samples that contribute little to classification, which helps reduce the effect of noisy and unimportant data on classification accuracy. In this paper, we propose a novel fuzzy...
Spreadsheets are by far the most used programs that are written by end-users. They often build the basis for decisions in companies and governmental organizations and therefore they have a high impact on our daily life. Ensuring correctness of spreadsheets is thus an important task. But what happens after detecting a faulty behavior? This question has not been sufficiently answered. Therefore, we...
Image annotation has been identified as a suitable means of closing the semantic gap that has made the accuracy of content-based image retrieval unsatisfactory. However, existing methods for automatic image annotation depend on supervised learning, which can be difficult to implement because it requires manually annotated training samples, which are not always readily available...
Existing personalized recommendation systems face many problems, such as cold start, data sparseness and high complexity. Users' interests exist more widely and are more personalized than the purchasing history used in traditional recommendation systems. Thus, applying the interest graph in the recommendation process can compensate for certain shortcomings. This paper builds the mechanism of a user-interest-goods...
Since the Nagoya Protocol is entering into force, finding alternative herbs with similar efficacy has become a very important issue. To find alternative herbs, we adopted MeSH, which contains semantic information derived from papers in the MEDLINE database, which covers worldwide medical, pharmaceutical, and biological research. Among the 16 categories of MeSH, we chose 3 categories which are related...
Classification is a data mining technique for categorizing objects. Text classification faces a renewed challenge in classifying very short documents or texts, such as those found in social media collections. This paper proposes a method to improve classification performance on short documents. In this work, we expand the words in every document before the documents are classified. We use a TF-IDF model, Hidden Markov...