With the explosion of Web 2.0, customers are able to share their opinions and sentiments online. This has led to new opportunities for companies and organizations to understand people's opinions towards their products or services and can serve to improve their products or market strategy more effectively. However, the data on the Web is huge and unstructured, which makes it difficult to analyze automatically...
This paper addresses the task of political orientation prediction: assigning a person to either the ‘democrat’ or the ‘republican’ class based on Twitter data produced by Republican and Democrat voters. We used Long Short Term Memory Recurrent Neural Networks and Support Vector Machine algorithms to model the classification process. Long Short Term Memory Recurrent Neural Networks performed...
In this research, we propose a particular version of KNN (K Nearest Neighbor) in which the similarity between feature vectors is computed by considering the similarity among attributes or features as well as the similarity among values. The task of text summarization is viewed as a binary classification task in which each paragraph or sentence is classified as essence or non-essence, and in previous works,...
In this research, we propose a version of K Nearest Neighbor that considers the similarity among attributes when computing the similarity between feature vectors. The text segmentation task is viewed as a binary classification in which each pair of sentences or paragraphs is classified by whether a boundary is placed between them or not, and the proposed version produced successful results in previous works...
Automatic classification of news articles is a relevant problem due to the large amount of news generated every day, so it is crucial that these articles are classified to allow users to access information of interest quickly and effectively. On the one hand, traditional classification systems represent documents as bag-of-words (BoW), which are oblivious to two problems of language: synonymy and...
In this paper, we develop and implement a new email spamming system leveraged by coupled text similarity analysis on user preference and a virtual meta-layer user-based email network. We take social networks or campus LAN networks as the spam social network scenario. Few current practices exploit social networking initiatives to assist in spam filtering. Social network has essentially a large...
Object semantics reduce the semantic gap in Content Based Image Retrieval (CBIR). In recent years, numerous methods for object semantic categorization have been proposed. Semantic segmentation is a key factor affecting the accuracy of object semantic categorization. Existing semantic segmentation methods usually choose pixels or super-pixels as the processing input. But the information contained...
Sentiment Classification refers to the computational techniques for categorizing whether the sentiments of a text are positive or negative. Sentiment Classification approaches can suffer because of Negation Modifiers. Negation Modifiers, such as the word “not”, modify the meaning of the associated word. Handling Negation Modifiers is important as they may modify the Sentiment conveyed by the associated word...
This paper presents a model for classifying ICD-10 TM using machine learning and information retrieval. A systematic approach for translating diagnoses from medical records to ICD-10 TM is proposed. First, information retrieval is used to find similar words in Thai and English diagnoses. Then, a machine learning approach is applied to classify ICD-10 TM by training...
Identifying the sentiment of text has recently gained a lot of popularity, probably due to the availability of huge datasets, especially on social networking sites on the Internet. Social networking sites like Twitter and Facebook provide people in general with an effective platform for expressing their thoughts and ideas. These thoughts can be harnessed for extraction of sentiments of people related...
Question Classification is a vital component of a Question Answering System. In this paper we propose a compact and effective method for question classification. Rather than using the two-layered taxonomy of 6 coarse-grained and 50 fine-grained categories developed by Li and Roth (2002), we classify the questions into three broad categories. We have also studied the syntactic structure...
Microblogs have become a daily communication tool in recent years. Research on microblogs has drawn more and more attention. Microblog emotion classification is a major research topic in user intent analysis based on User-Generated Content (UGC). This paper focuses on discriminating between two emotional tendencies: positive and negative. Firstly, the system cleared the noisy elements in the microblog,...
Exploiting label structures or label correlations is an important issue in multi-label learning, because taking into account such structures when learning can lead to improved predictive performance and time complexity. In this paper, a multi-label lazy learning approach based on k-nearest neighbor and latent semantics is presented, which is called LsKNN. Firstly, latent semantic analysis is applied...
Nowadays, Content-Based Image Retrieval (CBIR) is the mainstay of image retrieval systems. To understand the query semantics and users' expectations so as to return faithful results in terms of accuracy, Relevance Feedback (RF) was incorporated into CBIR systems. By allowing the user to assess iteratively the answers as relevant/irrelevant or even giving him/her the opportunity to specify a degree...
Research on sentence orientation aims to obtain useful orientation information, and it has become a research focus in natural language processing, especially for microblogs. Based on the existing HowNet semantic similarity, this paper presents a sentence orientation identification method taking advantage of an improved algorithm for calculating Chinese term semantic orientation value. Firstly, this...
With the development of Web Service Technology, the quantity of web services published on the Internet is increasing rapidly. Recognizing each web service intelligently becomes the key to using the Internet efficiently. The first step of recognition is to classify the web services accurately. Classifying a huge number of web services is a difficult job. Therefore, in order to support applications...
Feature-based opinion mining is an interesting opinion mining issue. For this problem, feature words/phrases are discovered at sentence level. However, customers usually use different words/phrases to refer to the same feature in reviews. To produce a meaningful summary, synonymous feature words/phrases in a domain need to be grouped under the same feature. This paper proposes a solution for grouping...
In the medical field, a lot of unstructured information expressed in natural language exists in medical literature, technical documentation and medical records. IE (Information Extraction), one of the most important research directions in natural language processing, aims to help humans extract information of interest automatically. NER (Named Entity Recognition) is one of the subsystems of IE...
This paper presents an algorithm for automatic detection of the orientation of user-generated images. The images can initially be in one of 3 different orientations. The algorithm utilizes an SVM classifier trained over feature vectors of the low-level characteristics of the images in the training set. In order to increase classification accuracy, prior to the SVM classification, the images are hierarchically...
We tackle the challenge of web image classification using additional tag information. Unlike traditional methods that only use the combination of several low-level features, we try to use semantic concepts to represent images and corresponding tags. At first, we extract the latent topic information using the probabilistic latent semantic analysis (pLSA) algorithm, and then use multi-label multiple kernel...