With the development of information technologies, storing and analyzing biomedical datasets has become easy in the health sector. In this area, machine learning methods contribute greatly to the evaluation and interpretation of data. In this paper, in addition to Support Vector Machines, Decision Tree, K-Nearest Neighbors, Naive Bayes and Dictionary Learning methods, Random Feature Subspaces...
Authorship analysis deals with the identification of authors, a problem of text data mining and classification. Numerous techniques and algorithms have been published so far in the field of stylometry. In this regard, the primary objective of the present review is to provide the status of the different studies carried out on authorship analysis based on the important research...
Considering the fact that the underlying structural information in the training data within classes is vital for a good classifier in real-world classification problems, Structural Nonparallel Support Vector Machine (or SNPSVM, for short) has been proposed. By combining the structural information with nonparallel support vector machine (NPSVM), SNPSVM can fully exploit prior knowledge to directly...
When addressing real-world issues, a significant quantity of domain knowledge is often available a priori, which helps yield different perspectives on various characteristics of the issue. At the same time, several types of machine learning methods do not depend on such prior, explicitly expressed domain information. However, such methods require especially in case of operating learning...
Data streams are growing rapidly and constantly. Analyzing rapidly changing data streams is quite difficult because the amount of data increases over time. Individual patient records provide a vital resource for health research for the benefit of society, such as understanding the association between the human immune system and viruses. As the patient records have been constantly growing, data...
Owing to the growing quantity of data, data mining techniques are increasingly required on the web for extracting information from that data. Text classification is very important in data mining and has been a hot topic. In particular, ontological taxonomy classification is important for more intelligent information reasoning. As it relates to data distribution of classes...
Cardiovascular risk prediction is a vital aspect of personalized health care. In this study, retinal vascular function is assessed in asymptomatic participants who are classified into risk groups based on the Framingham Risk Score. Feature selection, oversampling and state-of-the-art classification methods are applied to provide a sound individual risk prediction based on Retinal Vessel Analysis (RVA)...
Feature-based opinion mining for product reviews is the field of study that analyzes users' attitudes towards product attributes. It has witnessed booming interest over the last decade and a half, due to its importance to business and society as a whole. This paper proposes a POS pattern matching method to identify feature words, opinion-bearing words, as well as negative words based on...
In this paper we present a study on music mood classification using only lyrics information. For Chinese songs in particular, Chinese word segmentation causes intolerable errors and inadequate use of lyrics information. Our work proposes to use bag-of-character features instead of bag-of-word features to avoid the word segmentation error, which makes the classification more inaccurate...
Recently, a novel nonparallel support vector machine (NPSVM) was proposed by Tian et al., which has several attractive advantages over its predecessors. A sequential minimal optimization (SMO) algorithm has already been provided to solve the dual form of NPSVM. Unlike the existing work, we present a new strategy to solve the primal form of NPSVM in this paper. Our algorithm is designed in the...
Over the past few years there has been tremendous advancement in electronic commerce technology, and the use of credit cards has increased dramatically. As credit cards become the most popular mode of payment for both online and regular purchases, cases of associated fraud are also rising. In this paper the authors present the underlying theory of a hybrid model of an Intelligent Fraudulent Detection...
Multiclass classification is the task of classifying samples into more than two classes. Generally, multiclass classifiers have difficulty classifying samples that are very close to the separating hyperplane, a source of generalization error. Generalization error can be reduced by maximizing the margin of the separating hyperplanes. Support Vector Machine (SVM) is a maximum-margin classifier; its aim...
Earthquakes occur when sudden vibrations, caused by fractures in the earth's crust, spread as waves and shake the earth's surface. Earthquakes depend on variables such as the way these waves spread, the calculation of these waves and the calculation methods, and the evaluation of the recorded data sets. Predicting probable earthquakes and minimizing damage are important factors. Decision systems...
Sentiment analysis refers to the automatic extraction of sentiments from a natural language text. We study the effect of subjectivity-based features on sentiment classification on two lexicons and also propose new subjectivity-based features for sentiment classification. The subjectivity-based features we experiment with are based on the average word polarity and the new features that we propose are...
In contemporary society, an increasing number of people are involved in biomedical research. However, a large amount of biological knowledge still resides in various unstructured documents, which makes biological data difficult to analyze. How to identify biological terms effectively from text is one of the important problems in bioinformatics. Nowadays the precision of the best...
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. There are two main issues relevant to Web data extraction, namely wrapper generation and wrapper maintenance. In this paper, we propose a novel approach to the problem of automatic wrapper maintenance. It is based on the observation that despite various page changes,...
The problem of privacy-preserving data mining has become more and more important in recent years. Many successful and efficient techniques have been developed. However, in collaborative data analysis, part of the datasets may come from different data owners and may be processed using different data distortion methods. Thus, combinations of datasets processed using different methods are of practical...
Suppose that we are interested in classifying n points in a z-dimensional space into two groups having response 1 and response 0 as the target variable. In some real data cases in customer classification, it is difficult to discriminate the favorable customers showing response 1 from others because many response 1 points and 0 points are closely located. In such a case, to find the denser regions...
Hierarchical taxonomies are used to organize and retrieve information in many domains, especially those dealing with large and rapidly growing amounts of information. In many of these domains data also tends to be multi-label in nature. In this paper, we consider the problem of automated text classification in these scenarios. We present a post-processing based approach that performs smoothing on...