This paper proposes a novel ensemble method to improve the performance of binary classification. The proposed method is a non-linear combination of base models and an application of adaptive selection of the most suitable model for each data instance. Ensemble methods, an important type of machine learning technique, have drawn a lot of attention in both academic research and practical applications,...
The SVM can perform data classification and prediction. The choice of the penalty parameter c and the kernel parameter g when training the model directly affects classification accuracy. This article uses the K-fold cross-validation (K-CV) method to optimize c and g and, taking wine species identification as an example classification task, improves the prediction accuracy and has reached the expected...
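As a rough illustration of K-CV parameter selection, the sketch below scores every (c, g) pair by its mean validation score over K folds and keeps the best pair. It is not the article's implementation: the function names, the candidate grids, and the `evaluate` callback are assumptions, and the toy scorer only stands in for training an SVM on the training fold and measuring accuracy on the validation fold.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one.

    Real data should be shuffled first; that step is omitted in this sketch.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def k_cv_grid_search(n_samples, k, c_values, g_values, evaluate):
    """Return (best_c, best_g, best_mean_score) over the grid.

    evaluate(c, g, train_idx, val_idx) is a caller-supplied (hypothetical)
    function that trains on train_idx and returns a score on val_idx.
    """
    folds = k_fold_indices(n_samples, k)
    best = None
    for c, g in product(c_values, g_values):
        scores = []
        for i, val_idx in enumerate(folds):
            # Every fold except fold i forms the training set.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            scores.append(evaluate(c, g, train_idx, val_idx))
        mean_score = sum(scores) / k
        if best is None or mean_score > best[2]:
            best = (c, g, mean_score)
    return best


def toy_evaluate(c, g, train_idx, val_idx):
    """Placeholder scorer (assumption): peaks at c=1.0, g=0.1, ignores the data."""
    return 1.0 - abs(c - 1.0) / 10 - abs(g - 0.1)


best_c, best_g, best_score = k_cv_grid_search(
    n_samples=10, k=3,
    c_values=[0.1, 1.0, 10.0], g_values=[0.01, 0.1, 1.0],
    evaluate=toy_evaluate,
)
```

In practice `evaluate` would wrap an actual SVM trainer, and the grids for c and g are often spaced on a logarithmic scale.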
The idea behind this tool is to make a generic classifier which can be used to diagnose patients suffering from brain disorders. We have created a tool which uses machine learning algorithms from Weka, Caret and Scikit Learn from Java, R and Python respectively and combines the three packages into one R package which provides the functionality of classifying the patients suffering from brain disorders...
Student classification is a popular educational data mining task that aims to identify in-trouble students early so that they can receive appropriate and timely support. Besides, academic credit systems are now widely used all over the world because their flexibility allows teaching and learning activities to be conducted efficiently. Nevertheless, this flexibility might lead to the heterogeneity...
Bank loans carry many risks, and banks in particular need to manage them to reduce capital loss. Analysing these risks and assessing the likelihood of default is therefore crucial. Banks hold huge volumes of data on customer behaviour, yet they are unable to judge from it whether an applicant is likely to default. Data mining is a promising area of data analysis which aims...
At present, the tax assessment indicator system is still established and maintained manually. The accuracy of tax assessment depends on officials' judgment and analysis, which creates a heavy workload for them. Furthermore, the evaluation results are affected by human factors and are not reliable. To improve tax assessment, this paper proposes a tax assessment model based...
We propose a new learning algorithm for latent local support vector machines (SVM), called Latent-lSVM, for effectively classifying very-high-dimensional, large-scale multiclass image datasets. The common framework for image classification tasks, which uses the Scale-Invariant Feature Transform (SIFT) method and the Bag-of-visual-Words (BoW) model, leads to a hard classification problem with thousands of dimensions,...
In the evolving technology of big data, high-velocity data streams play a vital role because the patterns in the data change over time. A temporal change of pattern in a data stream leads to a concept evolution called concept drift, in which the statistical properties of the data differ from time to time; the drift is taken into account in order to update an old and outdated classifier and make it adaptable to new...
Active learning aims to selectively label the most informative examples to save data collection cost. While active learning has been well studied for balanced classification problems, limited research has been performed in the cost-sensitive scenario. In this paper, we investigate the problem of active learning for cost-sensitive classification. We first propose a general active learning framework named...
There exists a base classification system for classifying problem tickets in the enterprise domain. Different deep learning algorithms (Gated Recurrent Unit and Long Short-Term Memory) were investigated for solving the classification problem. Experiments were conducted with different parameters and layers for these algorithms. The paper presents the architectures tried, results obtained, our conclusions...
Nowadays, with the development of technology and social media, the volume of people's opinions and news about a particular product or currency has expanded significantly. In addition to the growing volume of text data, its unstructured nature makes analysing this type of data a major challenge. In this study, the proposed method utilized different Training Set Selection (TSS) approaches...
Machine-based systems cannot keep up with the task of organizing data in an up-to-date manner unless the acquired data is planned, scheduled and managed appropriately. Today's datasets start as small chunks of information and grow exponentially over time. Once the size is extremely large it becomes difficult to make decisions and to predict consistently and...
Classification is a central problem in the fields of data mining and machine learning. Using a training set of labelled instances, the task is to build a model (classifier) that can be used to predict the class of new unlabelled instances. Data preparation is crucial to the data mining process, and its focus is to improve the fitness of the training data for the learning algorithms to produce more...
To help telecommunications operators accurately predict terminal replacement behavior and improve both the success rate of marketing and the accuracy of resource allocation, huge volumes of user consumption data are used to build a Deep Belief Network. The deep features that characterize terminal replacement behavior are learned and used to construct a terminal replacement prediction model. Experiments...
Intrusion detection systems have been successful in preventing attacks on network resources, but they do not adapt when new attacks appear, i.e. they need human intervention to investigate new attacks. This paper proposes the creation of a predictive intrusion detection model based on classification techniques such as decision trees and Bayesian techniques...
It is difficult to get satisfactory churn prediction results with traditional models, because the available customer samples in the target domain are usually few and the class distribution of customer data is imbalanced. This study proposes a group method of data handling (GMDH) based dynamic transfer ensemble (GDTE) model for churn prediction. It first transfers the data in related source domains to the...
Research has shown that lighting strongly influences the behavior of employees in an office setting, which makes the lighting configuration of an office space crucial. A breakout area may be used by the employees for various activities that need to be supported by different lighting conditions, e.g. informal meetings or personal retreat. The desired lighting conditions depend on user preferences...
The consensus method is a means of communication among experts that assists the formation of a group judgment. This technique has great potential to be adopted to provide predictions based on the outcomes obtained from several classification algorithms. In this study, data on water consumption was used to induce the classification model that will be used to predict the possibility of the occurrence of water...
This article introduces a new approach that applies multi-label classification methods to the type selection of mobile base stations. To construct a forecasting model for type selection, we apply BR, one of the multi-label classification methods, to automatic type selection on the basis of historical data of base-station construction...
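Assuming BR here denotes binary relevance, the standard multi-label method of that abbreviation, its core step is to decompose the multi-label problem into one independent binary problem per label. The sketch below shows only that decomposition; the function name and the station-type labels are hypothetical and do not come from the article.

```python
def binary_relevance_targets(label_sets, all_labels):
    """Turn per-instance label sets into one 0/1 target vector per label.

    Each returned vector can then be used to train a separate binary
    classifier, which is the idea behind binary relevance.
    """
    return {
        label: [1 if label in labels else 0 for labels in label_sets]
        for label in all_labels
    }


# Hypothetical label sets for three historical base-station records.
label_sets = [{"macro"}, {"macro", "indoor"}, {"micro"}]
targets = binary_relevance_targets(label_sets, ["macro", "micro", "indoor"])
```

At prediction time, the label set for a new instance is the set of labels whose binary classifier outputs 1.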
The ultimate goal in a multiple classifier system (MCS) is to obtain a global and more accurate model through the combination of several base learners. Among the popular combining rules, averaging has been emphasized as a well-qualified option. The averaging rule can be applied with an equal (simple averaging) or non-equal (weighted averaging) weight vector for the linear combination. When the formed...
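The difference between simple and weighted averaging can be shown with a minimal sketch. It linearly combines the class-probability vectors that several base learners produce for one instance. The function name, the example probabilities, and the weights are illustrative assumptions, not values from the paper.

```python
def combine_by_averaging(probabilities, weights=None):
    """Linearly combine per-model class-probability vectors for one instance.

    With weights=None every model gets the same weight (simple averaging);
    otherwise the given weights are normalised to sum to one (weighted averaging).
    """
    n_models = len(probabilities)
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per base model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probabilities[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]


# Three hypothetical base learners, two classes.
base_outputs = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]]
simple = combine_by_averaging(base_outputs)
weighted = combine_by_averaging(base_outputs, weights=[0.2, 0.4, 0.4])

# Index of the winning class under each rule.
simple_class = max(range(len(simple)), key=simple.__getitem__)
weighted_class = max(range(len(weighted)), key=weighted.__getitem__)
```

With these made-up numbers the two rules disagree: simple averaging favours class 0, while down-weighting the first learner shifts the decision to class 1.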