In the Linked Data context, the identity link is one of the most important semantic links that can be established between datasets. It specifies that different identifiers refer to the same real-world object and therefore must be linked. The process of detecting these identical instances across different data repositories is referred to as instance matching. This is used to connect existing data sources...
This paper focuses on the problem of machine learning classifier choice for network intrusion detection, taking into consideration several ensemble classifiers from the supervised learning category. We have evaluated Bagged trees, AdaBoost, RUSBoost, LogitBoost and GentleBoost algorithms, provided an analysis of the performance of the classifiers and compared their learning capabilities, taking for...
Flight parameters record the flight state and performance of each flight phase. Precisely dividing the aircraft flight process using flight parameters not only supports stage-by-stage quality evaluation of the whole flight process, but can also detect aircraft faults. In this paper, the decision tree classifier is used to divide the flight parameters. The parameter reduction is carried...
Pattern recognition in the sparse representation (SR) framework has been very successful. In this model, the test sample can be represented as a sparse linear combination of training samples by solving a norm-regularized least squares problem. However, the value of the regularization parameter is typically applied indiscriminately to the whole dictionary. To enhance the group concentration of the coefficients...
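As a point of reference, a norm-regularized least squares problem of the kind mentioned above is often written in the following form. The notation is an assumption made for illustration and is not taken from the abstract: $y$ is the test sample, $D$ is the dictionary whose columns are training samples, $x$ is the coefficient vector, and $\lambda$ is the regularization parameter. The choice of the $\ell_1$ norm is likewise an assumption, since the abstract only says "norm-regularized".

$$\hat{x} = \arg\min_{x} \; \|y - Dx\|_2^2 + \lambda \|x\|_1$$

In this form a single scalar $\lambda$ penalizes every coefficient equally, which appears to be the uniform treatment of the dictionary that the abstract sets out to improve.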
The design of an effective financial early warning algorithm is of great significance to the financial management of a company. A weak classification algorithm can be improved into a strong one with a high recognition rate through ensemble learning. The resulting algorithm can overcome the low classification accuracy of a single classifier. Therefore, this paper combines decision...
This paper presents a novel adaptive resampling algorithm based on the clustering by fast search and find of density peaks (CFSFDP) algorithm and the synthetic minority oversampling technique (SMOTE), named DP-SMOTE. The essential idea of the proposed method is to use the improved CFSFDP algorithm to find the subclasses and remove noisy data automatically, and then to generate the minority samples...
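The sample-generation step of plain SMOTE, on which the abstract builds, can be sketched as follows. This is a minimal illustration of standard SMOTE interpolation only; it does not implement the CFSFDP clustering or noise removal of DP-SMOTE, and the function names (`smote_sample`, `nearest_neighbors`, `smote`) are hypothetical.

```python
import random


def smote_sample(x, neighbor, rng):
    """Interpolate one synthetic point on the segment between x and neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def nearest_neighbors(x, points, k):
    """Return the k points closest to x (squared Euclidean), excluding x itself."""
    others = [p for p in points if p is not x]
    return sorted(others, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))[:k]


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples (needs at least two samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)  # pick a minority sample at random
        neighbor = rng.choice(nearest_neighbors(x, minority, k))
        synthetic.append(smote_sample(x, neighbor, rng))
    return synthetic
```

Calling `smote(minority, n_new=5)` on a list of minority-class feature vectors returns five new vectors, each lying between an existing sample and one of its nearest minority neighbors.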
A huge amount of data in today's world is stored in the form of electronic documents. Text mining is the process of extracting information from those textual documents. Text classification is the process of classifying text documents into a fixed number of predefined classes. Applications of text classification include spam filtering, email routing, sentiment analysis, language identification...
The success of machine learning (ML) algorithms depends on the quality of the data given to them. If the input data contains insufficient or irrelevant features, the accuracy of the machine learning algorithm decreases. Attribute selection plays a key role in the creation of classification models. Based on the ‘logic behind the inference’ principle in the Nyaya school of thought, this paper proposes a new method...
Classification of different tumor types is of great significance in cancer prediction problems. Choosing the most relevant features from huge microarray expression data is very important. It is a widely explored subject in bioinformatics because of its importance in advancing human understanding of the underlying mechanisms that cause cancer. In this paper, we aim to classify leukaemia cells. Our approach relies...
To handle safety risk early warning for the apron more effectively, an attribute reduction algorithm based on Rough Set is used to simplify the warning index set of the apron, which is too large. The improved Particle Swarm Optimization (PSO) algorithm is used to optimize the parameters of the Support Vector Machine. Combined with the Rough Set and SVM which is optimized by the improved...
The application of various statistical machine learning methods for the identification of bi-heterocyclic drugs that are based on the THz spectra is presented. A comparison of classification efficiency with six algorithms (LDA, QDA, SVM, Naive Bayes, KNN with Euclidean metrics and the cosine similarity) is shown and a complete THz system allowing for the identification of drugs with an efficiency...
This study investigates the performance of the Multilayer Perceptron (MLP) classifier in discriminating agarwood oil quality grades from the oil's significant compounds, based on three training algorithms, namely Scaled Conjugate Gradient (SCG), Levenberg-Marquardt (LM) and Resilient Backpropagation (RP) Neural Network, using MATLAB version 2013a. The dataset used in this study was obtained...
The aim is to develop an efficient method which uses a custom image to train the classifier. This OCR system extracts distinct features from the input image for classifying its contents as characters, specifically letters and digits. The input to the system consists of digital images containing the patterns to be classified. The analysis and recognition of the patterns in images are becoming more complex, yet easy with...
Human activity recognition using wearable sensors plays a significant role in many applications. How to accurately and quickly recognize various activities based on wearable sensors is drawing increasing attention. This paper proposes an accelerated sparse representation classification method based on random projection and k-nearest neighbor for human activity recognition. Random projection is first...
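The two building blocks named in this abstract, random projection and k-nearest neighbor, can be sketched in a few lines. This is only an illustration of those generic techniques under stated assumptions: it uses a Gaussian projection matrix and a plain majority-vote k-NN, it does not implement the sparse representation classifier of the paper, and all function names are hypothetical.

```python
import math
import random


def random_projection_matrix(n_in, n_out, seed=0):
    """Build an n_out x n_in Gaussian matrix, scaled by 1/sqrt(n_out)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)] for _ in range(n_out)]


def project(matrix, x):
    """Map a feature vector x to the lower-dimensional space (matrix times x)."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def knn_predict(train, labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)
```

In use, the same matrix would be applied to both the training vectors and each query vector before calling `knn_predict`, so that distances are computed in the reduced space, which is where the speed-up comes from.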
Because the residents of Bandung are active social media users, the Bandung government provides a channel on Twitter for citizens to report their complaints. To make citizen complaint monitoring easier, the topics of complaint tweets (written in Indonesian) need to be detected automatically to assist the government in managing the reported complaints. In this paper,...
This paper studies simultaneous fault diagnosis of the main reducer in the automobile transmission system assembly based on vibration signals. A simultaneous fault diagnosis model based on the Paired Relevance Vector Machine (Paired-RVM) is proposed for simultaneous faults of the main reducer, and each binary sub-classifier is trained with single fault samples and then fused by a pairing strategy...
Twitter is one of the world's most famous social media platforms. Many kinds of statements are expressed on Twitter, such as happiness, sadness, and public information. Unfortunately, people may get angry at each other and express it in a tweet. Some tweets may contain Indonesian swear words. This is a serious problem because many Indonesians may not tolerate swear words. Some Indonesian swear words may have multiple...
This work describes a computer-aided diagnostic tool for EEG signal classification and analysis. Our main objective is to develop an accurate, automatic and timely classification method for detection of seizures occurring in epileptic patients so that appropriate medical attention can be preemptively provided to the patient. The proposed method employs a feature extractor coupled with K-Nearest Neighbor...
For the detection of human activities using motion data, many techniques employ feature extraction and machine learning. However, detection rates still need to increase and incorrect classification rates still need to decrease. We address both problems. We propose a novel distance measure, called log-sum distance, for evaluating the difference between two sequences of positive numbers. We use the...
Based on spectral data from SDSS, a Kernel Support Vector Machine (K-SVM) is applied to distinguish quasars from other celestial bodies. First, the basic theory of the SVM (Support Vector Machine) with relaxation factor and kernel function is introduced. Then, the main parameters are designed and selected. Finally, the method is applied to the classification and identification of the quasars. The classification...