M81 and B95-8 are distinct strains of Epstein-Barr virus (EBV). Because M81 targets epithelial cells whereas B95-8 targets B cells, and the common EBV vaccine is effective only against B95-8, the two strains evidently have different characteristics. In this paper, we analyzed DNA sequences using three algorithms: Apriori, decision tree, and support vector machine...
Indonesia has a very large number of SMEs, but their revenue is very low. One way to increase revenue is to use the internet. Some SMEs have already developed websites, but these do not share a common navigation scheme, which confuses potential buyers. A website aggregator is therefore essential. This aggregator does not require SME owners to register their websites, which means it can automatically...
The large amount of medical data creates a need for intelligent data mining tools that extract useful knowledge. Researchers have used several statistical analysis and data mining techniques to improve the accuracy of disease diagnosis in healthcare. Heart disease has been the leading cause of death worldwide over the past 10 years. Several researchers have introduced different...
Malware family identification is a complex process involving the extraction of distinctive characteristics from a set of malware samples. Malware authors employ various techniques, such as encryption and obfuscation, to prevent the identification of unique characteristics of their programs. In this paper, we present n-gram based sequential features extracted from the content of the files. N-grams are extracted...
The support vector machine (SVM) is a state-of-the-art learning machine used in areas such as pattern recognition, computer vision, data mining, and bioinformatics. SVMs were originally developed for solving binary classification problems, but binary SVMs have also been extended to solve the problem of multi-class pattern classification. There are different techniques employed by SVMs to tackle multi-class...
Computational intelligence techniques have been shown to outperform standard statistical techniques, particularly when dealing with large, unbalanced, and high-dimensional data. In this paper we present an approach for improving the performance of decision trees using a Support Vector Machine (SVM) when dealing with unbalanced data. The proposed approach modifies the available...
In all areas of engineering, modelers are constantly pushing for more accurate models, and their goal is generally achieved with increasingly complex, data-mining-based black-box models. On the other hand, model users, who include policy makers and system operators, tend to favor transparent, interpretable models not only for predictive decision-making but also for after-the-fact auditing and forensic...
Support Vector Machines are state-of-the-art tools in data mining. However, their strength is also their main weakness, as the generated nonlinear models are typically regarded as incomprehensible black-box models. Therefore, opening the black box, or making SVMs explainable, has become more important and necessary in areas such as medical diagnosis and credit evaluation. Rule extraction from SVMs,...
The datasets used in financial distress prediction are unbalanced. Traditional machine learning methods such as neural networks and support vector machines are premised on the hypothesis that the class distribution is roughly balanced. Classification of an unbalanced dataset is biased toward the majority samples, which results in lower identification of the minority class, while the conventional down-sampling...
Cryptanalysis attempts to identify weaknesses in the algorithms used to encrypt code or in the methods used to generate keys. In this study, we use pattern recognition techniques to identify encryption algorithms for block ciphers. The block cipher algorithms DES, IDEA, AES, and RC, operating in ECB mode, were considered. Eight different classification techniques, namely Naïve...
Rising computer violence, such as Distributed Denial of Service (DDoS) attacks, web vandalism, and cyber bullying, becomes a more serious issue when it is politically motivated and intentionally conducted to generate fear in society. These kinds of activity are categorized as cyber terrorism. As the number of such cases increases, the availability of information regarding these actions is required...
The increase in malware that exploits the Internet daily has become a serious threat. Manual heuristic inspection in malware analysis is no longer considered effective or efficient given the high rate at which malware spreads. Hence, automated behavior-based malware detection using machine learning techniques is considered a promising solution. The behavior of each malware on an emulated...
Analysis and classification of county-level basic public services play an important role in county economic growth and benefit the healthy development of urbanization in China. Because county-level basic public services data are large-scale and imbalanced, this paper presents a support vector machine model to classify county-level basic public services. The method...
Individual credit risk evaluation is an important and challenging data mining problem in the financial analysis domain. This paper compares the effectiveness of four data mining algorithms, logistic regression (LR), decision tree (C4.5), support vector machine (SVM), and neural networks (NN), by applying them to two credit data sets. Experimental results show that the LR and SVM algorithms produced the best...
Total Order Broadcast (TOB) is a fundamental building block at the core of a number of strongly consistent, fault-tolerant replication schemes. While it is widely known that the performance of existing TOB algorithms varies greatly depending on the workload and deployment scenario, the problem of how to forecast their performance in realistic settings is, to date, still largely unexplored...
In systems with strong seasonal differences in vegetation structure and appearance, multi-temporal imagery can be particularly useful for community- and species-level discrimination. Moreover, since the availability of past data from a single source of time series images may be limited, we need to develop multi-temporal and multi-source methods for wetland ecosystem monitoring. To perform this type of analysis,...
The paper compares the classification performance of eight models: logistic regression (LR), neural network (NN), radial basis function neural network (RBFNN), support vector machine (SVM), case-based reasoning (CBR), and three decision trees (DTs). We build the models and test their classification accuracy on a historical data set provided by a German financial institution. The data set contains...
In this paper, a novel classification approach is presented. The approach uses fuzzy if-then rules for the classification task and employs a hybrid optimization method to improve the accuracy and comprehensibility of the obtained outcome. The optimization method combines simulated annealing and a genetic algorithm. In fact, the genetic operators have been used as perturbation functions at...
In the literature, multi-class SVMs are constructed using one-against-all, one-against-one, and decision-tree-based SVM with Euclidean and Mahalanobis distance. To maintain high generalization ability, the most separable classes should be separated at the upper nodes of the decision tree. Among statistical measures, information gain, Gini index, and chi-square are a few commonly used class separability measures...
A novel data-mining-based method for classifying video shot genre is proposed. Shot boundary detection and key frame extraction are performed first. Then, visual features such as color and motion are extracted for the key frames and shots. Furthermore, a decision tree is applied to discover rules relating these features to shot genres from a large amount of training data. These rules are...