This paper presents a novel Adaboost.R training algorithm based on weight trimming, which increases training speed on large datasets while retaining forecast precision. At each iteration, the algorithm discards most of the samples with small weights and keeps only the samples with large weights to train the weak learner. During training, only a small portion of the samples are used to train...
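The trimming step described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact trimming criterion, so the cumulative-weight threshold `keep_mass` and the helper name `trim_by_weight` are assumptions made for illustration, not the authors' method.

```python
def trim_by_weight(weights, keep_mass=0.85):
    """Return indices of the largest-weight samples whose weights together
    reach at least keep_mass of the total weight.

    Hypothetical helper: the cumulative-mass rule is an assumption,
    the abstract only says that small-weight samples are discarded.
    """
    if not weights:
        return []
    total = sum(weights)
    # Visit samples from largest to smallest weight.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += weights[i]
        if mass >= keep_mass * total:
            break
    return sorted(kept)


# Boosting weights for six samples; three of them carry most of the mass.
weights = [0.02, 0.40, 0.03, 0.30, 0.05, 0.20]
kept = trim_by_weight(weights, keep_mass=0.85)
print(kept)  # indices of the samples passed to the weak learner
```

In a boosting loop, the weak learner for the current round would be fitted only on the samples in `kept`, while the weights of all samples would still be updated afterwards.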
Rough set theory has been successfully applied to many areas, including machine learning, pattern recognition, decision analysis, process control, and knowledge discovery from databases. An algorithm for finding a minimal reduction based on a propositional satisfiability (SAT) algorithm is proposed. A branch and bound algorithm is presented to solve the proposed SAT problem. The experimental result...
Online word-of-mouth has become a very important resource for electronic businesses. How to analyze user-generated reviews and classify them into different sentiment classes is a question receiving growing attention. In this field, special challenges are associated with the mining of traveler reviews. At present, there is some research on sentiment analysis...
Community detection is one of the key problems in the field of complex network analysis. In this paper, we mainly focus on the two-part division problem for networks, i.e. community (or graph) partitioning. Based on an in-depth analysis of the partitioning results, a two-stage heuristic algorithm named SPC is proposed. It first identifies two pseudo-centers, and then generates two semi-communities...
Image segmentation is a key process of any image recognition system. The loose repeat algorithm and the K-means algorithm are used for stomach epidermis tumor segmentation, but many conglutinated cells cannot be separated from one another by traditional segmentation algorithms. So the Vincent watershed algorithm and the Inver watershed algorithm are designed to do segmentation experiments...
Following a discussion of alternative covering neural networks (ACNN), an integrated algorithm based on rough set (RS) theory and ACNN is proposed. RS is applied to reduce and process the original data, lowering the data dimension while preserving the integrity of the information. ACNN is used to design a multi-layer forward network. By using RS to reduce the data dimension, the calculation of ACNN...
Given the complexity, ambiguity and multi-factor nature of teaching evaluation, applying classification to this domain is an important topic. In the field of classification, incomplete data is one of the thorny issues, and many methods for dealing with it have been studied. In this paper, the MVCM is introduced in particular. It fills the observations with incomplete data by constructing...
Document classification has received extensive attention in the past few decades due to its wide applications in many fields. To deal with this problem efficiently, a novel document classification algorithm based on information bottleneck (IB) and the least squares version of SVM (LS-SVM) is proposed in this paper. Extensive experimental results on the real-world document corpus show that the proposed algorithm...
Customer resources are the lifeline of enterprise development, and the classification, management and servicing of customers are very important in customer relationship management (CRM). The decision tree is one of the most popular methods for classification. In this paper, we build a decision tree based on the C4.5 algorithm for coal logistics customer analysis, adopt Pessimistic Error Pruning (PEP) to simplify...
Face recognition is one of the most challenging research topics in the field of pattern recognition and computer vision. To deal with this problem efficiently, a novel face recognition algorithm is proposed that uses marginal manifold learning and an SVM classifier. Extensive experiments show that the proposed algorithm performs much better than other well-known face recognition algorithms.
Support vector machines for pattern classification are motivated by linear machines, but rely on preprocessing the data to represent it in a high-dimensional space: with an appropriate nonlinear mapping, data from two categories can be separated by a hyperplane. To determine the hyperplane, the key problem is selecting an appropriate criterion and algorithm. To find out the appropriate solution vector in solution...
Information fusion is an effective way to decrease uncertainty in decision making, and is also an active research topic. The paper addresses an important problem concerning the fuzzy integral, namely how to obtain the fuzzy density, and compares two typical methods. Based on 11 UCI data sets, this paper conducts a comparative experiment on several information fusion methods. It is compared with references 4 and 5...
To address the problem of high false positive and missed detection rates in network intrusion detection, an intrusion detection method based on a clustering algorithm is proposed in this paper. This method applies the Fuzzy C-means clustering algorithm to the detection of network intrusion. By building an intrusion detection model, it carries out fuzzy partitioning and clustering of the data, and this...
The data mining field develops methods and techniques for assigning useful meaning to data stored in databases. It gathers research from many fields of study, such as machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, data visualization and grids. Data mining represents a set of specific algorithms for finding...
A method of detecting intrusion using incremental SVM based on key feature selection is proposed. A center SVM summarizes the distributed samples and incorporates them to build the incremental SVM for locals. By eliminating the redundant features of the sample dataset, the space dimension of the sample data is reduced. This method can overcome the shortcomings of SVM, namely time-consuming training and...
A new clustering algorithm based on particle swarm optimization (PSO) is proposed. The main idea of the new algorithm is to solve the clustering problem using the fast search ability of particle swarm optimization. Each particle is composed of a cluster center vector and represents a possible solution of the clustering problem. To escape from local optima, a new idea is proposed, namely the neighborhood...
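The particle encoding described in this abstract (each particle holds one candidate set of cluster centers) can be sketched with a plain global-best PSO. This is a generic illustration, not the authors' algorithm: the neighborhood-based escape from local optima is truncated in the abstract and is not reproduced, and the function names, the sum-of-squared-errors fitness, and the parameter values `w`, `c1`, `c2` are all assumptions.

```python
import random


def sse(centers, points):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
        for pt in points
    )


def pso_cluster(points, k, n_particles=10, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO in which each particle is a list of k centers."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Initialise every particle with k centers drawn from the data points.
    pos = [[list(rng.choice(points)) for _ in range(k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [sse(p, points) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + personal + global pull.
                    vel[i][j][d] = (
                        w * vel[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                        + c2 * r2 * (gbest[j][d] - pos[i][j][d])
                    )
                    pos[i][j][d] += vel[i][j][d]
            fit = sse(pos[i], points)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], fit
    return gbest, gbest_fit


# Two well-separated groups of 2-D points.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, fitness = pso_cluster(points, k=2)
print(centers, fitness)
```

The returned `centers` are the best set of cluster centers found by any particle, and `fitness` is the corresponding sum of squared errors; lower is better.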
Due to rapid technological developments in image/video capturing, large-scale data storage, video compression and networking, huge amounts of video data are produced each day all over the world. Finding effective ways to store, index and retrieve these videos remains an active research area. It is especially important for the producers/editors of television programs, since to keep track of the 1000's of...
The performance of speaker identification systems has improved due to recent advances in speech processing techniques, but there is still a need for improvement in terms of text-independent speaker identification and suitable modelling techniques for voice feature vectors. It becomes difficult for a person to recognize a voice when uncontrollable noise is added to it. In this paper, feature vectors from...
Combining multiple classifier combination, sampling techniques, and more appropriate evaluation metrics, we first compare the selection of multiple classifiers combination based on GMDH (S-GMDH) with other classification methods on nine class imbalance data sets; we analyze the change in classification performance with and without sampling. Then we further do customer churn prediction on 'churn'...
Recognition of prostate calculus is an important step in determining the source of pathological organ, and is of great importance for further diagnosis of prostate cancer. In this paper, because some tissues are similar to calculus and prostate calculus usually adheres to other tissues, a recognition algorithm for prostate calculus based on transition region and PCA-SVM is proposed. Firstly, local entropy,...