Clustering analysis has broad applications in data analysis, including data mining, machine learning, and information retrieval. In practice, most clustering algorithms suffer from the effects of noise, varying densities and shapes, cluster overlaps, etc. To address these problems, we propose a simple but effective density-based clustering framework (DCF) and implement a clustering...
In this work a decision support system (DSS) for the conversion of Unified Parkinson's Disease Rating Scale (UPDRS) motor symptoms into a Hoehn & Yahr stage representation is proposed. Accurate estimation of a Parkinson's Disease patient's Hoehn & Yahr stage is of great importance since this single value is enough to represent condition, severity of symptoms and localization and disease progression...
This paper presents an exploratory data analysis to evaluate how spatial information can be used to extract homogeneous regions in hyperspectral images. The basic assumption of the linear unmixing model applied to hyperspectral images is that the pixels lie in the convex hull of the cone with the endmembers at its vertices. Several spectral unmixing algorithms look for a single convex region to depict...
Classification is one of the most important data mining techniques for data analysis. Different classification techniques are available to predict the outcome for a given dataset. There are many classification techniques for predicting and estimating accuracy; one well-known technique is the Naïve Bayes classifier. Naïve Bayes is very popular as it is easy to build, not very complex and...
Future advanced driver assistance systems and automated driving place high demands on environmental perception, especially in urban environments. Major tasks are the perception of static and dynamic elements along with the drivable area and road structures. Although these tasks can be done without an explicit representation of the ground surface, evaluations with real-world sensor data have shown...
User interaction with websites generates a large amount of web access data stored in web access logs. These data can be used in e-commerce to evaluate a website's pages as part of an effort to understand what users want. Through classification techniques in web usage mining, we conducted an experiment to categorize data obtained from the client log...
Most existing data mining algorithms have been produced manually, that is, developed by a human programmer. A prominent artificial intelligence research area is automatic programming: the generation of a computer program by another computer program. Clustering is an important data mining task with many useful real-world applications. In particular, the class of clustering algorithms...
Data mining algorithms are applied to the small volumes of data available from recorded signals of a UWB bistatic maritime FSR system for more precise target classification. A rough estimate (pre-classification) of the length and reflected energy of the target is obtained from signal records using an original CFAR processor structure for target detection and estimation in the time domain. The...
Feature selection plays an important role in machine learning. Class labels are often used as the supervision information for supervised feature selection algorithms, while constraints are rarely used. An effective feature selection algorithm with pairwise constraints, called Constraints Score, was therefore proposed. However, its performance is still limited because it neglects the correlation between features...
Software defect (bug) repositories are a great source of knowledge. Data mining can be applied to these repositories to explore useful, interesting patterns. Knowing the complexity of a bug helps the development team plan future software builds and releases. In this paper, a prediction model is proposed to predict a bug's complexity. The proposed technique is a three-step method. In the first step, fix duration...
Missing data is a well-recognized issue in data mining, and imputation is one way to handle the problem. In this paper, we propose a novel tree-based imputation algorithm called "imputation tree" (ITree). It first studies the predictability of missingness using all observations by constructing a binary classification tree called "missing pattern tree" (MPT). Then, missing values in each cluster...
One of the core technologies in smart antennas (SA) is DOA estimation. Current DOA estimation methods can be classified into three basic categories: spectrum searching algorithms, subspace algorithms and algorithms for best performance. All three categories have limitations and cannot be applied directly in a CDMA system. This paper proposes a new simple and practical method for DOA...
In this paper, a new decision tree construction algorithm (MIDT) is proposed. MIDT (Multiple Informative Decision Tree) uses principal component analysis to integrate information gain, sample distribution information and the correlation coefficient as the basis for selecting splitting attributes. This method can overcome the disadvantage of the ID3 decision tree construction method that uses information...
Naive Bayes classifiers are known for their high efficiency and good classification accuracy, and they have been widely used in many domains. However, these classifiers need complete data, and missing data is widespread in practice. To address this, methods for learning a naive Bayes classifier and for classifying with missing data are developed in this paper. Compared with...
In this paper, we propose two novel methods to identify the singer of a pop song. We focus on the single-singer, single-track case. The two methods are the "Pitch Extraction" method and the "1/12 OFCC" method. The Pitch Extraction method is composed of three stages: singing pitch estimation, exact pitch calculation, and GMM classification. "1/12...
This paper describes an application of fuzzy linear regression analysis to land-cover classification of Landsat TM data. The reflectances of the spectral bands for each land-cover class are treated as fuzzy numbers, and a fuzzy linear regression identification model and a fuzzy linear regression estimation model are established and their application results are discussed. The proposed method has...
The area under the Receiver Operating Characteristic curve (AUC) has been successfully applied to binary-class tasks. However, extending it to multi-class problems has proven difficult due to several practical issues. So far, relatively little work has addressed this generalization, and the existing results are far from ideal. In this paper, a new method is presented to estimate AUC for multi-class problems, which not...
To address the low classification accuracy caused by poor distribution estimation when training a naive Bayes document classifier on word clusters, we build a sequential word list based on the mutual information between words and their semantic cluster labels, then construct a sample set of the same size as the word list through bootstrap sampling and use the average of the corresponding...
This paper proposes a novel feature ranking method, DensityRank, based on kernel estimation on the feature spaces to improve the classification performance. As the availability of raw data in many of today's applications continues to grow at an explosive rate, it is critical to assess the learning capabilities of different features and select the important subset of features to improve learning accuracy...
Atlas-based segmentation is a well-known method for automatically computing a segmentation. When multiple atlases are available, each atlas can be used to compute a 'label', which is an estimate of the ground truth segmentation of a target image. By combining these labels, a more accurate approximation of the ground truth segmentation can be made. A common method to combine labels is the STAPLE...