In 2010, we proposed the improved unsupervised possibilistic clustering algorithm (IUPC), which can be run as an unsupervised clustering method and overcomes the tendency of the unsupervised possibilistic clustering algorithm (UPC) to generate coincident clusters. IUPC inherits the merits of UPC. At the same time, IUPC solves the coincident clusters problem of UPC by limiting the feasible regions...
With the advent of modern techniques for scientific data collection, large quantities of data are accumulating in various databases. Systematic data analysis methods are necessary to extract useful information from rapidly growing data banks. Cluster analysis is one of the major data mining methods, and the k-means clustering algorithm is widely used in many practical applications. But the...
In this paper a clustering algorithm has been presented for data sets having faces with large variations in pose. Disjoint clusters are created from low-dimensional subspaces of the data set. Partitioning is carried out in the form of a tree-like structure. The subspace-based linear recognition algorithm, Subclass Linear Discriminant Analysis (SLDA) has been employed for recognizing the faces. The...
A high rate of expression of Endothelin protein in the placental cell is strongly regulated by inhalation of tobacco smoke and leads to placental abnormalities that can result in birth failure. Our application, developed using Image Processing, the Nearest Neighbor algorithm (NN) and Genetic Algorithms (GA), automates the study of these proteins to assist pathologists and lab technicians in achieving a more...
The fuzzy clustering model is an essential tool for finding the proper cluster structure of given data sets in pattern and image classification. In this paper, a new weighted fuzzy C-Means (NW-FCM) algorithm is proposed to improve the performance of both FCM and FWCM models for high-dimensional multiclass pattern recognition problems. The methodology used in NW-FCM is the concept of weighted mean from the...
Recently, liquid chromatography coupled to mass spectrometry (LC-MS) has become a standard technique for identifying differential abundance of peaks as biomarkers. Two major problems in the preprocessing of LC-MS data analysis are how to adjust and align multiple LC-MS datasets efficiently and correctly. Hence, an effective algorithm is needed to adjust the variation in retention time and align protein...
Data mining has been defined as "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data". Clustering is the automated search for groups of related observations in a data set. The K-Means method is one of the most commonly used clustering techniques for a variety of applications. This paper proposes a method for making the K-Means algorithm...
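For readers unfamiliar with the baseline that several of these abstracts build on, the following is a minimal sketch of the standard K-Means (Lloyd's) iteration in plain Python. It is not the modified method any of the listed papers propose; the function name `kmeans`, the sample points, and the choice of starting centers are assumptions made for illustration only.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's iteration: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest center for every point.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its center.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(old)
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    return centers, labels

# Hypothetical toy data: two well-separated groups in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = kmeans(points, centers=[points[0], points[3]])
```

The result depends on the starting centers passed in, which is why initialization is a recurring topic in the K-Means literature.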
Data mining has become an important topic in effective analysis of gene expression data due to its wide application in the biomedical industry. Within a gene expression matrix there are usually several particular macroscopic phenotypes of samples. Selection of genes most relevant and informative for certain phenotypes is an important aspect in gene expression analysis. Currently most of the research...
VDBSCAN is a well-known density-based clustering algorithm. Handling highly dense data points is a challenging task in clustering. The VDBSCAN algorithm handles data points of widely varied density well and also overcomes the problem of noise and outliers. But this algorithm depends on the input parameters Eps and Minpts. The careful selection of these input parameters plays an important role in proper...
This paper formulates, simulates and assesses an improved data clustering algorithm for mining web documents with a view to preserving their conceptual similarities and eliminating the problem of speed while increasing accuracy. The improved data clustering algorithm was formulated using the concept of the K-means algorithm. Real and artificial datasets were used to test the proposed and existing algorithm...
Experiments are carried out on datasets with different dimensions selected from UCI datasets by using two classical clustering algorithms. The results of the experiments indicate that when the dimensionality of the real dataset is less than or equal to 30, the clustering algorithms based on distance are effective. For high-dimensional datasets (dimensionality greater than 30), the clustering algorithms...
This paper compares hard and soft centroid updating for clustering Y-STR data. The hard centroids are represented by the New Fuzzy k-Modes clustering algorithm, whereas the soft centroids are represented by the k-Population algorithm. The two algorithms are evaluated on two datasets, Y-STR haplogroups and Y-STR Surnames. The results show that the soft centroid performance is better than the hard centroid...
This paper proposes the optimal K nearest neighbors (KNN) positioning algorithm via a theoretical accuracy criterion (TAC) in wireless LAN (WLAN) indoor environments. As far as we know, although the KNN algorithm is widely utilized as one of the typical distance-dependent positioning algorithms, the optimal selection of neighboring reference points (RPs) involved in KNN has not been significantly analyzed...
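As background for the abstract above, here is a minimal sketch of ordinary KNN fingerprint positioning with a fixed, hand-picked K. It does not implement the TAC-based selection of K that the paper proposes; the function name `knn_position`, the radio map values, and the use of Euclidean distance over signal-strength vectors are all assumptions for illustration.

```python
import math

def knn_position(rss, fingerprints, k=3):
    """Estimate a 2-D position as the mean location of the k reference
    points whose stored RSS vectors are closest (Euclidean) to `rss`."""
    nearest = sorted(fingerprints, key=lambda fp: math.dist(rss, fp[0]))[:k]
    xs = [loc[0] for _, loc in nearest]
    ys = [loc[1] for _, loc in nearest]
    return (sum(xs) / len(nearest), sum(ys) / len(nearest))

# Hypothetical radio map: (RSS vector in dBm from 3 access points, (x, y) in metres).
radio_map = [
    ((-40.0, -70.0, -80.0), (0.0, 0.0)),
    ((-45.0, -65.0, -78.0), (2.0, 0.0)),
    ((-50.0, -60.0, -75.0), (4.0, 0.0)),
    ((-80.0, -45.0, -50.0), (10.0, 6.0)),
]
estimate = knn_position((-44.0, -66.0, -79.0), radio_map, k=2)
```

With k=2 the two closest reference points are the first two entries of the map, so the estimate lands midway between them. Changing k changes which reference points are averaged, which is the sensitivity the abstract refers to.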
As a method built upon spectral graph theory, spectral clustering has the advantages of processing data of any spatial shape and converging on globally optimal solutions. But it suffers from two defects: the clustering result is quite sensitive to its parameters, and the number of clusters must be prespecified. In this paper, a novel approach which integrates the grey relational analysis based...
In this paper, we integrate symmetric NMF and normalized cut into a single clustering framework and derive the computational algorithm. Another contribution is to provide a new matrix inequality which is useful for the analysis of 4-th order matrix polynomials. We perform experiments on three real-life data sets to show the effectiveness of the proposed algorithm. We also demonstrate the importance...
Magnetic Resonance Imaging (MRI) is one of the best technologies currently being used for diagnosing brain tumors. Brain tumors are diagnosed at advanced stages with the help of the MRI image. Segmentation is an important process for extracting suspicious regions from complex medical images. Automatic detection of brain tumors through MRI can provide the valuable outlook and accuracy of earlier brain tumor...
Finding similar crime case subsets is an important task for intelligence analysts in crime investigation. It can not only provide multiple clues for solving crimes but also make catching the criminals more efficient. However, the conventional approach of querying specific attributes in relational databases has two defects: first, it is relatively inefficient when many incidents have to be handled;...
Traditional clustering is a powerful technique for revealing the "hot" topics among documents. However, it struggles to discover new types of events that emerge gradually. In this paper, we propose a novel model for detecting new clusters from time-streaming documents. It consists of three parts: the cluster definition based on Multi-Representation Index Tree (MI-Tree), the new cluster detecting...
In traditional e-commerce websites, social tags are used in product classification only, and not applied in the domain of personalized recommendation technology. In this paper, we propose a personalized recommendation model based on social tags. We build a user interest model for products by reflecting user interest and product features directly through social tags, and optimize the interest model...
Searching for initial centers in high-dimensional space is an interesting and important problem that is relevant to a wide variety of K-Means algorithms. However, this is a very difficult problem, due to the "curse of dimensionality" and the inherently sparse data. Algorithm IMSND is one of the latest initialization methods that are based on the idea of sharing neighborhood density. Concerning...