Data mining comprises data analysis techniques for extracting hidden and interesting patterns from large datasets. Clustering techniques are important for extracting knowledge from large amounts of spatial data collected from various applications, including GIS, satellite images, X-ray crystallography, remote sensing, and environmental assessment and planning. To extract...
Improving the detection and evaluation of brain tumours is an important task in the medical field. MRI is a technology that enables detection, diagnosis and evaluation. Automatic detection requires a preprocessed image, since preprocessing makes the image segmentation more accurate. Preprocessing comprises noise removal, image enhancement, artifact removal and skull stripping. Noise can...
The computation time of most clustering algorithms is affected by the number of attributes and instances. The data mining community has therefore made efforts to make the induction of clusterings efficient, and scalability is naturally a critical issue that it faces. One method to handle this issue is to use a subset of all instances. This paper suggests an...
In this paper, we present a novel modified Fuzzy C-means algorithm with symmetry information to reduce the effect of noise in brain tissue segmentation in magnetic resonance images (MRI). We integrate the brain's bilateral symmetry into the conventional Fuzzy C-means (FCM) algorithm as an additional term. In experiments, some synthetic images, and both simulated and real brain images were used to investigate the...
A new image segmentation method based on shadowed c-means clustering is proposed in this paper. By including local spatial information in the shadowed c-means algorithm and mapping the original data into a high-dimensional space via a kernel method, we propose the Kernel Spatial Shadowed C-Means (KSSCM) clustering algorithm for image segmentation problems. The KSSCM-based approach shows better performance...
Nowadays, organizations are facing several challenges when they try to analyze generated data with the aim of extracting useful information. This analytical capacity needs to be enhanced with tools capable of dealing with big data sets without making the analytical process a difficult task. Clustering is usually used, as this technique does not require any prior knowledge about the data. However,...
Graph-based manifold learning techniques have become of paramount importance for researchers faced with nonlinear data. These techniques have allowed them to discover relations that usual approaches such as PCA and MDS were incapable of finding. However, properties such as non-uniform sampling, varied topological substructures and highly curved manifolds still represent a challenge to these methods...
The class imbalance problem is a well-known classification challenge in machine learning that has vexed researchers for over a decade. Under-representation of one or more of the target classes (minority class(es)) as compared to others (majority class(es)) can restrict the application of conventional classifiers directly on the data. In addition, emerging challenges such as overlapping classes make...
Real-time applications that generate continuous flows of data streams have become more popular nowadays. Clustering data streams has therefore attracted many researchers. Most data stream clustering algorithms are based on a distance function, which finds only spherical clusters and cannot deal with noisy data. Therefore, density-based clustering algorithms substitute remarkable...
Spatio-temporal clustering is a subfield of data mining that is gaining increasing scientific attention due to advances in location-based and environmental devices that register position, time and, in some cases, other semantic attributes. This process aims to group objects based on their spatial and temporal similarity, helping to discover interesting patterns and correlations in large...
Several clustering algorithms have been extensively used to analyze vast amounts of spatial data. One of these algorithms is SNN (Shared Nearest Neighbor), a density-based algorithm that has several advantages when analyzing this type of data due to its ability to identify clusters of different shapes, sizes and densities, as well as its capability to deal with noise. Taking into account...
Density-based clustering algorithms can detect arbitrarily shaped clusters, handle outliers and do not need the number of clusters in advance. However, they cannot work properly in multi-density environments. The existing multi-density clustering algorithms have problems that limit their applicability to data streams, such as the need for the whole data set to perform clustering, two-pass clustering and high execution time...
Clustering is an important tool that has seen explosive growth in machine learning. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is one of the primary methods for clustering in data mining. DBSCAN can find clusters of variable sizes and shapes and also detects noise. The two important parameters Epsilon (Eps)...
The density-based clustering algorithm DBSCAN is a fundamental technique for data clustering with many attractive properties and applications. However, DBSCAN requires specifying all pairwise (dis)similarities among objects, which can be non-trivial to obtain in many applications. To tackle this problem, in this paper, we propose a novel active density-based clustering algorithm, named Act-DBSCAN,...
The analysis of high dimensional data comes with many intrinsic challenges. In particular, cluster structures become increasingly hard to detect when the data includes dimensions irrelevant to the individual clusters. With increasing dimensionality, distances between pairs of objects become very similar, and hence, meaningless for knowledge discovery. In this paper we propose Cartification, a new...
The popularity of internet usage greatly motivates online advertising activities. Compared to advertising on traditional media, online advertising has rich information as well as the necessary techniques to achieve precise user targeting. This rich information includes the search behaviors of a user, such as queries issued, or the ads clicked by the user. For popular websites with a large number of...
Image segmentation plays an important role in medical imaging for clinical purposes. In this paper, an image segmentation method using an ensemble of fuzzy clusterings is proposed, in which we classify the pixels in an image according to heterogeneous clustering methods, and then combine the clustering results by a KL-divergence-based fuzzy clustering algorithm to provide the final image segmentation...
In this paper, we propose a new approach for image clustering to address the adverse effects of noise present in the images. In particular, the concept of information gain has been incorporated into the classical fuzzy c-means (FCM) algorithm in order to develop a robust clustering method. FCM is associated with high sensitivity to noise and produces non-homogeneous clustering. To induce robustness...
To address the problem that pedestrian segmentation in infrared images is easily disturbed by human pose and noise, this paper presents a pedestrian segmentation algorithm for infrared images employing superpixels and a conditional random field. To accelerate the computation, the algorithm employs the simple linear iterative clustering algorithm to divide the image into some...
Subspace clustering via Low-Rank Representation (LRR) has shown its effectiveness in clustering data points sampled from a union of multiple subspaces. In the original LRR, the noise in the data is assumed to be Gaussian or sparse, which may be inappropriate in real-world scenarios, especially when the data is densely corrupted. In this paper, we aim to improve the robustness of LRR in the presence of...