In this paper, we consider a special multi-source data clustering problem in which data points from the same source cannot be grouped into the same cluster, known as the cannot-link (CL) constraint, and the sizes of the generated clusters are subject to maximum thresholds. No prior information is given about the level of clutter (i.e., noisy data) or the number of clusters. Particularly, the clusters...
This paper presents a new sequential clustering algorithm based on sequential hard c-means clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Sequential hard c-means is one of the typical, conventional sequential clustering methods. The proposed new sequential clustering algorithm is based on Dave's noise clustering approach. A characteristic...
Fuzzy clustering techniques, especially the Fuzzy C-Means (FCM) clustering method, are widely used in image segmentation. However, because conventional FCM does not optimize data in feature space and does not involve any spatial information, it is sensitive to noise. In this paper, we present a novel FCM clustering algorithm based on kernel spatial information to segment the...
In this paper, we propose a noise detection system based on similarities between instances. Given a data set with instances that belong to multiple classes, a noise instance denotes a wrongly classified record. The similarity between differently labeled instances is determined by computing distances between them using several standard metrics. In order to ensure that this approach is computational...
Smartphones are now widely used because of their communication and multimedia processing capabilities. Smartphones provide access to a tremendous amount of sensitive business information, such as customer contacts, financial data, and intranet networks. Hence, the Internet of the future will be the mobile Internet. However, the threat of malicious software has become an important factor...
The popularity of Internet usage greatly motivates online advertising. Compared with advertising on traditional media, online advertising has rich information as well as the techniques needed to achieve precise user targeting. This rich information includes the search behavior of a user, such as the queries issued or the ads clicked. For popular websites with a large number of...
Clustering is one of the most valuable methods in the field of computational intelligence, in which sets of related objects are cataloged into clusters. Almost all of the well-known clustering algorithms require the number of clusters as input, which is hard to determine but has a significant influence on the clustering result. Furthermore, the majority are not robust enough to noisy data. In contrast, density...
In this paper, we propose a novel clustering method that combines the advantages of K-means and the density-based algorithm for discovering clusters in large spatial databases with noise (DBSCAN). The proposed method can assign pathological cells and normal cells to two separate clusters, and disturbances can also be eliminated from the image. In addition, by morphological image processing...
This paper presents a fuzzy clustering method for image segmentation based on the Local Binary Pattern (LBP) operator. Semi-supervised learning and fuzzy clustering are introduced to overcome the problem of sensitivity to the initial clustering. In addition, the local binary pattern operator is introduced to construct the spatial feature vectors of pixels, which makes full use of the spatial characteristics...
In text clustering, the feature space of high-dimensional data contains many redundant features, and even "noise" features. The author proposes a feature space correction method that combines a supervised feature selection method with K-means clustering. By analyzing the significance of the features in the clustering process and selecting the features that have more significance, to...
Of the many data clustering algorithms proposed in recent years, the most effective are the density-based clustering algorithms DBSCAN and IDBSCAN. Although density-based clustering is effective for identifying graphs, filtering out noise, and obtaining good clustering results, it is extremely time-consuming. IDBSCAN is faster than DBSCAN but is still unsatisfactory. This study therefore...
Clustering is one of the most useful methods in the intelligent engineering domain, in which a set of similar objects is categorized into clusters. Almost all of the well-known clustering algorithms require input parameters that are hard to determine but have a significant influence on the clustering result. Furthermore, the majority are not robust enough to noisy data. This paper presents an efficient...
Many clustering techniques have been proposed for the analysis of gene expression data. However, the question of which method is optimal for a given experimental dataset is still unresolved. The fuzzy c-means and kernel fuzzy c-means algorithms have been widely applied to gene expression data, but they give equal weight to genes and noise, which leads to results that are not stable or accurate. In this paper, we...
In this paper, we propose an algorithm for efficient clustering of gene expression data. The algorithm uses the concept of common neighbors and a fuzzy approach for detecting intersecting and overlapping clusters. We have also compared the algorithm with existing popular approaches and found that it gives good results in terms of the z-score measure of cluster validity and the p-value measure.
Large document collections containing multiple topics can be overwhelming to understand, requiring significant time and effort from librarians and archivists to develop access points. Efficient computational methods can aid this process by uncovering groups of documents that can be described for access. We investigate the use of density-based clustering with document segmentation to identify points of...
The purpose of medical image segmentation is to separate lesion regions that carry a special meaning from the background region. Medical images generally have complex characteristics: different regions often overlap, and region edges are vague. Research based on fuzzy clustering method for medical image segmentation, fuzzy C-means clustering and fuzzy kernel clustering methods...
With the increase in network bandwidth and the advance of 3D graphics technology, networked virtual environments (NVEs) have recently become popular. The early SIMNET and currently booming massively multiplayer online games (MMOGs), such as Second Life (SE) and World of Warcraft (WoW), are examples of NVEs. Because NVE users' interests or habits may be similar, avatars, or the representatives of NVE users,...
Because fMRI data is high-dimensional, applications such as connectivity studies, normalization, or multivariate analyses need to reduce the data dimension while minimizing the loss of functional information. In our study, we use connectivity profiles as a new functional feature to aggregate voxels into clusters. This offers two major advantages in comparison with current clustering methods. It allows the...
DBSCAN, one of the density-based clustering methods in data mining, clusters data according to its density. Although DBSCAN seems effective on small data sets, its efficiency in terms of processing time decreases as data volume grows. For this reason, DBSCAN is not considered a suitable clustering method for large...
We propose Multi-resolution Correlation Cluster detection (MrCC), a novel, scalable method for detecting correlation clusters that can analyze data with roughly 5 to 30 dimensions (axes). Existing methods typically exhibit super-linear behavior in terms of space or execution time. MrCC employs a novel data structure based on multi-resolution and gains over previous approaches in: (a) it finds...