Clustering is a classic topic in optimization, with k-means being one of the most fundamental such problems. In the absence of any restrictions on the input, the best known algorithm for k-means with a provable guarantee is a simple local search heuristic yielding an approximation guarantee of 9+ε, a ratio that is known to be tight with respect to such methods. We overcome this barrier...
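The local search heuristic mentioned in this abstract can be illustrated with a minimal single-swap sketch. It is a simplified illustration under stated assumptions, not the algorithm analyzed in the paper: centers are restricted to a finite candidate set (by default the input points), the function names `kmeans_cost` and `local_search_kmeans` and the `min_gain` threshold are hypothetical, and no approximation guarantee is claimed for this variant.

```python
import math


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def local_search_kmeans(points, k, candidates=None, min_gain=1e-9):
    """Single-swap local search over a finite candidate set (illustrative only)."""
    # Assumption: candidate centers default to the input points themselves.
    candidates = list(candidates if candidates is not None else points)
    centers = candidates[:k]
    cost = kmeans_cost(points, centers)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for c in candidates:
                if c in centers:
                    continue
                # Try replacing center i with candidate c.
                trial = centers[:i] + [c] + centers[i + 1:]
                trial_cost = kmeans_cost(points, trial)
                # Accept only swaps that strictly lower the cost.
                if trial_cost < cost - min_gain:
                    centers, cost = trial, trial_cost
                    improved = True
    return centers, cost
```

Calling `local_search_kmeans([(0.0,), (1.0,), (10.0,), (11.0,)], 2)` starts from the two leftmost points and swaps one of them out for a point in the right-hand group, which lowers the cost from 181 to 2.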
Multidimensional optimization holds a central role in many machine learning problems. When a model quality functional is measured with an almost arbitrary external noise, it makes sense to use randomized optimization techniques. This paper deals with the problem of clustering of a Gaussian mixture model under unknown but bounded disturbances. We introduce a stochastic approximation algorithm with...
Classification of remotely sensed data is an important task for many practical applications. However, it is not always possible to get the ground truth for supervised learning methods. Thus unsupervised methods form a valuable tool in such situations. Such methods are referred to as clustering methods. There exist several strategies for clustering the given data: K-means, density based methods,...
An approach for detection and segmentation of individual buildings on space images and aerial photos is proposed. The approach allows intuitively constructing the system of rules to select objects without prior training, using only simple geometric characteristics of their form.
A key tool for analyzing signals defined over a graph is the so-called Graph Fourier Transform (GFT). Alternative definitions of the GFT have been proposed, based on the eigen-decomposition of either the graph Laplacian or the adjacency matrix. In this paper, we introduce an alternative approach, valid for the general case of directed graphs, that builds the graph Fourier basis as the set of orthonormal vectors...
This paper distinguishes two requirements that arise when handling unlabelled data with an automated feature subset selection algorithm: the need to find the number of groups in conjunction with feature selection (FS), and the need to normalize the bias of the feature selection (FS) procedure with respect to the measurements. Here, to investigate a component determination...
MicroRNAs form a family of single strand RNA molecules having length of approximately 22 nucleotides that are present in all animals and plants. Various studies have revealed that microRNA tend to cluster on chromosomes. In this regard, a novel clustering algorithm is presented in this paper, integrating rough hypercuboid approach with fuzzy c-means. Using the concept of rough hypercuboid equivalence...
One of the most popular fuzzy clustering techniques is the fuzzy K-means algorithm (also known as fuzzy c-means or the FCM algorithm). In contrast to the K-means and K-median problems, the underlying fuzzy K-means problem has not been studied from a theoretical point of view. In particular, there are no algorithms with approximation guarantees similar to the famous K-means++ algorithm known for the fuzzy...
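For readers unfamiliar with fuzzy K-means, the following is a minimal sketch of one alternating update of the standard FCM heuristic, not of any algorithm proposed in the paper. The function name `fuzzy_c_means_step`, the fuzzifier default `m=2.0`, and the handling of points that coincide with a center are assumptions made for illustration.

```python
import math


def fuzzy_c_means_step(points, centers, m=2.0):
    """One membership update followed by one center update (illustrative)."""
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # Assumption: a point lying on a center belongs fully to it,
            # split evenly if several centers coincide with the point.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            memberships.append([z / total for z in zeros])
            continue
        # Membership is proportional to distance^(-2 / (m - 1)).
        weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(weights)
        memberships.append([w / total for w in weights])

    dim = len(points[0])
    new_centers = []
    for j, old in enumerate(centers):
        # Each center becomes the mean of the points weighted by u^m.
        w = [u[j] ** m for u in memberships]
        wsum = sum(w)
        if wsum == 0.0:
            new_centers.append(tuple(old))  # no mass assigned; keep old center
            continue
        new_centers.append(tuple(
            sum(wi * x[k] for wi, x in zip(w, points)) / wsum
            for k in range(dim)
        ))
    return memberships, new_centers
```

Repeating the step until the centers stop moving gives the usual FCM iteration. Unlike in K-means, every point keeps a fractional membership in every cluster, and each row of `memberships` sums to 1.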
Clustering streaming data has gained importance in recent years due to an expanding opportunity to discover knowledge in widely available data streams. Because streams are potentially evolving, unbounded sequences of data objects, clustering algorithms capable of fast, incremental processing of data points are necessary. This paper presents a method of clustering high-dimensional data streams...
Finding repeated patterns or motifs in a time series is an important unsupervised task that still has a number of open issues, starting with the definition of a motif. In this paper, we revise the notion of motif support, characterizing it as the number of patterns or repetitions that define a motif. We then propose GENMOTIF, a genetic algorithm to discover motifs with support which, at the same time,...
The single-linkage (SLINK) hierarchical clustering algorithm is often preferred over traditional partitioning-based clustering because it does not require the number of clusters as input. However, due to its high time complexity and inherent data dependencies, it does not scale well to large datasets. To the best of our knowledge, all existing parallel SLINK algorithms are based on the traditional...
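As a point of reference for the scaling issue described above, here is a naive sequential sketch of single-linkage clustering. It is not the SLINK pointer-representation algorithm or any parallel variant from the paper: it sorts all pairwise distances and merges components with a union-find structure, which is equivalent to building a minimum spanning tree. The function name `single_linkage_merges` is hypothetical.

```python
import math
from itertools import combinations


def single_linkage_merges(points):
    """Return the merges (distance, a, b) of a single-linkage dendrogram."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances: quadratic in the number of points.
    edges = sorted(
        (math.dist(points[a], points[b]), a, b)
        for a, b in combinations(range(len(points)), 2)
    )
    merges = []
    for d, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            # The closest pair across two clusters defines their merge height.
            parent[rb] = ra
            merges.append((d, a, b))
    return merges
```

The result is a list of n-1 merges in increasing distance order. No cluster count is needed up front, and a flat clustering is obtained afterwards by cutting the list at a chosen distance. The quadratic number of pairwise distances is what makes this naive form expensive on large inputs.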
Clustering is the most effective method for identifying outliers in UCI Repository datasets. This paper proposes detecting outliers in UCI datasets using the Adaptive Rough Fuzzy C-Means clustering algorithm. In the first phase of the Adaptive Rough Fuzzy C-Means algorithm, the Rough K-means algorithm is used to pre-process the UCI Repository dataset, and it normally identifies the outliers...
With the growing number of data clouds in different geographical areas, the availability of a datacenter and the cost of using it are two factors of concern to cloud users. The present research aims to present a method that uses K-means clustering and the NSGA-II multi-objective algorithm to maximize availability and minimize cost in selecting a datacenter. The proposed approach was applied to some real...
An important step in the appearance preservation of real materials is the analysis of how they interact with light. Since this phenomenon happens at a microscopic level, heuristics of varying complexity have been developed to capture and reproduce it. In order to minimize sampling efforts, one of these approaches consists in representing the reflectance of a material as a linear combination of...
Clustering is a subject of statistical data analysis studied across disciplines. In this study, among various types of clustering algorithms, the algorithms derived from Density Based Spatial Clustering of Applications with Noise (DBSCAN) are investigated. Although DBSCAN is a well-known density-based algorithm, it has some bottlenecks. Therefore, enhanced versions of DBSCAN have been developed to provide some...
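For context, the baseline DBSCAN procedure that these enhanced versions build on can be sketched as follows. This is a minimal illustration of the standard algorithm with a brute-force neighbor search, not any of the variants surveyed in the study. The names `dbscan`, `eps`, `min_pts`, and `NOISE` are chosen here for illustration.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks noise points."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # Brute-force region query: all points within eps of point i.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point; keep expanding
        cluster += 1
    return labels
```

The brute-force `neighbors` query makes this sketch quadratic in the number of points, and a single global `eps` handles clusters of differing density poorly. These are the kinds of limitations that refinements of DBSCAN commonly target.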
Kriging or Gaussian Process Regression has been successfully applied in many fields. One of the major bottlenecks of Kriging is the complexity in both processing time (cubic) and memory (quadratic) in the number of data points. To overcome these limitations, a variety of approximation algorithms have been proposed. One of these approximation algorithms is Optimally Weighted Cluster Kriging (OWCK)...
Spectral clustering is a powerful approach for clustering, with applications across multiple disciplines, including bioinformatics. However, the way its computational complexity scales limits its application in analyzing large datasets. This complexity can be reduced using the Nyström method, which subsamples the input data in a way that preserves its representational diversity. There are different...
Spectral clustering has shown superior performance in analyzing cluster structure. However, its exponential computational complexity limits its application to large-scale data. To tackle this problem, many low-rank matrix approximation algorithms have been proposed, of which the Nyström method is an approach with provably lower approximation errors. The algorithms commonly combine two powerful...
In this work a new method for data clustering based on principal curves is presented. Principal curves consist of a nonlinear generalization of Principal Component Analysis and may also be regarded as continuous versions of 1-D self-organizing maps. The proposed method divides the principal curves extracted by the k-segments algorithm into two or more curves, according to the number of clusters defined...
Data mining is a method for extracting useful information from data, but classical data mining approaches cannot be applied directly to big data because of their sheer complexity. The data generated by numerous scientific applications and integrated environments has grown rapidly in recent years, not only in size but also in variety. The data collected is of very...