Distributed data mining techniques, and mainly distributed clustering, have been widely used in the last decade because they deal with very large and heterogeneous datasets which cannot be gathered centrally. Current distributed clustering approaches normally generate global models by aggregating local results that are obtained on each site. While this approach mines the datasets on their locations...
Owing to cost and time limits, the number of samples is usually small in the early stages of manufacturing systems. When the amount of available data is small, traditional statistical techniques have difficulty producing robust analyses. Therefore, based on a unimodal distribution assumption, many researchers have proposed virtual sample generation methods to expand the training sample...
In recent years, special interest has been paid to the solution of the sector design problem. The airspace is partitioned into sectors, each of them being controlled by a group of controllers. Airspace sectors should be designed cautiously, ensuring that no sector is overloaded during the day. The objective of an airspace design process is to adapt the airspace according to the evolution of the...
Smart grid technology receives considerable emphasis in future power systems worldwide. Nowadays, the widely used Automatic Meter Reading (AMR) technology in Finland makes it possible to collect customers' hourly load measurements and to use data analysis methods for customer clustering and load prediction purposes. This paper addresses the detection of possible changes in customers' behavior. This could...
Many popular web service networks are content-rich in terms of heterogeneous types of entities and links, associated with incomplete attributes. Clustering such heterogeneous service networks demands new clustering techniques that can handle two heterogeneity challenges: (1) multiple types of entities co-exist in the same service network with multiple attributes, and (2) links between entities have...
In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrarily shaped clusters. CODAS is a two-stage process using a simple local density to initiate micro-clusters which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical...
This paper focuses on detecting and classifying pole-like objects from point clouds obtained in urban areas. To achieve our goal, we propose a system consisting of three stages: localization, segmentation and classification. The localization algorithm based on slicing, clustering, pole seed generation and bucket augmentation takes advantage of the unique characteristics of pole-like objects and avoids...
Data mining is a field of computer science and information technology that deals with the discovery of hidden or interesting patterns in a large or complex database. As the dimensions of databases are growing rapidly, it is necessary to analyze huge amounts of information. Nowadays, there are many applications that generate streaming data, i.e. a sequence of objects that are arriving...
Clustering is a data mining technique in which similar objects are grouped together to form clusters, such that the objects in one cluster differ from the objects in other clusters. This paper describes some clustering techniques, such as partitional, hierarchical, grid-based and density-based techniques, and their algorithms. The partitional method divides the data set into...
In this paper, a fuzzy clustering method based on Fuzzy c-Means clustering (FCM) and Evolutionary Strategies (ES) is proposed for handwritten Farsi character recognition. Experimental results showed that this algorithm not only accurately represents the ambiguity of handwritten characters but also outperforms classical crisp-based methods, especially when word recognition is the main concern.
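For readers unfamiliar with the FCM component named in the abstract above, the following is a minimal sketch of standard fuzzy c-means in plain Python. It is not the authors' FCM and Evolutionary Strategies hybrid, which the abstract does not specify; the function name `fuzzy_c_means`, its parameters (`m`, `iters`, `tol`, `seed`) and the toy data are illustrative assumptions.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative; not the FCM+ES method of the paper).

    points: list of equal-length numeric tuples
    c:      number of clusters
    m:      fuzzifier (> 1); larger values give softer memberships
    Returns (centers, u) where u[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u[i][k] ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update from relative distances to all centres.
        shift = 0.0
        for i in range(n):
            dists = [
                max(sum((points[i][d] - centers[k][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for k in range(c)
            ]
            new_row = [
                1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                          for j in range(c))
                for k in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once memberships have stabilised.
        if shift < tol:
            break

    return centers, u


# Toy usage: two well-separated groups of 2-D points (made-up data).
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_centers, toy_u = fuzzy_c_means(toy, c=2)
```

Unlike crisp methods such as k-means, each point receives a graded membership in every cluster, which is the property the abstract relies on to represent ambiguous handwritten characters.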
We present an efficient method for extracting a shadow model from a scene by exploiting the HLS color components. The algorithm allows target shapes to be recovered in diurnal scenes for improved identification. It is based on the realization of a General Bitmap Model and a more particular Strip Bitmap Model to identify shadow regions. Each pixel in the image is classified as shadow or not by a minimum distance...
Segmentation is the process of partitioning an image into several objects. It plays a vital role in many fields such as satellite, remote sensing, object identification, face tracking and, most importantly, medical applications. In this paper, we propose a novel image segmentation method using an iterative partitioning mean shift clustering algorithm, which overcomes the drawbacks of conventional...
In this paper, we propose a factor weighted fuzzy c-means clustering algorithm. Based on the inverse of a covariance factor, which assesses the collinearity between the centers and samples, this factor also takes into account the compactness of the samples within clusters. The proposed clustering algorithm can classify spherical and non-spherical structural clusters, contrary to classical fuzzy...
Constrained Random Verification (CRV) is becoming the mainstream methodology for the functional verification of complex System on Chip (SoC) designs. In CRV, constraint satisfaction problem (CSP) solvers are used to generate the input stimulus required for verification. In order to achieve verification closure, CRV tools have to produce multiple different solutions, distributed uniformly in the...
Technological advancement has enabled us to store and process huge amounts of data in a much shorter span of time. The term "Big Data", now used frequently in industrial and research circles, simply refers to huge amounts of data. The focus here is not just the collection of data but the careful analysis of the collected data so that meaningful results can be obtained...
Associative memory is one of the significant and effective functions in communication. Conventionally, several types of artificial associative memory models have been developed. In the field of psychology, it is known that human memory and emotions are closely related to each other, as in the mood-congruency effects. In addition, emotions are sensitive to sympathy for facial expressions of communication...
The connected load of consumers is known to the distribution utility, but their usage pattern is not known without smart meters installed on site. Furthermore, the constituents of the feeders at the primary distribution level are also unknown. In partially deregulated developing countries, implementing a Time of Use tariff becomes a challenging task. This paper addresses this crucial issue where...
Whole brain tractography generates a very large dataset composed of tracts of different shapes, lengths and positions. Clustering them into anatomically meaningful bundles is a challenge. Until now, several clustering methods have been proposed, such as methods based on similarity measures or methods based on anatomical information, but no optimal clustering criteria have been found yet. All methods...
This paper presents an approach to classification based on neighborhood expansion. The proposed algorithm can (1) automatically find the number of clusters, and (2) classify irregular data sets. In the approach, we first define the distance between a point and a set, then the neighborhood of a data set. The algorithm can begin with any point in the data set and expand the point to a...
In light of the one-sidedness of commonly used power load classification algorithms caused by a single similarity function, and the defects of these algorithms, which have special requirements for the data space distribution and easily fall into locally optimal solutions, this paper proposes a new electric power load classification algorithm. The algorithm first proposes a dual-scale similarity function...