Data clustering is a domain of data mining that never goes out of focus. Clustering data has always been an easy task, but achieving the required accuracy, precision and performance has never been so easy. K-means, being an old clustering algorithm, has been tested and experimented with thousands of times on a variety of datasets and in combination with other algorithms, owing to its robustness and simplicity, but what this...
This paper studies the design and implementation of a precision marketing system on a business platform for telecom network packages, and proposes analysis and mining technology, based on distributed processing, for massive business payment data. The mining results are then applied to the final scheme for implementing the precision marketing strategy. The scheme uses K-Means to segment users-based business...
Data analysis plays an indispensable role in the knowledge discovery process of extracting interesting patterns or knowledge for understanding various phenomena or for a wide range of applications. Visual Data Mining goes further, presenting implicit but useful knowledge from large data sets using visualization techniques to create visual images which aid in the understanding of complex, often massive representations...
Clustering methods are the key technology for analyzing electricity data, but traditional approaches have lost their agility and quality as data volume increases. To this end, this paper presents an electricity data mining structure: first, the high-dimensional data are reduced to lower dimensions; second, the reduced-dimensional results are classified into typical usage behavior...
In this digital world, we are facing a flood of data but starving for knowledge. Mining is urgently needed to extract hidden patterns from the vast amount of available data. Clustering is one such useful mining tool for handling this unfavorable situation, by carrying out the crucial steps referred to as cluster analysis. It is the process of grouping patterns into clusters based...
Microarray data play a vital role in simultaneously monitoring the expression profiles of a large number of genes under various experimental conditions. In bioinformatics research, the recognition of co-expressed and coherent patterns is a major objective of microarray data analysis. The K-means clustering algorithm is gaining popularity in the knowledge discovery domain for effectively...
Compared with conventional travel data such as GPS data, detector data and floating car data, call detail record data from cell phone communication are not only low in cost but also large in scale, which demonstrates that they are the best way to collect travel information for studying macroscopic travel activities. This paper presents a complete method to discover hot paths and travel features in a traffic network...
Frequent itemset mining is a fundamental step in the analysis of big data where correlation among the raw data is deemed necessary. In the modern era, the amount of data available for processing has grown exponentially, making it a steeper task for mining algorithms to provide solutions in a timely manner. Software implementations are normally not efficient in handling such datasets, thus focus on parallel...
Nowadays, many organizations collect large volumes of event log data on a daily basis, and the analysis of collected data is a challenging task. For this purpose, data mining methods have been suggested in past research papers, and several data clustering algorithms have been developed for mining line patterns from event logs. In this paper, we introduce an open-source tool called LogClusterC which...
Today's large-scale supercomputers are producing a huge amount of log data. Exploring various potential correlations of fatal events is crucial for understanding their causality and improving the working efficiency for system administrators. To this end, we developed a toolkit, named LogAider, that can reveal three types of potential correlations: across-field, spatial, and temporal. Across-field...
The article presents an analysis of the formalization of the clustering problem and considers the possibilities of using classical and bio-inspired methods for solving cluster analysis problems. In this paper we do not present a full review of the new clustering methods, but identify some trends in the development of cluster analysis, with special attention given to the area of bio-inspired methods...
Modern big data platforms such as Apache Hadoop and Apache Spark are able to process and analyse huge data sets, but still lack comprehensive support for spatial data analysis. Nevertheless, spatial data mining requires efficient distributed processing of big spatial data. Spatial data mining is a subclass of data mining, which mainly focuses on obtaining explicit knowledge, spatial relations and...
Multi-View Clustering models can be viewed as a way to extract information from different data representations to improve the clustering accuracy. In multi-view clustering, some views are irrelevant and, among the relevant ones, some may be more or less relevant than others. This is why most existing algorithms assign a weight to each view, aiming to compute its relevance in the clustering...
Social networks are usually analyzed and mined without taking into account the presence of missing values. In this article, we consider dynamic networks represented by sequences of graphs that change over time, and we study the robustness and the accuracy of community detection algorithms in the presence of missing edges. We assume that the network evolution can provide complementary information...
This paper introduces a new topological clustering approach to cluster high-dimensional datasets based on the t-SNE (t-distributed Stochastic Neighbor Embedding) dimensionality reduction method and spectral clustering. The spectral clustering method needs to construct an adjacency matrix and calculate the eigen-decomposition of the corresponding Laplacian matrix [1], which are computationally expensive and not easy to...
With the development of Internet technology and the arrival of the era of big data, it is necessary to analyze and mine micro video data. Analyzing micro video data can help micro video creators create better content. This paper mainly introduces the structure design, key technical points and specific implementation steps of the micro video topic recommendation system.
Water quality assessment and prediction of Lake Michigan are becoming major challenges in Northwest Indiana, USA. Traditionally, mechanistic simulation models are employed for water quality modeling and prediction. However, given the complicated nature of Lake Michigan in Northwest Indiana, the detailed simulation model is extremely simple in comparison and, at some point, additional detail exceeds...
The rapid development of wireless sensor networks (WSNs) has made it possible to collect and extract enormous amounts of data from them. A WSN is an efficient instrument that enables its users to closely monitor, understand and control application processes. A WSN consists of a huge number of heterogeneous sensor nodes spread over an extensive area that support wireless sensing and data processing...
Classification is a central problem in the fields of data mining and machine learning. Using a training set of labeled instances, the task is to build a model (classifier) that can be used to predict the class of new unlabelled instances. Data preparation is crucial to the data mining process, and its focus is to improve the fitness of the training data for the learning algorithms to produce more...
Stream mining is a trending field of research in this digital age. With the increase in the number of users of digital technologies, data is being generated exponentially, and so is the need to analyse it. This data is very large and cannot be stored for a long time, so it must be processed as soon as possible to make space for newly arriving data; to achieve this, different single-scan algorithms...