In this paper, we propose a novel model for unsupervised Chinese event extraction. We use a multi-information fusion technique to combine two kinds of information for the knowledge representation of event instances: language features and structure information. Then we apply our proposed XLS-means clustering algorithm to group the candidate event instances into a "natural" number of clusters,...
Scientific and technical literature is a useful resource from which people can extract interesting knowledge or patterns with text mining tools. Text mining technologies have been widely used to reveal topics and the structure of topics. In this paper, the selected articles, in the form of textual data, are first represented as a network structure, and then a text clustering algorithm is applied...
This paper studies the problem of extracting data from large numbers of semi-structured web pages. The fact that many websites have enormous numbers of pages generated dynamically from an underlying structured source such as a database makes it feasible to induce a common template for similar web pages and then extract data accordingly. Previous work on this problem has limited practical utility because of either...
Wireless Sensor Networks (WSNs) consist of small nodes with sensing, computation, and wireless communication capabilities. A WSN is a promising data mining solution for precision agriculture. With wireless sensors in place, it becomes possible to monitor plants in real time, including conditions such as air temperature, soil water content, and nutrition stress. This real-time information...
The K-Means algorithm is one of the most widely used clustering algorithms for knowledge discovery in data mining. Seed-based K-Means integrates a small set of labeled data (called seeds) into the K-Means algorithm to improve its performance and overcome its sensitivity to initial centers. These centers are, most of the time, generated at random or assumed to be available for each cluster....
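The seeding idea described above can be illustrated with a minimal sketch. It assumes that every cluster has at least one labeled seed and that the initial centers are simply the means of those seeds; the function name `seeded_kmeans` and its parameters are hypothetical and are not taken from the paper.

```python
def _mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def _dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def seeded_kmeans(points, seeds, max_iter=100):
    """K-Means whose initial centers come from labeled seeds.

    points: list of tuples to cluster.
    seeds:  dict mapping a cluster label to a non-empty list of seed points
            (assumption: each cluster has at least one seed).
    Returns (labels, centers), with one label per point.
    """
    labels = sorted(seeds)
    # Initial centers are the seed means instead of random draws.
    centers = [_mean(seeds[label]) for label in labels]
    assign = None
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        new_assign = [
            min(range(len(centers)), key=lambda k: _dist2(p, centers[k]))
            for p in points
        ]
        if new_assign == assign:
            break  # converged: no point changed cluster
        assign = new_assign
        # Update step: recompute each center from its current members.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centers[k] = _mean(members)
    return [labels[a] for a in (assign or [])], centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = {"a": [(0, 0)], "b": [(10, 10)]}
labels, centers = seeded_kmeans(points, seeds)
print(labels)
```

Because the centers start from the seeds, the result no longer depends on a random initialization; how the paper handles clusters without seeds is not stated in the excerpt and is not modeled here.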
There is an emerging focus on real-time data stream analysis on mobile devices. A wide range of data stream processing applications are targeted to run on mobile handheld devices with limited computational capabilities; examples include patient monitoring, driver monitoring, real-time analysis and visualization for emergency and disaster management, real-time optimization for courier pick-up and delivery...
An IDS (Intrusion Detection System) is an active and driving defense technology. This paper focuses mainly on intrusion detection based on data mining. The aim is to improve the detection rate and decrease the false alarm rate, and the main research method is clustering analysis. An algorithm and a model for intrusion detection are proposed, and corresponding simulation experiments are presented. First, a method to reduce...
The partition clustering problem is one of the hardest problems in current research, and it is also a basic problem in data mining. In most cases, partition clustering is an NP problem. This paper introduces an algorithm that solves the partition clustering problem in polynomial time by using a reduction technique. This algorithm is available after...
In data mining today, clustering of large datasets is one of the most important techniques and is widely applied in many areas, including social network analysis. Applying a more specific pre-processing method to prepare the data for clustering algorithms is considered a significant step toward generating meaningful segments. In this paper we propose an innovative clustering technique...
Despite the great interest that social networks have attracted in Business Science, they are rarely analyzed in both a global and a systematic way in this field: most authors focus on parts of the studied network, or on a few nodes considered individually. This could be explained by the fact that practical extraction of social networks is a difficult and costly task, since the specific relational data...
Spectral clustering is a data mining method used for finding patterns in high-dimensional datasets. It has been applied effectively to solve many problems in signal processing, bioinformatics, etc. In this paper spectral clustering was implemented to find students' patterns of behavior in an e-learning system, to explore the relationship between the similarity of students' behavior and their academic...
This paper proposes a new method to cluster law texts based on the referential relations between laws. We extract law entities (an entity represents a law) and their referential relations from law texts. Then the SimRank algorithm is applied to calculate the similarity of law entities through their referential relations, and law clustering is carried out based on the SimRank similarity. This is the first application of SimRank...
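To make the similarity step above more concrete, here is a minimal sketch of the standard iterative SimRank computation on a small directed graph. It assumes that an edge `(src, dst)` means "law `src` refers to law `dst`" and that two laws are similar when they are referenced by similar laws; the edge direction, the decay value, the iteration count, and the name `simrank` are all assumptions for illustration, not details from the paper.

```python
def simrank(edges, decay=0.8, iterations=10):
    """Iterative SimRank over a directed graph.

    edges: iterable of (src, dst) pairs; here assumed to mean
           "law src refers to law dst".
    Returns a dict mapping (a, b) to the similarity score of a and b.
    """
    edges = set(edges)  # ignore duplicate references
    nodes = sorted({n for edge in edges for n in edge})
    # In-neighbors of a node: the laws that refer to it.
    in_nb = {n: [] for n in nodes}
    for src, dst in edges:
        in_nb[dst].append(src)

    # Start from the identity: a node is only similar to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0
                elif not in_nb[a] or not in_nb[b]:
                    # A node nobody refers to has no basis for similarity.
                    new[(a, b)] = 0.0
                else:
                    total = sum(sim[(i, j)] for i in in_nb[a] for j in in_nb[b])
                    new[(a, b)] = decay * total / (len(in_nb[a]) * len(in_nb[b]))
        sim = new
    return sim


# Hypothetical example: law L1 refers to both law A and law B.
scores = simrank([("L1", "A"), ("L1", "B")])
print(scores[("A", "B")])
```

The resulting pairwise scores could then serve as the similarity matrix for any clustering method; which clustering algorithm the paper uses is not stated in the excerpt.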
Currently, a large number of clustering algorithms are available for data mining. However, it is difficult for people who know little about data mining to select an appropriate clustering algorithm. To solve this problem, in this paper we first comprehensively analyze a number of clustering algorithms, then summarize their evaluation criteria and apply the so-called fuzzy...
Cluster analysis is becoming increasingly essential in the data mining field and is mainly used to discover valuable data distributions and data patterns in the underlying data. Based on pheromone studies of the basic clustering model, the theory of information entropy, and two classical clustering analysis algorithms, a pheromone-based K-means algorithm is first presented. The algorithm works...
Clustering is one of the important methods in the field of data mining. K-Means is a typical distance-based clustering algorithm; 2-tier clustering should implement scalable clustering by means of dividing, sampling, and knowledge integration. Among the tools for distributed processing, Map-Reduce has been widely embraced by both academia and industry. Hadoop is an open-source parallel and distributed...
Detecting intrusion attacks is an important issue for the security of network communication. Clustering-based methods are often proposed to cope with the problem of intrusion detection. However, detecting unknown intrusion attacks within stream data has become a challenge. In this paper, we consider the intrusion attacks as outliers and propose a novel approach (called DOExMiCluster)...
A robust mixture model-based clustering algorithm using genetic techniques is proposed in this paper. In many engineering and application domains, noisy samples and outliers often exist in data collections, and they degrade the performance of data mining methods that are not made aware of them. Classical probabilistic mixture-based clustering is known to be very sensitive to...
Possibilistic fuzzy c-means (PFCM) [1] has been proposed for clustering unlabeled data. It is a hybridization of possibilistic c-means (PCM) and fuzzy c-means (FCM). Numerical examples on low-dimensional data sets have shown that PFCM is able to solve the noise sensitivity issue in FCM while also helping to avoid the coincident clusters problem in PCM. In this paper,...
In this paper we introduce a new algorithm for multi-focus image fusion based on edge information of the source images and K-means segmentation. The basic idea is to extract the edge information of the source images, divide the images into blocks, and then select the blocks with higher edge information to construct the resultant fused image. The pixels in the unallocated blocks of the fused image are...
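The block-selection idea above can be sketched as follows, under several assumptions: images are grayscale nested lists of equal size, "edge information" is approximated by the sum of absolute differences between neighboring pixels inside a block, and ties simply fall back to the first image. The excerpt does not say how the paper measures edges or how it fills unallocated blocks, so neither is reproduced here; the names `edge_energy` and `fuse_blocks` are hypothetical.

```python
def edge_energy(img, r0, c0, size):
    """Sum of absolute horizontal and vertical pixel differences in one block.

    This is only a stand-in edge measure (assumption), not the paper's.
    """
    rows = range(r0, min(r0 + size, len(img)))
    cols = range(c0, min(c0 + size, len(img[0])))
    total = 0
    for r in rows:
        for c in cols:
            if c + 1 in cols:
                total += abs(img[r][c + 1] - img[r][c])
            if r + 1 in rows:
                total += abs(img[r + 1][c] - img[r][c])
    return total


def fuse_blocks(img_a, img_b, size=2):
    """Build a fused image by copying, per block, the source with more edges.

    Ties go to img_a as a placeholder choice (assumption).
    """
    h, w = len(img_a), len(img_a[0])
    fused = [[0] * w for _ in range(h)]
    for r0 in range(0, h, size):
        for c0 in range(0, w, size):
            energy_a = edge_energy(img_a, r0, c0, size)
            energy_b = edge_energy(img_b, r0, c0, size)
            src = img_a if energy_a >= energy_b else img_b
            for r in range(r0, min(r0 + size, h)):
                for c in range(c0, min(c0 + size, w)):
                    fused[r][c] = src[r][c]
    return fused


# Toy example: img_a is "sharp" on the left half, img_b on the right half.
img_a = [[0, 9, 5, 5],
         [9, 0, 5, 5]]
img_b = [[5, 5, 0, 9],
         [5, 5, 9, 0]]
print(fuse_blocks(img_a, img_b, size=2))
```

In the toy example the left block is taken from `img_a` and the right block from `img_b`, since each has the higher edge energy there.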
The anticipated uptake of Cloud computing, built on the well-established research fields of Web services, networks, utility computing, distributed computing and virtualisation, will bring many advantages in cost, flexibility and availability for service users. These benefits are expected to further drive the demand for cloud services, increasing both the cloud customer base and the scale of cloud...