The main goal of extracting knowledge from databases is to help the user give semantics to data and to optimize information search. Unfortunately, this fundamental constraint is not taken into account by almost any of the approaches to knowledge discovery. Indeed, these approaches generate a large number of rules that are not easily assimilated by the human brain. In this paper, we propose a new approach...
Cluster analysis is one of the main analytical methods in data mining, and the choice of clustering algorithm directly influences the clustering results. This paper discusses the standard k-means clustering algorithm and analyzes its shortcomings, such as that the k-means clustering algorithm has to calculate the distance between each data object and all cluster...
Clustering documents enables the user to have a good overall view of the information contained in the documents. Most classical clustering algorithms assign each data point to exactly one cluster, thus forming a crisp partition of the given data, but fuzzy clustering allows for degrees of membership with which a data point belongs to different clusters. In this system, documents are clustered by using fuzzy c-means...
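The "degrees of membership" mentioned in the abstract above can be illustrated with the standard fuzzy c-means membership formula. This is a minimal sketch, not the system described in the paper: the function name `fuzzy_memberships`, the fuzzifier `m=2.0`, and the use of Euclidean distance are assumptions made for illustration.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Return the degree of membership of `point` in each cluster.

    Standard fuzzy c-means update (assumed here, not taken from the paper):
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i the distance
    from the point to center i and m > 1 the fuzzifier.
    """
    distances = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(distances) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]
```

Unlike a crisp partition, the returned values are fractions that sum to 1, so a document close to two centers receives a substantial membership in both clusters.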
We address the problem of Kannada character recognition, and propose a recognition mechanism based on k-means clustering. The large dataset of Kannada characters and their similarity makes the problem one order of magnitude more difficult than for a standard language like English. We propose a segmentation technique to decompose each character into components from 3 base classes, thus reducing the...
Clustering, an important technique of data mining, groups similar objects together and identifies the cluster to which each object of the domain being studied belongs. In this paper we propose a clustering algorithm which produces quite accurate clusters using the bottom-up approach of hierarchical clustering for data with categorical attributes. A similarity measure has been proposed...
Tracking is a major issue of virtual and augmented reality applications. Single object tracking on monocular video streams is fairly well understood. However, when it comes to multiple objects, existing methods lack scalability and can recognize only a limited number of objects. Thanks to recent progress in feature matching, state-of-the-art image retrieval techniques can deal with millions of images...
Many problems still exist in the practice of awarding student grants in higher education, especially regarding the standard for identifying poor students. To address this issue, we use data from the campus smart card system of a certain college and apply cluster analysis to derive groups of poor students with different characteristics. Different groups have adopted...
Clustering organizes text in an unsupervised fashion. In this paper, we propose an algorithm for the fuzzy clustering of text documents using the naive Bayesian concept. Fuzzy clustering implies that the text documents are assigned to multiple clusters, ranked in descending order of probability. The Vector Space Model is used to represent our dataset as a term-weight matrix. In any natural language,...
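The term-weight matrix of the Vector Space Model mentioned in the abstract above can be sketched as follows. The abstract does not say which weighting scheme the authors use, so this example assumes raw term counts and whitespace tokenization; the function name `term_weight_matrix` is hypothetical.

```python
from collections import Counter

def term_weight_matrix(documents):
    """Build a document-by-term matrix of raw term counts.

    Assumption: weights are plain term frequencies; the paper may use a
    different scheme (for example TF-IDF).
    """
    tokenized = [doc.lower().split() for doc in documents]
    # One column per distinct term, in sorted order for reproducibility.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix
```

Each row represents one document and each column one vocabulary term, which is the form a clustering algorithm would consume.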
This paper presents an applied study in data mining and knowledge discovery. It aims at discovering patterns within historical students' academic and financial data at UST (University of Science and Technology) from the year 1993 to 2005 in order to help improve academic performance at UST. Results show that these rules concentrate on three main issues: students' academic achievements (successes...
K-means clustering is an important algorithm for identifying the structure in data. K-means is one of the simplest clustering algorithms. It takes a predefined number of clusters as input. The original algorithm is based on random selection of cluster centers and iterative improvement of the results. However, there are two major limitations in this approach. First, the need for number of clusters in...
Density-based clustering algorithms are one of the primary methods for data mining. The clusters formed by density-based clustering are easy to understand, and the approach does not limit itself to particular cluster shapes. Existing density-based algorithms have trouble because they cannot find all meaningful clusters when the density varies widely. VDBSCAN is introduced to compensate...
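The original algorithm described in the abstract above (a predefined number of clusters, randomly selected initial centers, iterative improvement) can be sketched as below. This is only the textbook procedure under assumed names (`kmeans`, `max_iterations`, `seed`) and Euclidean distance; it does not implement the improvements the paper proposes, which the truncated abstract does not show.

```python
import math
import random

def kmeans(points, k, max_iterations=100, seed=0):
    """Textbook k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Random selection of k distinct data points as initial centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centers, labels
```

Both limitations named in the abstract are visible in the signature: `k` must be supplied by the caller, and the result depends on the random initial centers chosen through `seed`.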
To enable effective access to databases on the Web, it is critical to integrate large-scale deep Web sources. Therefore, schema matching is a basic problem in many database application domains, such as data integration, E-business, data warehousing, and semantic query processing. Current implementations of schema matching still have significant limitations. In addition, there are some problems...
The most important factor in the growth of an educational institution is the quality of services rendered (i.e., faculty profile, student performance, and infrastructure requirements). The highest level of quality in an educational institution can be achieved by providing managerial decision makers with valuable implicit knowledge that is currently unknown or hidden to them. The knowledge hidden...
A comprehensive survey on patch recognition, which is a crucial part of content-based image retrieval (CBIR), is presented. CBIR can be viewed as a methodology in which three correlated modules including patch sampling, characterizing, and recognizing are employed. This paper aims to evaluate meaningful models for one of the most challenging problems in image understanding, specifically, for the effective...
In practice, more and more data changes every minute and is collected in incremental mode, and incremental clustering has attracted much attention from researchers. However, little research currently focuses on partitioning categorical data in incremental mode. How to design incremental clustering for categorical data is an urgent problem. We propose an incremental clustering for categorical data using...
Rapid advances in wireless devices and positioning technologies have boosted various kinds of location-based services. Location-based push service plays an important role in the mobile environment as it provides useful information to mobile users according to their current location, but the large number of users and their continuously changing locations place a heavy load on the system. This paper first proposes a...
Gastric impedance spectroscopy has been proposed as a method of monitoring mucosal injury due to hypoperfusion and ischemia in the critically ill. During validation tests for this procedure, it was found that 60% of the measurements had errors caused by factors inherent to the clinical setting, indicating that some kind of automatic error detection should be incorporated to potentially avoid the loss of...
A method that improves the feature selection stage for non-supervised analysis of Holter ECG signals is presented. The method follows a WPCA approach developed in two main stages. The first is the weighting of the feature set through a weight vector based on the M-inner product as a distance measure and a quadratic optimization function. The second is the linear projection of the weighted data using principal...
Multimedia search engines are often based on multiple decentralized search services, multiple information sources (text search, audio search, visual search, semantic search engines, etc.), and multiple data representations and similarity measures. Heterogeneous multiple search results need to be combined and structured efficiently and generically. In this paper, we propose a new multiple search results...
In this paper, we address dynamic clustering in high dimensional data or feature spaces as an optimization problem where multi-dimensional particle swarm optimization (MD PSO) is used to find out the true number of clusters, while fractional global best formation (FGBF) is applied to avoid local optima. Based on these techniques we then present a novel and personalized long-term ECG classification...