Clustering is the unsupervised classification of patterns into groups. A clustering algorithm partitions a data set into several groups such that similarity within a group is larger than among groups. The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. There...
Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. Identification of co-expressed genes and coherent patterns is the central goal in microarray or gene expression data analysis and is an important task in bioinformatics research. In this work the unsupervised gene selection methods and CCIA with K-Means algorithms...
Fast retrieval of relevant information from databases has always been a significant issue. Different techniques have been developed for this purpose; one of them is data mining. Clustering analysis is a key and easy tool in data mining and pattern recognition. In this paper K-Means clustering is used for evaluating the performance of socially and economically backward groups of people, self help...
In order to improve terminal arrival routes, increase terminal area capacity, and enhance the security level of air traffic management and air traffic controllers' operating efficiency, actual trajectory data were clustered using the k-means algorithm. Trajectory data are assigned to several different clusters. Mean arrival routes can be obtained from trajectory clustering and there are deviations between mean...
Real world applications are increasingly growing in the field of science and engineering, where data mining is an important stage to relate research and applications. Data objects are clustered based on the similarity using unsupervised learning techniques. The incomplete, noisy and inconsistent data may slow down the knowledge discovery in database process. Data preprocessing techniques improve the...
Patterns and classification of stock or inventory data are very important for decision making and business support. In this paper we propose an algorithm for mining patterns in huge stock data to predict factors affecting the sale of products. Identification of sales patterns from inventory data indicates the market trends, which can further be used for forecasting, decision making and strategic planning...
Traditional machine learning methods for intrusion detection can only detect known attacks since these methods classify data based on what they have learned. New attacks are unknown and are difficult to detect because they have not learned. In this paper, we present an improved k-means clustering-based intrusion detection method, which trains on unlabeled data in order to detect new attacks. The result...
In this paper, we propose a new data mining algorithm for surveillance video of stationary places. The algorithm combines Background Subtraction with Symmetrical Differencing in order to extract moving targets. According to the amount of motion occurring in the video frames, we divide the video into different segments. Video segments are clustered via the improved K-Means algorithm. Then...
State of the art research in data mining is focusing on loosely distributed regionalized large scale databases using cloud computing for business applications. Cloud computing poses a diversity of challenges in data mining operation arising out of the dynamic structure of data distribution as against the use of typical database scenarios in conventional architecture. Realization of maximum efficiency...
The affinity propagation clustering algorithm is of broad value in science and engineering because it does not need the number of clusters as input in advance, and because of its robustness and good generalization. But the algorithm needs the initial similarity (the distance between any two points) as a parameter, so a lot of time and storage space is required to calculate the similarity. It is limited in application to cluster...
This paper proposes a new clustering technique based on elements of rough set theory (RST), for an information system which contains only input information (condition attributes) but without decision (class attribute). The proposed algorithm is unified in its approach to clustering and makes use of both local and global data properties to obtain clustering solutions. The results from some data sets...
Data mining can efficiently deal with large amounts of historical and current data, and can find potential, useful and valuable information in the database for retail stores. The paper takes a large retail supermarket as its study object, uses data mining methods to segment retail enterprise customers, and then applies association rules to different groups of customers to get rules about customer...
This paper presents a data clustering approach using a modified K-Means algorithm that reduces the sensitivity to the initial centers (seed points) of clusters. This algorithm partitions the whole space into different segments and calculates the frequency of data points in each segment. The segment which shows the maximum frequency of data points will have the maximum probability to contain the centroid...
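The segment-frequency idea in the abstract above can be sketched roughly as follows. The abstract is truncated, so the details here are assumptions rather than the authors' method: the segments are taken to be equal-width grid cells over the bounding box, the `k` most populated cells are chosen, and each seed is the mean of the points in its cell. The name `density_seeds` and the `bins` parameter are hypothetical.

```python
from collections import defaultdict


def density_seeds(points, k, bins=4):
    """Pick up to k seed points from the most populated grid cells.

    Assumption: "segments" are equal-width cells, `bins` per dimension,
    covering the bounding box of the data.
    """
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]

    def cell(p):
        # Map a point to the index tuple of the grid cell that contains it.
        idx = []
        for d in range(dims):
            span = hi[d] - lo[d]
            i = int((p[d] - lo[d]) / span * bins) if span > 0 else 0
            idx.append(min(i, bins - 1))  # keep the upper boundary in the last cell
        return tuple(idx)

    # Group points by cell; the group size is the cell's frequency.
    cells = defaultdict(list)
    for p in points:
        cells[cell(p)].append(p)

    # Take the k densest cells and use the mean of each cell's points as a seed.
    densest = sorted(cells.values(), key=len, reverse=True)[:k]
    return [tuple(sum(x) / len(m) for x in zip(*m)) for m in densest]
```

If fewer than `k` cells are occupied, this sketch returns fewer than `k` seeds; a real implementation would need a fallback, for example a finer grid.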
Clustering is the process of grouping a set of objects into classes. The clustering problem has been addressed by researchers in many contexts and disciplines. First, a process model for data mining and the typical requirements of clustering methods have been described. Second, the k-means algorithm and its advantages and disadvantages are introduced. Then the Iris dataset is used to specify the k-means...
This paper analyses users' group interests by mining their internet browsing history. It counts the visiting information for the interest categories, the visiting time and the number of users in order to derive regularities. Then, it puts forward an improved HAC (hierarchical agglomerative clustering) and k-means algorithm to cluster the users by their interests, to mine the users' access mode...
Recommendation systems have been investigated and implemented in many aspects. Particularly, in the case of collaborative filtering systems, a more important issue is how to manipulate the personalized recommendation results for better user understandability and satisfaction. A collaborative filtering system predicts items of interest for users based on predictive relationships discovered between the item and...
Data mining has been defined as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data". Clustering is the automated search for groups of related observations in a data set. The K-Means method is one of the most commonly used clustering techniques for a variety of applications. This paper proposes a method for making the K-Means algorithm...
Data mining has become an important topic in effective analysis of gene expression data due to its wide application in the biomedical industry. Within a gene expression matrix there are usually several particular macroscopic phenotypes of samples. Selection of genes most relevant and informative for certain phenotypes is an important aspect in gene expression analysis. Currently most of the research...
Conventional k-means only considers pairwise similarity during cluster assignment, which aims to minimize the distance of points to their nearest cluster centroids. In high-dimensional spaces such as document datasets, however, two points may be nearest neighbors without belonging to the same class. Thus pairwise similarity alone is often insufficient for class prediction in such spaces. To that end,...
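For reference, the conventional k-means procedure that this abstract contrasts itself with can be sketched as below. This is a minimal illustration of the standard algorithm (Lloyd's iteration with Euclidean distance), not the method proposed in the paper, whose description is cut off. The function names, the random seeding and the `iters` and `seed` parameters are choices made for this sketch.

```python
import math
import random


def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean)."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; empty clusters stay put."""
    new = []
    for j, c in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new.append(tuple(c))
    return new


def kmeans(points, k, iters=100, seed=0):
    """Alternate assignment and update steps until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        new = update(points, labels, centroids)
        if new == centroids:
            break
        centroids = new
    # Final assignment so the labels match the returned centroids.
    return centroids, assign(points, centroids)
```

The assignment step uses only the distance from each point to each centroid, which is the property the abstract identifies as insufficient in high-dimensional spaces.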
Since ancient rituals were usually held near water, large quantities of precious cultural relics are buried in ancient rivers. The search for ancient rivers is of great significance in archaeological exploration. Traditional methods of searching for ancient rivers rely entirely on manual interpretation, using many consecutive geological cross-section diagrams. The interpretation work needs...