Outlier detection in high-dimensional space is a hot topic in data mining; the main goal is to find a small number of data objects with abnormal behavior in a data set. In this paper, the concepts of the feature vector and attribute similarity are defined, and an improved algorithm, SWHOT, based on a weighted hypergraph model for outlier detection in high-dimensional space is presented. The objects...
Mining frequent closed sequential patterns is one of the hot spots in data mining. To mine the set of frequent closed sequential patterns more efficiently, PBIDE, a variant of the BIDE algorithm based on position expansion, is proposed. The position information, expansion event, and optimization strategy are defined. Firstly, the numbers of different sequence identities are calculated in...
With the development of XML technology, how to use a database to store and query XML documents has become a hot topic. In this paper, a labeling scheme and a storage method for XML documents based on this labeling scheme are proposed. According to the node types, this method decomposes the document tree structure into nodes and stores them in a relational table; it enables us to store...
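The idea of decomposing a document tree into labeled nodes stored in a relational table can be sketched as follows. The abstract is truncated and does not say which labeling scheme the paper uses, so this sketch swaps in a plain interval (start/end) labeling as an assumption; the table name `node`, its columns, and the helper names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def label_nodes(root):
    """Assign (lo, hi, depth) interval labels in one depth-first walk.

    Assumption: interval labeling stands in for the paper's scheme.
    A node d is a descendant of a iff a.lo < d.lo and d.hi < a.hi.
    """
    rows = []
    counter = 0

    def visit(elem, depth):
        nonlocal counter
        counter += 1
        lo = counter
        for child in elem:
            visit(child, depth + 1)
        counter += 1
        rows.append((lo, counter, depth, elem.tag, (elem.text or "").strip()))

    visit(root, 0)
    return rows


def store(xml_text):
    """Parse the document and store one row per element node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (lo INTEGER PRIMARY KEY, hi INTEGER, "
        "depth INTEGER, tag TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
        label_nodes(ET.fromstring(xml_text)),
    )
    return conn


def descendants(conn, ancestor_tag, tag):
    """Return the text of every `tag` element nested under `ancestor_tag`."""
    sql = (
        "SELECT d.content FROM node a JOIN node d "
        "ON d.lo > a.lo AND d.hi < a.hi "
        "WHERE a.tag = ? AND d.tag = ? ORDER BY d.lo"
    )
    return [row[0] for row in conn.execute(sql, (ancestor_tag, tag))]
```

With labels of this kind, an ancestor/descendant query becomes a range comparison in SQL, so no tree traversal is needed at query time.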
In real life, the database changes continually in many applications. Incremental mining techniques have been developed to avoid rescanning the database for knowledge discovery. Recent and compact constraints have also been developed for frequent pattern mining. We store the database in a time-vertical bitmap representation, so the supports of frequent patterns and recent patterns can be computed fast...
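To illustrate why a vertical bitmap makes support counting fast, here is a minimal sketch. It assumes one bit per transaction, ordered by arrival time, and uses Python integers as bitmaps; the paper's actual time-vertical layout is not shown in the truncated abstract, and the names `build_bitmaps`, `support`, and the `recent` parameter are hypothetical.

```python
def build_bitmaps(transactions):
    """Map each item to an integer whose bit t is set if transaction t contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmaps, itemset, n_transactions, recent=None):
    """Count transactions containing every item of `itemset`.

    If `recent` is given, only the last `recent` transactions are counted
    (an assumed reading of "recent pattern" support).
    """
    mask = (1 << n_transactions) - 1
    if recent is not None:
        # Clear the bits of all transactions older than the recent window.
        mask &= ~((1 << max(n_transactions - recent, 0)) - 1)
    for item in itemset:
        mask &= bitmaps.get(item, 0)
    return bin(mask).count("1")
```

The support of an itemset reduces to a bitwise AND followed by a population count, and a new transaction only sets one more bit per item, which fits incremental updates without rescanning older data.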
In real life, many events that occur within an interval may cause uncertainty in event ordering. The inaccurate event has been introduced for sequential pattern mining to improve the accuracy of computing the support threshold. In this paper, we store a sequence in a chain table, so that a sequence with inaccurate events can be expressed conveniently. In addition, precise support is introduced to...
Many studies have been done on mining closed frequent itemsets, one of the most important problems in data stream mining. However, mining closed frequent itemsets in data streams has not been well addressed. In this paper, we design HCI-Mtree (Hash-based Closed Itemsets Monolayer tree) to maintain the complete set of current closed itemsets. In HCI-Mtree, the itemsets with the same frequency are...
Frequent pattern mining is fundamental to many important data mining tasks. Many researchers have presented mining methods for static databases. Because of the special characteristics of data streams, those methods cannot be used in a dynamic environment. We develop a novel method for mining frequent items from a data stream based on the sliding window model. We use some compact data structures which make use of the...
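The sliding window model itself can be sketched in a few lines. This is an exact-counting baseline, not the paper's method: the compact data structures the abstract mentions are cut off by the truncation, so the class name `SlidingWindowCounter` and its parameters are hypothetical.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Exact frequent-item counting over the last `window_size` stream items."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size
        self.min_support = min_support  # fraction of the window, e.g. 0.5
        self.window = deque()
        self.counts = Counter()

    def add(self, item):
        """Append a new item and expire the oldest one if the window is full."""
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def frequent_items(self):
        """Return items whose count reaches min_support * current window length."""
        threshold = self.min_support * len(self.window)
        return {item for item, c in self.counts.items() if c >= threshold}
```

Each arrival costs constant time, but the window itself is kept in memory; compact structures of the kind the abstract refers to would aim to reduce that cost.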
Two novel sampling approaches are proposed to obtain a random sample of the exact streaming window join result. Without assuming any model of stream arrivals, the frequency of join attribute values for various basic periods can be obtained from a frequency balanced binary tree histogram (FATH), which is constructed for each stream. The frequency for the future window can be computed by linear regression...
Although frequent traversal sequence (FTS) mining has been extensively studied over the last decade in web usage mining, it is challenging to extend the mining technique to dynamic web click streams. The main challenge is that existing false-positive methods control memory consumption and output accuracy by a relaxation ratio r (r = e/s, where e is the error parameter and s is the specified minimum support)...
An important application of sequential mining techniques is frequent traversal sequence (FTS) mining. However, Web data grows quickly, some data may become outdated, and previous FTSs may change when the database is updated. FTSs then have to be re-mined from the updated database, but re-mining consumes too much execution time. In this paper, a novel structure, IE-LATTICE (improved extended lattice)...