The explosive growth of insurance data places higher demands on data processing in the Internet of Things. This paper adopts a distributed database as the underlying data storage, combines it with array-based Apriori techniques from data mining for application and querying, and proposes an array-based Apriori optimization algorithm. The array-based data layout first stacks data arrays,...
In data mining research, discovering frequent itemsets is an important issue and a key factor in mining association rules. For large datasets, a low support value generates a huge number of frequent patterns, which is a major challenge in frequent pattern mining tasks. A maximal frequent pattern mining task helps to resolve this problem since a maximal frequent pattern contains information...
In Numeric Association Rule Mining (NARM), it is sometimes important to understand the association between a numeric attribute and a specific categorical consequent class attribute. NARM divides the domain of numeric attributes into sub-domains without considering a particular categorical consequent class attribute. Apart from this, it may also suffer from support-confidence conflict...
Owing to developments in database technology, everyday operations now produce large volumes of data, which has made it necessary to represent the data in high-dimensional datasets. Discovering frequent determinant patterns and association rules from these high-dimensional datasets has become very tedious since these databases contain a large number of different attributes...
Age estimation is a complex multiclass classification or regression problem. To address the common problem of unevenly distributed age databases, this paper presents a hierarchical age estimation system comprising age group estimation and specific age estimation. In our system, two novel classifiers, Sequence K-Nearest Neighbor (SKNN) and Ranking-KNN, are introduced to predict age group and age value respectively. Notably,...
The association rule mining algorithm Apriori needs to scan the transaction database repeatedly, which incurs a heavy I/O load. It may also generate huge candidate sets, so its time and space complexity is relatively high. To address these limitations, an association rule mining algorithm based on a matching array is proposed. The algorithm only needs to scan the database once, screens out...
Associative classification algorithms differ mainly in how they mine frequent itemsets, analyze the resulting rules, and use them for classification. This paper presents CARPT, an associative classification algorithm based on a trie, which removes frequent items that cannot generate frequent rules directly by adding the count of class labels. We also compress the storage of the database...
Mining frequent patterns from spatial database systems has always been a challenge for researchers. The performance of SQL-based spatial data mining is known to fall behind that of specialized implementations because of the prohibitive cost of extracting knowledge and the lack of suitable declarative query language support. In this paper, we propose an enhancement of existing...
Mining spatial association rules is one of the most important branches of Spatial Data Mining (SDM). Because of the complexity of spatial data, a traditional method of extracting spatial association rules is to transform the spatial database into a general transaction database. The Apriori algorithm is currently one of the most commonly used methods for mining association rules. But a shortcoming...
Mining frequent patterns is a basic problem in data mining applications, and the algorithms used to generate frequent patterns must perform efficiently. Our objective is to propose a new algorithm that generates maximal frequent patterns in less time. The proposed algorithm is based on an array technique and combines a vertical tidset representation of the database with effective...
During software development, design rules and contracts in the source code are often encoded through regularities, such as API usage protocols, coding idioms and naming conventions. The structural regularities that govern a program can aid in comprehension and maintenance of the application, but are often implicit or undocumented. Tool support for extracting these regularities from the source code...
Association rules are a very valuable kind of pattern in data mining. Traditional association rules seldom capture the time period in which a rule holds, so a number of useful implicit rules are lost. Building on a study of other association rule mining algorithms, this paper develops an Apriori-extended algorithm for mining periodic temporal association rules (MPTAR) according to the special periodicity of...
To improve the efficiency of mining multi-dimensional association rules in relational databases, this paper analyzes the Apriori algorithm and the BUC algorithm in practice. It then presents DGP, an improved Apriori algorithm based on multi-dimensional association rules, which is more efficient and is intended for use in relational databases. Finally, it was applied for analyzing...
We have previously proposed the high utility pattern (HUP) tree for utility mining. In this paper, we address the problem of maintaining the HUP tree in dynamic databases and propose a HUP maintenance algorithm for efficiently handling new transactions. The proposed algorithm can reduce the cost of reconstructing the HUP tree when new transactions are inserted. Experimental results...
The first step of association rule discovery consumes substantial system resources, so it determines the overall performance of the process. Existing algorithms therefore focus mainly on the efficiency of this step. DHP and PHP are among these algorithms; both are variants of the Apriori algorithm intended to improve its effectiveness...
Efficient algorithms for mining frequent itemsets are crucial for mining association rules as well as for many other data mining tasks. In this paper, we integrate the merits of the matrix algorithm and the Index-BitTableFI algorithm to design an efficient algorithm for mining frequent itemsets. The new algorithm can directly generate some frequent itemsets that are not generated in the...
Several algorithms have already been developed for association rule mining. In the Apriori algorithm, efficiency decreases as the number of candidate sets increases. To overcome this, the MPIP algorithm uses a perfect hash function in its initial stages. Here, we propose perfect hash functions for 2-itemsets and 3-itemsets. The function depends on the number...
The genetic algorithm is an important algorithm for association rule mining. However, it is prone to premature convergence and to becoming trapped in local optima, or it converges slowly and spends a large amount of time searching. To resolve these issues, the paper improves the algorithm by adopting an adaptive mutation rate and improving the methods...
Mining frequent closed patterns plays an important role in mining association rules in microarray data. The bottom-up search strategy for mining frequent closed patterns cannot make full use of the minimum support threshold to prune the search space, which results in long runtimes and high memory overhead. The TP+close algorithm, based on a top-down search strategy, addressed the problem. However, it determined a frequent...
This paper introduces an algorithm for mining the top-n frequent closed itemsets of length no less than min_l, where n is the desired number of frequent closed itemsets to be mined and min_l is the minimal length of each itemset. An efficient algorithm, called TFP, is developed for mining such itemsets without min_support. Starting at min_support=0 and by making use of the length constraint and the...
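Several of the abstracts listed above extend or modify the Apriori algorithm. As a point of reference, the following is a minimal Python sketch of the textbook level-wise Apriori procedure (join, prune, count), showing the repeated database scans and candidate generation that those papers aim to reduce. It is not the method of any listed paper; the function name `apriori`, the absolute-count `min_support` parameter, and the sample baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items with one scan of the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one more full scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


# Hypothetical sample data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

With these sample baskets and `min_support=3`, the four single items and the pair `{bread, milk}` are frequent. Each pass of the `while` loop rescans every transaction, which is the I/O cost that the single-scan and array-based variants described above try to avoid.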