Data mining on big data is a very important field. To improve mining efficiency, a frequent itemset mining algorithm based on MapReduce and FP-tree, namely the MAFIM algorithm, was proposed. First, the data were distributed by MapReduce. Second, local frequent itemsets were computed by FP-tree. Third, the mining results were combined by the center node. Finally, global...
Recently, multi-label classification has gained prime importance among classification problems. The applications of classification problems have increased so rapidly that efficient and accurate classifiers have become a vital requirement in the area of data mining. The multi-label classification problem is distinguished from single-label classification by its capability to handle...
With the advance of mobile electronic devices and the development of positioning technology, a large volume of spatio-temporal data is collected in the form of desultory data streams, which contain a lot of potential information. In this study, we focus on discovering the composition relationships between observed moving objects over a long period. Such research can be widely used in military...
Frequent Itemset Mining (FIM) is one of the classical and well-adopted descriptive approaches in data mining. However, classical FIM algorithms such as Apriori suffer from high I/O overhead, an inability to scale to a large number of items (dimensions), and high computational time requirements. To overcome the I/O overhead, the Partition algorithm has been proposed; however, it does not scale up well for large...
Crowdsourcing data is an essential part of information collection in healthcare. Patient data serves as the foundation for creating healthcare policy, creating new pharmaceuticals, and determining treatment. In this paper, we propose a novel conceptual method of standardizing and classifying the crowdsourcing of healthcare data using modular ontologies, authoritative medical ontologies (AMOs) and...
Data mining uses various algorithms to search for interesting information and hidden patterns in large databases. Traditional frequent itemset mining (FIM) generates a large number of frequent itemsets without considering the quantity and profit of the items purchased. High utility itemset mining (HUIM) gives advantageous results compared to frequent itemset mining. HUIM algorithm helps to improve...
Information technologies have allowed for the rapid growth of both data acquisition and data storage. With this growth comes the challenge of extracting useful information. One piece of information that is interesting to academics and industry is the relationships between items in a large data set. One approach is to find the relationships between items by calculating how frequently the items appear...
In this paper, we propose an algorithm for mining average-utility itemsets (EHAUI-Tree) that improves on the HUUI-Tree algorithm so that new database transactions can be added without restarting the mining process. First, the value of the updated data is calculated. Then, the itemsets affected by the change are recalculated and updated depending upon the updated data value and the previous High Average-utility Upper-bound (HAUUB)...
With its complex pathogenesis, Chronic Obstructive Pulmonary Disease (COPD) is difficult to treat. Traditional Chinese Medicine (TCM) has shown an obvious effect in treating COPD. However, invaluable TCM experience lacks systematic summarization and study. Association rules are used to discover the relationships among data items in a large amount of data. Because of clear and useful results, association rule...
Erasable-itemset (EI) mining finds the itemsets that can be eliminated without greatly affecting the factory's profit. In this paper, an incremental mining algorithm for erasable itemsets is proposed. It is based on the concept of the fast-update (FUP) approach, which was originally designed for association mining. Experimental results show that the proposed algorithm executes faster than the...
Higher Education Institutions store a sizable amount of data, including student records and the structure of a degree curriculum. This paper focuses on the problem of identifying how closely students follow the recommended order of the courses in a degree curriculum, and to what extent their performance is affected by the order they actually adopt. It addresses this problem by applying techniques...
Mining inter-transaction patterns (ITPs) from large databases is a common data mining task, which discovers patterns across several transactions in a transaction database. Although several algorithms have been proposed for this task, they remain computationally expensive. To resolve this issue, this paper presents an efficient method called DITP-Miner to mine ITPs. In our proposed algorithm,...
Whereas the purest strategic games such as Go and Chess seem timeless, the lifetime of a video game is short, influenced by popular culture, trends, boredom, and technological innovations. Even the large budgets and development efforts allocated by publishers cannot guarantee a timeless success. Instead, novelties and corrections are proposed to extend an inevitably...
Automatically extracting phenotypes (i.e., the composite of one's observable characteristics/traits) from free text such as scientific literature or clinical notes and associating phenotypes with diseases is an important task. Such associations can be used in, for example, recommending candidate genes for diseases, investigating drug targets, or performing differential diagnosis. In this paper, we...
Association rule mining is one of the popular topics in data mining and can be applied in various types of applications. Nowadays, an organization uses multiple software applications to manage its work, and these applications run on several types of platforms. Hence, interoperable software development using web services has become a popular topic. In this paper,...
Frequent itemset discovery has recently become popular in the database community. Because real data is often affected by noise, in this paper we study how to find frequent itemsets over probabilistic databases under the Possible World Semantics. This is challenging because a probabilistic database may have an exponential number of possible worlds. Although several efficient algorithms are proposed in the literature,...
Fuzzy association rules are one of the most important data mining techniques. They allow the discovery of useful and meaningful information that helps in decision-making. Many algorithms have been proposed to extract fuzzy association rules. A major drawback of these algorithms is their high run-time for extracting fuzzy association rules. To overcome this problem, we introduce in this paper a...
The Eclat algorithm is one of the most widely used frequent itemset mining methods. One significant bottleneck of the Eclat algorithm is that calculating the intersection of itemsets is inefficient, especially when the itemsets occur in a large number of transactions. In this work, for the purpose of efficiency improvement and resource saving, we propose an approximate variation of the Eclat algorithm...
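To make the bottleneck named in the abstract above concrete, here is a minimal sketch of the standard (exact) Eclat scheme, not the approximate variation the paper proposes. It assumes a vertical layout in which each item maps to the set of transaction ids that contain it, and it treats `min_support` as an absolute count. The function names `vertical_layout`, `_extend`, and `eclat` are illustrative only. The set intersection `tids & other_tids` is the step whose cost grows with the number of transactions.

```python
def vertical_layout(transactions):
    """Map each item to its tidset: the ids of the transactions containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def _extend(prefix, candidates, min_support, found):
    """Depth-first extension of `prefix` with each remaining candidate item."""
    while candidates:
        item, tids = candidates.pop()
        itemset = prefix | {item}
        found[itemset] = len(tids)  # support = size of the tidset
        extensions = []
        for other, other_tids in candidates:
            shared = tids & other_tids  # the costly tidset intersection
            if len(shared) >= min_support:
                extensions.append((other, shared))
        _extend(itemset, extensions, min_support, found)


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    tidsets = vertical_layout(transactions)
    candidates = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    found = {}
    _extend(frozenset(), candidates, min_support, found)
    return found


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(eclat(toy, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), support)
```

Each intersection touches every transaction id in the smaller tidset, so dense items make this step the dominant cost.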
Advanced pattern mining that extracts hidden but useful information by using a proper structure is vitally important for efficient information mining in large-scale practical datasets. Existing algorithms have not been capable of effectively handling the fuzziness and uncertainty of items or of confirming the appropriate structure of the studied patterns. In order to generate more proper practical patterns, a...
This paper gives a basic understanding of how to discriminate between two datasets with Emerging Patterns (EPs), using the well-known weather dataset. It does not use existing data mining techniques such as border-based algorithms; it only gives a systematic basic understanding of how to discriminate between two datasets by computing support, growth rate, and confidence scores....
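As a small illustration of two of the scores mentioned in the abstract above, the sketch below computes the support of an itemset in each of two datasets and its growth rate from the first to the second. It uses the usual emerging-pattern convention that the growth rate is 0 when both supports are 0 and infinite when only the first is 0. The toy rows are hypothetical and are not the paper's weather data. The helper names are illustrative, and confidence is left out because the abstract does not define it.

```python
import math


def support(itemset, dataset):
    """Fraction of rows in `dataset` that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    wanted = set(itemset)
    return sum(1 for row in dataset if wanted <= set(row)) / len(dataset)


def growth_rate(itemset, d1, d2):
    """Growth rate of `itemset` from dataset d1 to dataset d2."""
    s1, s2 = support(itemset, d1), support(itemset, d2)
    if s1 == 0:
        # Convention: 0 if absent from both datasets, infinite if only in d2.
        return 0.0 if s2 == 0 else math.inf
    return s2 / s1


if __name__ == "__main__":
    # Hypothetical toy rows, not taken from the paper.
    d1 = [{"sunny", "hot"}, {"sunny", "mild"}, {"rainy", "mild"}, {"rainy", "cool"}]
    d2 = [{"sunny", "hot"}, {"sunny", "hot"}, {"sunny", "mild"}, {"overcast", "hot"}]
    for pattern in ({"sunny"}, {"sunny", "hot"}, {"overcast"}):
        print(sorted(pattern), growth_rate(pattern, d1, d2))
```

An itemset whose growth rate meets a chosen threshold is treated as an emerging pattern from the first dataset to the second.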