The emergence and development of the Internet resulted in the generation of huge amounts of data, which are often distributed among different sites. Many organizations and companies attempted to mine the data with cloud computing. However, given the rise of various privacy issues, sensitive data (e.g., medical records) need to be encrypted before outsourcing to the cloud. To process data mining, such...
Machine learning and data mining techniques have been widely used in recent years to improve network intrusion detection. These techniques make it possible to automate anomaly detection in network traffic. One of the major problems that researchers face is the lack of published data available for research purposes. The KDD'99 dataset was used by researchers for over a decade even though...
Recent developments in data mining and machine learning have helped to solve many issues in prediction and recommendation. In this project, we run a comprehensive study on individual behavior patterns from call detail records (CDR) data to predict tourists' future stops. Multiple classification algorithms are employed, including Decision Tree, Random Forest, Neural Network, Naïve Bayes and SVM. In...
Growing interest in poker research among Artificial Intelligence scientists has created the need to cope with the game's imperfect information and stochastic nature, and with the challenges these pose. In this work we intend to create data models that capture the plays of a real player, supporting the decision to play in the pre-flop phase, within the...
Customer reviews are important for improving a company's service, and they contain both closed opinions and open opinions. An open opinion is a free-text comment that expresses emotion and feedback directly from the customer. However, a company has many content items or groups to evaluate by rating and total rating for each type of service, and many customers review them. The problem is...
Cancer is the second leading cause of death in Palestine, accounting for 12.4% of all deaths. Predicting the survivability of a disease is one of the most interesting purposes of developing medical data mining applications. This paper applies two classification models (Rule Induction and Random Forest) to the Gaza Strip 2011 cancer patients' dataset to predict the survivability of cancer patients. The...
Opinion mining is an interesting area of research that summarizes customer reviews of a product or service and expresses whether the opinions are positive or negative. Various methods have been proposed as classifiers for opinion mining, such as Naïve Bayesian and Support Vector Machine classifiers; these methods classify an opinion without giving the reasons why the instance is classified to...
Information systems are widespread in most official institutions and have become established in all areas of our life, such as education, health and entertainment. Usability is one of the most important factors that encourage users to adopt these systems or reject them. Data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. In this...
Algorithms used in data mining are of great importance in health care, especially for finding previously undiscovered patterns or models in databases. Leukemia affects the state of the blood and can be detected with a Complete Blood Count (CBC) test. This study aims to predict the presence of leukemia by determining the relationships of blood properties...
In healthcare systems, a huge amount of medical data is collected from medical tests conducted in many domains. Much research has been done to generate knowledge from medical data by using data mining techniques. However, there is still a need to extract hidden information from the medical data, which can help in detecting diseases at an early stage or even before they occur. In this study, we apply...
Anomaly detection is the process of finding outlying records in a given data set. The aim of this paper is to study a well-known anomaly detection technique on the “Short Message Service Centre” server, used in the telecommunications field to handle and store messages. This server was studied in detail, and a script was written to gather all the required data, which went through a cleaning phase and...
Credit risk is the risk to the lender that the borrower will not be able to repay their debt, including interest. Numerous studies have been conducted in the area of credit risk, using both classical models such as the Altman Z-score and machine learning methodology. However, research using data from Croatian financial institutions is scarce, especially research focused on...
Hypertension associated with pregnancy contributes significantly to maternal and fetal deaths during pregnancy and childbirth. Due to its high incidence rate and several complications, this disorder has been the subject of numerous investigations attempting to determine how to prevent it and improve its treatment. In this context, this paper uses a data...
To address the modeling problem of the microbial fermentation process, and taking the glutamic acid fermentation process as the research object, decision tree and random forest models were established using data mining methods, and the models were evaluated and used for prediction in the R language. The decision tree model performed well, indicating that the decision tree package of the R language has a certain flexibility,...
Oriented graphs belong to Graph Theory, a branch of Combinatorics within Mathematics. One of the fundamental terms here is a tree. Tree structures are widely used, and not only in Mathematics. They can also be used in Decision Theory as data mining tools. In the present paper we point out the use of decision trees as models for financial services, namely, in credit scoring, fraud and churn...
As the software development community makes it easier to contribute to open source projects, the number of commits and pull requests keeps increasing. However, this exciting growth renders it more difficult to accept only quality contributions. Recent research has found that both technical and social factors predict the success of project contributions on GitHub. We take this question a step further,...
This paper discusses the application and benefits of data mining techniques for constructing prediction models in the field of corporate bankruptcy. It analyzes a dataset of 120 companies using different data mining techniques. Findings show that a neural network is recommended as the best model to predict corporate bankruptcy. Findings also show that the proper use and selection of data mining techniques...
This paper aims to explore the potential application of advanced data mining (DM) techniques for effective utilization of big building operational data. Case studies of mining the operational data of an institutional building for cooling load prediction and operation performance improvement are presented. Deep learning-based prediction techniques, decision tree and association rule mining are adopted to analyze...
The Titanic disaster occurred 100 years ago, on April 15, 1912, killing about 1500 passengers and crew members. The fateful incident still compels researchers and analysts to understand what may have led to the survival of some passengers and the demise of others. With the use of machine learning methods and a dataset consisting of 891 rows in the train set and 418 rows in the test set, the research...
Machine learning and data mining techniques are rapidly establishing themselves in the medical and health care fields. This paper addresses a similar issue, where the fitness of an individual can be predicted by analyzing a few attributes associated with that individual. A hybrid classifier algorithm is developed by merging the Decision Tree and Naïve Bayes algorithms, which will classify the Fitness data set...