Researchers in higher education are beginning to explore the potential of data mining for analyzing data in order to provide quality service and meet the needs of their graduates. Thus, educational data mining emerges as one of the tools for studying academic data to identify patterns and support decision making that affects education. This paper predicts the employability of IT graduates using nine variables...
Sexual assault and interpersonal violence affect university communities in numbers disproportionate to those of the general population. It is estimated that one in five women will be the victim of a sexual assault during their college years. In this study, we use kernel density estimation, logistic regression and random forest modeling to conduct spatial and temporal analysis of sexual assault at...
In big data universities, an understanding of how individual learning styles and preferences interact with the instructional medium presented is needed. In this study we examined the VARK learning style inventory using the variable-centered, person-centered and social approaches. We worked on a big data set which encompasses two data sources: the first was an LMS, while the second was social media...
This paper proposes a novel approach to select features that are jointly predictive of survival times and classification within subgroups. Both tasks are common but generally tackled independently in clinical data analysis. Here we propose an embedded feature selection to select common markers, i.e. genes, for both tasks seen as a multi-objective optimization. The Coxlogit model relies on a Cox proportional...
Prognostic modeling is central to medicine, as it is often used to predict patients' outcome and response to treatments and to identify important medical risk factors. Logistic regression is one of the most used approaches for clinical prediction modeling. Traumatic brain injury (TBI) is an important public health issue and a leading cause of death and disability worldwide. In this study, we adapt...
The objective of this paper is to describe a method of crop pattern change allocation and to simulate the crop pattern in Heilongjiang province using the crop pattern simulator (CROPS) model. In this study, based on interpreted remote sensing data and crop pattern statistical data, the CROPS model simulates a long time series of crop spatial patterns. Firstly, crop pattern and driving factor analysis...
The aim of this study was to investigate the use of a logistic model to determine the growth properties of Eichhornia crassipes (EC), in particular the area of EC. The area of EC was measured using ALOS satellite images and GPS. The measurements were shown to lead to an accurate area-related logistic model of EC. This indicates that there is a possibility of obtaining the growth area of EC from a...
Humanitarian aid efforts in response to natural and man-made disasters often involve complicated logistical challenges. Problems such as communication failures, damaged infrastructure, violence, looting, and corrupt officials are examples of obstacles that aid organizations face. The inability to plan relief operations during disaster situations leads to greater human suffering and wasted resources...
Supervised learning is a commonly used tool for link prediction in social networks, where data imbalance is a major challenge because only a small portion of nodes may have social connections. In this paper, we propose a combined k-nearest neighbor sampling and random sampling approach to address the data imbalance issue in social link prediction. In our solution, we use two sampling approaches...
Credit is becoming one of the most important sources of income in banking. Past studies indicate that Logistic Regression and Neural Network models have performed better for credit risk scoring. The purpose of this paper is to conduct a comparative study on the accuracy of classification models and reduce credit risk. In this paper, we use enterprise data mining software to construct four classification...
Cross-project defect prediction is very appealing because (i) it allows predicting defects in projects for which the availability of data is limited, and (ii) it allows producing generalizable prediction models. However, existing research suggests that cross-project prediction is particularly challenging and, due to heterogeneity of projects, prediction accuracy is not always very good. This paper...
The subjects of the study are listed petrochemical companies in China. We regard ST status as a symbol of financial crisis for an enterprise. A t-test and a relevant linear test are applied to determine the model variables, and logistic regression is used to build the forecasting model of financial crisis; the data of ST enterprise samples and non-ST enterprise samples are then used for analysis. With the forecasting...
With the fast growth in outbound tourism visitors in the Asia Pacific region in recent years, how to provide better-quality services to these travel consumers is a crucial problem for the tourism industry. A new global trend is that more and more people rely on innovative e-Tourism IT solutions to search travel information or make reservations through search engines and travel agents...
Many problems in machine learning involve variable-size structured data, such as sets, sequences, trees, and graphs. Generative (i.e. model based) kernels are well suited for handling structured data since they are able to capture their underlying structure by allowing the inclusion of prior information via specification of the source models. In this paper we focus on marginalisation kernels for variable...
A combined forecasting model is proposed to evaluate residential loans, improving on the accuracy of a single evaluation model. Firstly, the Relevance Vector Machine (RVM) model and the logistic regression model are each trained on the financial data. Then the weighted average rule is used to fuse these two models based on a weight training procedure. Finally, the combined model is employed...
With developments in information technology and improvements in communication channels, fraud is spreading all over the world, resulting in huge financial losses. Though fraud prevention mechanisms such as CHIP&PIN have been developed, these mechanisms do not prevent the most common fraud types, such as fraudulent credit card use over virtual POS terminals through the Internet or mail orders. As a...
Statistics plays an important role in many areas, especially in classification tasks. The Logistic Regression Model is one popular technique for solving problems, in particular medical problems. β-Thalassemia, a common genetic disorder, is an interesting case for using MLR to classify its types. There are several types of Thalassemia in the world, especially in Thailand. From many methods...
We propose applying Fuzzy ART in the personal credit area to address the problem that discrete variables and continuous variables cannot be handled together, and compare its outputs with the results of logistic regression, linear programming and the BP neural network model. The empirical results indicate that the Fuzzy ART model has a lower type II error and is better suited to commercial banks.
Ensemble learning aims to improve generalization ability by using multiple base learners. It is well-known that to construct a good ensemble, the base learners should be accurate as well as diverse. In this paper, unlabeled data is exploited to facilitate ensemble learning by helping augment the diversity among the base learners. Specifically, a semi-supervised ensemble method named UDEED is proposed...
The traditional DEA model does not consider the importance of input and output indicators when analyzing the relative effectiveness of each DMU. In fact, the importance of the various indicators differs greatly for the demand side of logistics services. This paper chooses the various input and output indicators by AHP, and especially considers the difference of indicators for demand-side...