We present a comparative study for discriminative anatomy detection in high dimensional neuroimaging data. While most studies solve this problem using mass univariate approaches, recent works show better accuracy and variable selection using a sparse classification model. Two types of image-based regularization methods have been proposed in the literature based on either a Graph Net (GN) model or...
Feature selection is an important task in data mining, which aims to reduce the dimensionality of the data sets while at least maintaining the classification performance. Chicken swarm optimization algorithm (CSO) has been widely applied to feature selection because of its efficiency and effectiveness. However, since feature selection is a challenging task with a complex search space, CSO quickly...
In this paper, we employ a machine learning approach for automatic classification of healthy and pathological gait signals and subsequent identification of the neurological disorder in the pathological gait signals. The machine learning algorithm we propose is the Logit model of the Logistic Regression Classifier. As the process of walking is automatically controlled by the nervous system...
Improvements in energy efficiency and reductions in electricity consumption can be promoted by growing knowledge of the determinants of residential electricity consumption level (RECL). Because the impact factors (IFs) of RECL are numerous, complex and mutually correlated, feature selection is an essential step to ensure the precision and stability of an explanatory model. However, the current...
Keyword extraction is an automated process that collects a set of terms giving an overview of a document. A term is judged by how well it identifies the core information of a particular document. When a huge number of documents must be analyzed to find relevant information, keyword extraction is a key approach. This approach helps us to understand the depth of it even before we read...
The purpose of this study is to clarify the applicability of a data-driven approach in the accounting area. As the first stage, focusing on model comparison, this paper shows the effectiveness of model selection with data mining techniques for the development of an earnings prediction model based on financial statement data. In the accounting area, researchers have not considered the characteristic of financial...
This paper aims to classify normal/abnormal heart sound signals from PhysioNet/CinC challenge 2016. The heart sound signals are segmented into four states, i.e., the first heart sound, systolic interval, the second heart sound, and diastolic interval. Multi-features are extracted from time domain, frequency domain and entropy, which are formed into three features sets. The first features set includes...
Most of the existing works on human activity analysis focus on recognition or early recognition of the activity labels from complete or partial observations. Predicting the labels of future unobserved activities where no frames of the predicted activities have been observed is a challenging problem, with important applications, which has not been explored much. Associated with the future label prediction...
This paper introduces an ensemble model that solves the binary classification problem by incorporating the basic Logistic Regression with the two recent advanced paradigms: extreme gradient boosted decision trees (xgboost) and deep learning. To obtain the best result when integrating sub-models, we introduce a solution to split and select sets of features for the sub-model training. In addition to...
Employee churn prediction, which is closely related to customer churn prediction, is a major issue for companies. Despite the importance of the issue, it has received little attention in the literature. In this study, we applied well-known classification methods, including Decision Tree, Logistic Regression, SVM, KNN, Random Forest, and Naive Bayes, to HR data. Then, we analyze the results...
Previous studies have found that a significant number of bug reports are misclassified between bugs and nonbugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug report classification model with N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to extract...
The amount of data is increasing rapidly in today's era. Data can be in the form of text, pictures, voice, and video. Social media is one factor in this increase, as everybody expresses themselves, gives opinions, and even complains on social media. The first step is data collection using the Twitter API with each candidate's name in the Jakarta Governor Election. The collected data then become input for the preprocessing step...
The increasing availability of relevant information, events and constraints in the environment of modern factories, due to the deployment of IoT sensor technologies on the production line, has led to an “explosion” in contextual big data. At the same time, advancements in the machine learning field in recent years have opened new approaches for the analysis of manufacturing process datasets...
The interpretability of prediction mechanisms with respect to the underlying prediction problem is often unclear. While several studies have focused on developing prediction models with meaningful parameters, the causal relationships between the predictors and the actual prediction have not been considered. Here, we connect the underlying causal structure of a data generation process and the causal...
Fraudulent activities in financial institutions can break the economic system of a country. These activities can be identified using clustering and classification algorithms. The effectiveness of these algorithms depends on the quality of the input data. Moreover, financial data comes from various sources and in various forms, such as financial statements, stakeholder activities and others. This data from various sources...
The high dropout rate of MOOCs is criticized even as a dramatically increasing number of learners are attracted to these online learning platforms. Various works have addressed the analysis and prediction of dropout, and machine learning techniques are widely applied in this field. However, a single classifier may not always perform reliably for predictions. In this work, we study dropout prediction for MOOC....
In this work we compare different classification algorithms applied to different numbers of features (linear predictive coding coefficients) in order to detect audio signals from wildlife areas. The final goal is to find the appropriate number of linear predictive coding coefficients to provide the desired accuracy for a certain framework. The experimental results prove that the best classifier is...
Hyperspectral image (HSI) is usually composed of hundreds of bands which contain very rich spatial and spectral information. However, the high-dimensional data may lead to the curse of dimensionality phenomenon when it is used for land use classification or other applications, making it difficult to be utilized effectively. In this paper, we developed a deep learning classification framework based...
Chatter is a kind of unstable vibration in metal removal processes, which causes severe damage to the workpiece and machine tool. To detect cutting chatter at an early stage, we propose a new chatter identification method based on status classification with logistic regression (LR) in this paper. The classification is based on the feature vector that is composed of the autocorrelation and improved...
In recent years, due to growing interest in automated driving, the need to better understand human driving behavior, and particularly lane changing and car following behavior, has further increased. Despite its great importance, lane changing has not been studied as extensively as longitudinal behavior and remains one of the most challenging driving behavior maneuvers to understand and...