The purpose of this study is to clarify the applicability of a data-driven approach in the accounting area. As a first stage, focusing on model comparison, this paper shows the effectiveness of model selection with data mining techniques for developing an earnings prediction model based on financial statement data. In the accounting area, researchers have not considered the characteristic of financial...
For wine quality identification from big data, data mining classification algorithms are introduced to identify the wine grade effectively from the content of several impact compounds. Three classification algorithms are introduced: logistic regression, the BP neural network, and SVM. Using these three algorithms, a modeling analysis of wine quality...
This paper presents a novel approach for semi-supervised classification of remote sensing imagery using a {K-Means+(GMM-EM)} clustering cascade, followed by selection of a number of clustered pixels to be added to the training set according to their GMM responsibilities. The proposed method has the following steps: (a) clustering of the multispectral pixels using the cascade composed of K-means...
With the growing interest in poker research among Artificial Intelligence scientists, the need has arisen to overcome the imperfect information of poker, a stochastic game, given the challenges this poses. In this work we intend to create data models that allow us to know the plays of a real player, supporting the decision to play in the pre-flop phase, within the...
The back-end database is pivotal to storing the massive volume of big data generated by Internet exchanges, from cloud-hosted web applications to Internet of Things (IoT) smart devices. Structured Query Language (SQL) Injection Attack (SQLIA) remains an intruder's exploit of choice on vulnerable web applications to pilfer confidential data from the database with potentially damaging consequences....
Traditional data stream classification techniques assume that the stream of data is generated from a single non-stationary process. On the contrary, a recently introduced problem setting, referred to as Multistream Classification, involves two independent non-stationary data generating processes. One of them is the source stream that continuously generates labeled data instances. The other one is the...
Nowadays people are enjoying the world of data because the size and amount of data have increased tremendously, which acts like an invitation to Big Data. But some classifier techniques, such as the Support Vector Machine (SVM), are not able to handle huge amounts of data due to their excessive memory requirements and unreasonable algorithmic complexity, though SVM is one of the most popularly used classifier...
To analyze a news article dataset, we first extract important information such as the title, date, and paragraphs of the body. At the same time, we remove unnecessary information such as images, captions, footers, advertisements, navigation, and recommended news. The problem is that the formats of news articles change over time and also vary by news source and even section...
Efficient and accurate data mining has become vital as technology advancements in data collection and storage soar. Researchers have proposed various valuable machine learning algorithms for data mining. However, not many have utilized formal methods. This paper proposes a data mining approach using Probabilistic Context Free Grammars (PCFGs). In this work we have employed PCFGs to mine from large...
The recent computing trend is producing tons of data every minute, and the amount of imbalanced data is quite high as far as real-life data sets are concerned. In practical data mining, an imbalanced data set is prone to mislead a data mining model. However, a data set needs pre-processing before mining. This work focuses on some practical data mining techniques and produces a valid evaluation...
Cloud computing environments are growing in complexity creating more challenges for improved resilience and availability. Cloud computing research can benefit from machine learning and data mining by using data from actual operational cloud systems. One aspect that needs in-depth analysis is the failure characteristics of cloud environments. Failure is the main contributor to reduced resiliency of...
Proportion-SVM has been studied in depth due to its broad application prospects, such as modeling voting behaviors and spam filtering. However, the geometric information has been widely ignored. Thus, current methods are usually sensitive to noise. To address these problems, in this paper, we combine the proportion learning framework with a Laplacian term. We exploit the advantages of the Laplacian term...
This paper presents a salary prediction system using a profile of graduated students as a model. A data mining technique is applied to generate a model to predict a salary for individual students who have similar attributes to the training data. In this work, we also conducted an experiment to compare five data mining techniques including Decision trees, Naive Bayes, K-Nearest neighbor, Support vector...
CNNs (convolutional neural networks) have proven to be efficient deep learning models that can directly extract high-level features from raw data. In this paper, a novel CCS (Cube-CNN-SVM) method is proposed for hyperspectral image classification, which is a spectral-spatial feature based hybrid model of CNN and SVM (support vector machine). Unlike most traditional methods that only...
Today, the use of learning analytics is becoming more crucial in the learning environment for the purpose of understanding and optimizing students' learning situations. The purpose of this paper is to examine the impacts of Teacher Interventions (TIs) on students' attitudes and achievements involved with the lesson by analyzing their freestyle comment data after every lesson. The current study proposes...
When addressing real-world issues, a significant quantity of prior domain knowledge is available, which helps yield different perspectives on various characteristics related to the issue. At the same time, several types of machine learning methods do not depend on such prior explicitly expressed domain information. However, such methods require especially in case of operating learning...
Data streams are rapidly and constantly growing. Analysis of rapidly changing data streams is quite difficult since the amount of data increases over time. Individual patient records provide a vital resource for health research for the benefit of society, such as understanding the association between the human immune system and viruses. As the patient records have been constantly growing, data...
The ν-nonparallel support vector machine (ν-NPSVM) for classification has the advantage of using a parameter ν to control the number of support vectors. However, it ignores the prior structural information in data. In this paper, we propose a novel nonparallel classifier, named ν-Structural Nonparallel Support Vector Machine (ν-SNPSVM), for binary classification. Each model of ν-SNPSVM considers...
Twitter right now gets around 190 million tweets (little content based Web posts) a day, in which individuals share their remarks with respect to an extensive variety of subjects. An expansive number of tweets incorporate sentiments about items and administrations. Notwithstanding, with Twitter being a moderately new wonder, these tweets are underutilized as a hotspot for assessing client supposition and have...