After analyzing the disadvantages of traditional lazy-learning KNN, which classifies data directly by majority vote among the classes of the K nearest neighbors, a new sigmoid-weighted classification algorithm, WKS (Weighted KNN Based On Sigmoid), is proposed. WKS provides a new method for learning and training, since each training data di ∊ D contributes to the correct classification...
Consider a face image data set from clients of a company and the problem of building a face recognition system from it. Video cameras can be used to acquire several images per client in order to maximize the robustness of the system. However, as the data set grows very large, the accuracy of the system might be seriously compromised, since the number of negative samples for each user increases. We propose...
Compressive sensing (CS) is a new technique for reducing the power consumed by biosensors and data transmission in electrocardiography (ECG) signal monitoring systems. Instead of spending high computational complexity on reconstructing the signal in the data domain before analyzing it, compressed analysis (CA) exploits the data structure preserved by CS to analyze the signal directly in the compressed domain. However, compressively-sensed...
Text segmentation is an important problem in document analysis applications. We address the problem of classifying connected components of a document image as text or non-text. Inspired by previous work in the literature, besides common size- and shape-related features extracted from the components, we also consider component images, with and without context information, as inputs of the...
Online opinions play an important role in helping consumers make decisions about purchasing products or services. In addition, customer reviews allow companies to understand the strengths and limitations of their products and services, which aids in improving their marketing campaigns. Such valuable information can only be obtained via appropriate analysis of the opinions provided by customers...
Because the general pivot method for paraphrase extraction may introduce considerable noise into the extracted paraphrases, this paper proposes a syntactic knowledge-enhanced method that extracts higher-quality paraphrases in order to further improve the quality of statistical machine translation. First, syntactic knowledge is acquired and added to the paraphrase extraction algorithm as constraints to obtain...
In data classification mining, the decision tree method is a key algorithm. The ID3 (Iterative Dichotomiser 3) algorithm, presented by Quinlan, is a well-known decision tree algorithm, but it has some shortcomings, such as the highly complex computation of the information entropy expression, the multivalue bias problem in selecting an optimal attribute, large scale, etc. In order...
It is a simple task for humans to visually identify objects. However, computer-based image recognition remains challenging. In this paper we describe an approach for image recognition with specific focus on automated recognition of plants and flowers. The approach taken utilizes deep learning capabilities and unlike other approaches that focus on static images for feature classification, we utilize...
The paper examines the behavior of Decision Tree (DT) algorithms on a big database with many cases and many attributes: Forest Covertype (FC) from the UCI Knowledge Discovery in Databases Archive. The classification experiments considered 22 splitting criteria and two pruning methods, whose performance is presented in terms of classification error rate on test data, data...
Heart failure is one of the major causes of death in the world. The disease directly affects the heart's pumping function. Because of this disturbance, nutrients and oxygen are not well circulated and distributed. The New York Heart Association has classified the disease into four classes based on patient symptoms. In this paper, we use a data mining technique, more precisely a sequential...
Keyword extraction is an automated process that collects a set of terms giving an overview of a document. A term is characterized by how well, as a keyword, it identifies the core information of a particular document. When a huge number of documents must be analyzed to find the relevant information, keyword extraction is the key approach. It helps us understand the depth of a document even before we read...
The field of opinion mining is expanding rapidly with the widespread use of the internet for e-commerce and social interaction. One interesting use of opinion mining is in the online producer-consumer industry. The primary goal of the work presented in this paper is to perform a semi-automated sentiment classification on online product reviews for product evaluation using machine learning...
In this paper, we set out to determine whether a viewer's behavior changes before, during and after watching a TV program. Are there behaviors specific to each phase of viewing? We propose a flexible and nonintrusive method based on three categories of everyday connected objects (i.e. smartphone, smartwatch and remote control). Data were collected during participants'...
Recognising detailed facial or clothing attributes in images of people is a challenging task for computer vision, especially when the training data are both very large in scale and extremely imbalanced across attribute classes. To address this problem, we formulate a novel scheme for batch incremental hard sample mining of minority attribute classes from imbalanced large-scale training data...
The representation, classification, and clustering of text documents, together with information extraction, is currently a heavily researched area. Data mining and text mining pose specific problems in the Slovak language. This paper deals with methods for pre-processing medical data, namely Slovak health records written in natural language, and their subsequent analysis, especially...
This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play Hearthstone, which was held between March 23 and May 15, 2017 at the Knowledge Pit platform. We briefly describe the scope and background of this competition in the context of a more general project related to the development of an AI engine for video games, called Grail. We also discuss the outcomes of this challenge and...
This paper aims to construct a system that estimates food texture. The system consists of equipment for recording the load and sound signals while a probe stabs the food, and a neural network model that infers the degree of the food texture. The validity of the proposed system is discussed on the basis of an experiment.
Recently, multi-label classification has gained prime importance among classification problems. Applications of classification have increased so rapidly that efficient and accurate classifiers have become a vital requirement in the area of data mining. The multi-label classification problem is distinguished from single-label classification by its capability to handle...
Extreme Learning Machine (ELM) is a neural network architecture based on a Single Layer Feed-forward Neural Network (SLFN). For meaningful results, the structure of the ELM has to be optimized through the inclusion of regularization, and ℓ2-norm based regularization is the most commonly used. ℓ2-norm based regularization achieves better performance than the traditional ELM. The estimate of the regularization parameter...
Traditional domain adaptation methods attempt to learn a shared representation for distribution matching between the source domain and the target domain, without characterizing the individual information in either domain. Such a solution suffers from the mixing of individual information with the shared features, which considerably constrains domain adaptation performance. To relax this...