In this paper, we document the face detection competition that we organized in conjunction with the ISDA 2010 conference. The objective was to compare the performance of different face detection engines on new, unpublished datasets. We believe researchers can benefit from this competition by identifying strong and weak areas in their algorithms relative to others. We have also identified, based on the...
This paper reports the investigations and experimental procedures conducted to design an automatic sleep classification tool based only on features extracted with wavelets from EEG, EMG and EOG (electroencephalogram, electromyogram and electrooculogram) signals, without any visual aid or context-based evaluation. Real data collected from infants were processed and classified by several traditional and bio-inspired...
There are numerous problems of increasing significance in which a pattern can be associated with several classes simultaneously. Problems of this kind, usually called multi-label problems, should be tackled with specific techniques in order to generate models more accurate than those obtained with classical classification algorithms. This work presents the adaptation of the J48 algorithm to multi-label...
The selection of a particular neural network model belonging to the Pareto front is a problem that exists in all multi-objective algorithms. This paper proposes a novel solution to this problem based on a linear combination of the outputs of the two extremes in the Pareto front, which form an ensemble. The decision support TOPSIS method is used to determine which linear combination creates the best...
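The linear combination described in this abstract can be illustrated with a minimal sketch. It assumes each of the two extreme models outputs a vector of class scores; the function names and the weight `alpha` are hypothetical, and the TOPSIS step that the paper uses to choose the combination is not reproduced here.

```python
def combine_outputs(p_a, p_b, alpha):
    """Blend the class-score vectors of two models.

    p_a, p_b: scores from the two Pareto-front extremes (assumed inputs).
    alpha: hypothetical weight in [0, 1] given to the first model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(p_a) != len(p_b):
        raise ValueError("score vectors must have the same length")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_a, p_b)]


def predict(p_a, p_b, alpha):
    """Return the index of the class with the highest blended score."""
    combined = combine_outputs(p_a, p_b, alpha)
    return max(range(len(combined)), key=combined.__getitem__)
```

With `alpha = 1.0` the ensemble reduces to the first extreme model and with `alpha = 0.0` to the second; intermediate values trade one objective against the other.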
This paper presents a method of applying text mining techniques and data mining tools to pharmaceutical spam detection in Twitter data. A simple method based on a manually selected list of 65 discriminating pharmaceutical words is used to label spam training tweets. Preliminary experimental results show that the J48 decision tree classifier performs better than the Naïve Bayesian algorithm.
This article presents a clustering-based approach to fuzzy system identification. To construct an effective initial fuzzy model, it presents a modular method for identifying fuzzy systems based on a hybrid clustering-based technique. Moreover, determining the proper number of clusters and their appropriate locations is one of the primary considerations in constructing...
Within the Financial Watch project, users contacted in the banking and investment domains requested several targeted events. Financial news items are classified with respect to the list of desired events. In this paper, a conceptual approach for indexing short English news in the financial domain is presented. Using an original supervised learning approach, a categorization method is proposed...
Image thresholding is a very important phase in the image analysis process. However, different images have different characteristics, which makes thresholding with a single algorithm a very challenging task: any given thresholding method may perform well for some images but will not be suitable for all images. In this paper, intelligent thresholding by training...
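Keyword-based labeling of this kind might look like the following sketch. The abstract does not list the 65 words or state the exact matching rule, so the small word set and the "any keyword present" rule below are illustrative assumptions only.

```python
import re

# Hypothetical stand-in for the paper's 65-word list, which the abstract does not give.
PHARMA_WORDS = {"pharmacy", "pills", "prescription", "tablets"}


def label_tweet(text, keywords=PHARMA_WORDS):
    """Label a tweet 'spam' if it contains any keyword as a whole word.

    The matching rule is an assumption made for illustration.
    """
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    return "spam" if tokens & keywords else "ham"
```

Tweets labeled this way could then serve as training data for a classifier such as a decision tree or a naive Bayes model.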
This paper studies the suitability of Extreme Learning Machines (ELM) for resolving bioinformatic and biomedical classification problems. In order to test their overall performance, an experimental study is presented based on five gene microarray datasets found in bioinformatic and biomedical domains. The Fast Correlation-Based Filter (FCBF) was applied in order to identify salient expression genes...
This paper aims to assess the effectiveness of three different clustering algorithms used to detect breast cancer recurrent events. The performance of a classical k-means algorithm is compared with a much more sophisticated Self-Organizing Map (SOM-Kohonen network) and a cluster network, closely related to both k-means and SOM. The three clustering algorithms have been applied to a concrete breast...
The classification of imbalanced data is a well-studied topic in data mining. However, there is still a lack of understanding of the factors that make the problem difficult. In this work, we study the two main reasons that make the classification of imbalanced datasets complex: overlapping and data fracture. We present a Genetic Programming-based feature extraction method driven by Rough Set Theory...
In spite of the huge amount of image data on the web, mining image data from the web receives less attention than mining text data, since handling the semantics of images is much more difficult. This paper introduces a new system to mine visual knowledge on the web that aims to build a Domain Oriented Image Directory by using the Earth Mover's Distance and Color signatures. Instead of using a flat...
Classification in imbalanced domains has become one of the most relevant problems in Machine Learning at present. This problem has risen in significance due to its presence in many real applications; it occurs when the distribution of the examples available for the learning process differs greatly between the classes (often for binary-class datasets). Usually, the...
Gender recognition has been a hot research topic in recent years. Human-machine interfaces or video surveillance can be greatly improved if human gender can be recognized automatically. In this study, an embedded hidden Markov model is used for gender recognition. Video recorded from different viewing angles is used to sample properties of each gender. Ten consecutive gait frames are segmented...
This paper discusses the application of two unsupervised methods to classifying soil types. Soils that are suitable for agricultural activities can be classified into four classes: hill soil, organic soil, alteration soil and alluvium soil. In addition, no specific support system is able to classify the type of soil and retrieve the information for location and suitable plants for local...
Real-life datasets often suffer from class imbalance, which hampers the supervised learning process. In such datasets, examples of the positive (minority) class are significantly fewer than those of the negative (majority) class. Constructing high-quality classifiers for such imbalanced training datasets is one of the major challenges in machine learning, since...
It is well known that the problems arising from the high dimensionality of data must be considered in the pattern recognition field. Face recognition databases are usually high-dimensional, especially when limited training samples are available for each subject. Traditional dimensionality reduction techniques are unable to solve this problem smoothly, which makes the feature extraction task much more difficult...
Stress and its related comorbid diseases are responsible for a large proportion of disability worldwide. In particular, chronic stress is the main cause of the dramatic increase in premature mortality in Western countries. However, advanced simulation and sensing technologies, such as virtual reality and mobile biosensors, offer interesting opportunities for innovative personal health-care...
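The degree of imbalance described here is commonly summarized as the ratio of majority to minority examples. The sketch below computes that ratio; the function name is hypothetical and the measure is a standard one, not something taken from the paper itself.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (largest class size) / (smallest class size) for a label list."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    return max(counts.values()) / min(counts.values())
```

For example, a dataset with 90 negative and 10 positive examples has a ratio of 9.0, while a perfectly balanced dataset has a ratio of 1.0.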
This paper deals with the problem of designing efficient classifiers for a special case of incremental concept drift. We focus on classification based on a multiple classifier system. For the problem under consideration, we propose four simple methods of combining classifiers and evaluate them via computer experiments.
Control of an interior permanent magnet synchronous motor (IPMSM) is difficult because of its nonlinearity and parameter uncertainty. In this paper, a fuzzy c-regression models clustering algorithm based on the T-S fuzzy model is used to model the IPMSM as a series of linear models weighted by their memberships. A Lagrangian of the constrained function is built to calculate the clustering centers, where training output data are considered...