Detecting impending failure of hard disks is an important prediction task which might help computer systems to prevent loss of data and performance degradation. Currently most hard drive vendors support self-monitoring, analysis and reporting technology (SMART), which is often considered unreliable for such tasks. The problem of finding alternatives to SMART for predicting disk failure is an...
Graph propositionalization methods transform structured and relational data into a fixed-length feature vector format that can be used by standard machine learning methods. However, the choice of propositionalization method may have a significant impact on the performance of the resulting classifier. Six different propositionalization methods are evaluated when used in conjunction with random forests...
If a system can represent knowledge symbolically, and ground those symbols in an environment, then it has access to a vast range of data from that environment. The system described in this paper acts in a simple virtual world. It is implemented solely in fatiguing Leaky Integrate and Fire neurons; views the environment; processes natural language commands; plans; and acts. Visual representations are...
The following topics are dealt with: machine learning; image processing; multimedia; feature extraction and selection; neural networks; evolutionary algorithms and genetic programming; statistical learning; supervised learning; functional clustering; text mining; support vector machines; reinforcement learning; cancer and radiation therapy; system security; bioinformatics and computational biology;...
We present a novel method for fusing the decisions of multiple classification algorithms which use different features, classification methods, and data sources. The proposed method, called context dependent fusion of multiple algorithms (CDF-MA), is motivated by the fact that the relative performance of different algorithms can vary significantly as the characteristics of the input data vary. The training...
We investigate parameter-based and distribution-based approaches to regularizing the generative, similarity-based classifier called local similarity discriminant analysis classifier (local SDA). We argue that regularizing distributions rather than parameters can both increase the model flexibility and decrease estimation variance while retaining the conceptual underpinnings of the local SDA classifier...
In this paper, we propose a new general low-level feature representation for audio signals. Our approach, called Dominant Audio Descriptor, is inspired by the MPEG-7 Dominant Color Descriptor. It is based on clustering time-local features and identifying dominant components. The features used to illustrate this approach are the well-known Mel Frequency Cepstral Coefficients. The performance of the...
Many real-world problems which can be assigned to the machine learning domain are inverse problems. The available data is often noisy and may contain outliers, which requires the application of global optimization. Evolutionary Algorithms (EAs) are one class of possible global optimization methods for solving such problems. Within population-based EAs, Differential Evolution (DE) is a widely used...
In this paper, we study the learning impact of data sampling followed by attribute selection on the classification models built with binary class imbalanced data within the scenario of software quality engineering. We use a wrapper-based attribute ranking technique to select a subset of attributes, and the random undersampling technique (RUS) on the majority class to alleviate the negative effects...
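As a rough illustration of the random undersampling (RUS) step mentioned above, the following minimal sketch drops randomly chosen majority-class examples from a two-class dataset. The function name `random_undersample`, the fixed seed, and the choice to balance the classes to an equal count are assumptions made for this example; the abstract does not state the target class ratio or any implementation details.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class rows until both classes have equal counts.

    Hypothetical helper: the 1:1 target ratio is an assumption for this sketch.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    # most_common orders classes from largest to smallest count.
    (majority, _), (minority, n_minor) = counts.most_common(2)
    major_idx = [i for i, label in enumerate(y) if label == majority]
    # Keep a random subset of the majority class, as large as the minority class.
    keep_major = set(rng.sample(major_idx, n_minor))
    kept = [i for i, label in enumerate(y) if label == minority or i in keep_major]
    return [X[i] for i in kept], [y[i] for i in kept]


if __name__ == "__main__":
    X = list(range(10))
    y = [0] * 8 + [1] * 2
    X_bal, y_bal = random_undersample(X, y)
    print(Counter(y_bal))
```

All minority-class examples are retained; only the majority class is thinned, which is the defining property of RUS.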
We propose a new algorithm for sequence segmentation based on recent advances in semi-parametric sequence clustering. This approach implies the use of model-based distance measures between sequences, as well as a variant of spectral clustering specially tailored for segmentation. The method is highly flexible since it allows for the use of any probabilistic generative model for the individual segments...
In this paper, we present new probabilistic models for identifying bird species from audio recordings. We introduce the independent syllable model and consider two ways of aggregating frame level features within a syllable. We characterize each syllable as a probability distribution of its frame level features. The independent frame independent syllable (IFIS) model allows us to distinguish syllables...
Reinforcement learning suffers from scalability problems due to the state space explosion and the temporal credit assignment problem. Knowledge-based approaches have received significant attention in the area. Reward shaping is a particular approach to incorporate domain knowledge into reinforcement learning. Theoretical and empirical analysis of this paper reveals important properties of this principle,...
Microarray datasets comprise a large number of gene expression values and a relatively small number of samples. Feature selection algorithms are very useful in these situations in order to find a compact subset of informative features. We propose a redundancy control method for algorithms in the recently proposed SPEC family of spectral-based feature selection algorithms. This method is applied to...
In this work we apply a new technique called conformal prediction to the Functional Clustering of Gene Expression Profiles in Human Cancers Challenge. The method not only allows us to make predictions but also to include measures of accuracy and reliability of the prediction. These measures are provably valid under the i.i.d. assumption. Using this approach it becomes possible to control the number of...
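To make the idea of a reliability measure concrete, here is a minimal sketch of the standard conformal p-value computed from nonconformity scores, together with the resulting prediction set at a chosen significance level. The function names, the example scores, and the use of a held-out calibration set are assumptions for illustration; the abstract does not say which nonconformity measure or conformal variant the authors used.

```python
def conformal_p_value(calibration_scores, test_score):
    """Fraction of scores (calibration plus the test one) at least as nonconforming."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, candidate_scores, significance=0.1):
    """Keep every candidate label whose p-value exceeds the significance level.

    candidate_scores maps each hypothetical label to its nonconformity score.
    """
    return {
        label
        for label, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > significance
    }


if __name__ == "__main__":
    calibration = [0.1, 0.2, 0.3, 0.4]  # made-up nonconformity scores
    print(conformal_p_value(calibration, 0.35))
    print(prediction_set(calibration, {"a": 0.05, "b": 0.5}, significance=0.25))
```

A lower significance level yields larger prediction sets, which is the trade-off between confidence and informativeness that such measures expose.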
Accurate classification of caller interactions within Interactive Voice Response systems would help corporations determine caller behavior within these telephony applications. This paper details the development of such a classification system for a pay beneficiary application. Fuzzy Inference Systems, a Multi-Layer Perceptron, a Support Vector Machine and an ensemble of classifiers were developed. Accuracy,...