Non-stationary data are data whose distribution changes over time. Most successful machine learning algorithms assume stationary data, i.e., data with a fixed (although usually unknown) distribution. Non-stationary classification problems require the induced classifiers to be flexible enough to learn, or adapt to, changes in the data distribution over time; this is a hard task, since such changes are usually not known in advance. Although several proposals in the literature deal with non-stationary data, none of them deals with missing attribute values, a common problem in real applications. This paper proposes an ensemble of classifiers for non-stationary environments that (1) uses a new graph structure for representing data, the Complete P-partite Attribute-based Decision Graph (CPp-AbDG); (2) handles data described by heterogeneous attributes (numeric and categorical); and (3) handles missing attribute values. Experiments in non-stationary environments provide evidence of the strength of the CPp-AbDG representation as well as the potential of the proposed ensemble approach.
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10, within the strategic scientific research and experimental development program SYNAT (“Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”).