ABC-DynF

Adaptive Bayesian Classifier with Dynamic Features

Welcome to the ABC-DynF web site

ABC-DynF (Adaptive Bayesian Classifier with Dynamic Features) is an adaptive framework for supervised learning. It is based on the AdPreqFr4SL project, from which it inherits simple control strategies for cost-performance management and concept drift handling.

ABC-DynF is designed to allow the use of a dynamic feature space. In a text document stream, new attributes appear and others fall out of use over time, and the relative importance of each attribute evolves. Feature selection must therefore be dynamic as well: the set of attributes in use has to change along with the stream.

Taking this requirement into account, ABC-DynF incorporates a mechanism for tracking the importance of each attribute, based on the chi-squared weighting function. The list of the most relevant attributes is continuously updated for classification purposes. ABC-DynF uses Bayesian classifiers because they can work with arbitrary attribute sets.
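
As an illustration of the weighting step, the sketch below computes a chi-squared score for one attribute and one class from a 2x2 table of document counts. This page does not state the exact formula ABC-DynF uses, so the standard 2x2 form is assumed here, and the function name and argument names are hypothetical.

```python
def chi_squared(a: int, b: int, c: int, d: int) -> float:
    """Chi-squared statistic for a 2x2 attribute/class contingency table.

    a: documents in the class that contain the attribute
    b: documents outside the class that contain the attribute
    c: documents in the class that lack the attribute
    d: documents outside the class that lack the attribute
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A row or column of the table is empty: the attribute carries
        # no information about this class (assumed convention).
        return 0.0
    return n * (a * d - c * b) ** 2 / denom
```

A higher score means the attribute's presence is more strongly associated with the class. An attribute spread evenly across classes scores 0.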

ABC-DynF has been tested on the email foldering problem. The experimentation files can be downloaded from here.

Main features

Dynamic feature space support

  • The importance of each classification feature is monitored over time
  • A list of the top-k most important attributes is continuously updated
  • The base classifier uses only attributes from this list to classify new instances
  • The incremental nature of the base Bayesian algorithms allows new features and classes to be included in the model
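
The points above can be sketched as a small incremental classifier. This is a minimal illustration, not the ABC-DynF implementation: the class name `DynamicNaiveBayes`, the Bernoulli naive Bayes model, the Laplace smoothing, and the choice to rank each attribute by its best chi-squared score over all classes are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class DynamicNaiveBayes:
    """Hypothetical incremental naive Bayes over a growing vocabulary."""

    def __init__(self, k: int):
        self.k = k
        self.n_docs = 0
        self.class_docs = Counter()                  # documents seen per class
        self.term_docs = Counter()                   # documents containing each term
        self.term_class_docs = defaultdict(Counter)  # term -> class -> documents

    def update(self, terms, label):
        """Add one labelled document; unseen terms and classes are created on the fly."""
        self.n_docs += 1
        self.class_docs[label] += 1
        for t in set(terms):
            self.term_docs[t] += 1
            self.term_class_docs[t][label] += 1

    def _chi2(self, term, label):
        # 2x2 contingency counts for (term present?, document in class?)
        a = self.term_class_docs[term][label]
        b = self.term_docs[term] - a
        c = self.class_docs[label] - a
        d = self.n_docs - a - b - c
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        return 0.0 if denom == 0 else self.n_docs * (a * d - c * b) ** 2 / denom

    def top_k(self):
        """Rank terms by their best chi-squared score over all classes."""
        score = {t: max(self._chi2(t, c) for c in self.class_docs)
                 for t in self.term_docs}
        # Ties are broken alphabetically so the ranking is deterministic.
        return sorted(score, key=lambda t: (-score[t], t))[: self.k]

    def predict(self, terms):
        """Bernoulli naive Bayes restricted to the current top-k terms."""
        present = set(terms)
        selected = self.top_k()
        best, best_lp = None, -math.inf
        for c, n_c in self.class_docs.items():
            lp = math.log(n_c / self.n_docs)
            for t in selected:
                p = (self.term_class_docs[t][c] + 1) / (n_c + 2)  # Laplace smoothing
                lp += math.log(p if t in present else 1 - p)
            if lp > best_lp:
                best, best_lp = c, lp
        return best


model = DynamicNaiveBayes(k=2)
model.update(["meeting", "budget", "the"], "work")
model.update(["meeting", "report", "the"], "work")
model.update(["party", "beach", "the"], "personal")
model.update(["party", "dinner", "the"], "personal")
print(model.top_k())                     # attributes currently used for classification
print(model.predict(["party", "cake"]))  # "cake" is not in the top-k list, so it is ignored
```

Because the model stores only counts, a message with a previously unseen folder label or previously unseen words just adds new counter entries. Recomputing the ranking on every prediction keeps the sketch short; a real system would presumably cache or update it incrementally.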

Cost-performance management

  • Simple control strategies observe performance indicators to decide when to increase the value of k and when to start searching for new attribute dependences
  • This bias control selects the class model best suited to the current training data, avoiding both underfitting and overfitting
  • Because updating the structure is costly, parameters are adapted first to reduce the cost of updating
  • The structure is adapted only at sparse time points, when some data has accumulated and there is evidence that the current structure no longer guarantees the desired improvement in performance
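
One way to picture this policy is the controller sketched below. It is only an illustration under stated assumptions: this page does not say which performance indicators or thresholds ABC-DynF uses, so the batch error rate, the `min_batch` and `min_gain` parameters, and the action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptationController:
    """Hypothetical controller: cheap parameter updates first, structure updates rarely."""
    min_batch: int = 200    # assumed: examples to accumulate before touching the structure
    min_gain: float = 0.01  # assumed: error reduction expected from parameter updates alone
    buffered: int = 0
    last_error: Optional[float] = None

    def step(self, batch_size: int, batch_error: float) -> str:
        """Return the action to take after observing one batch's error rate."""
        self.buffered += batch_size
        gain = None if self.last_error is None else self.last_error - batch_error
        self.last_error = batch_error
        # Parameter updates alone no longer bring the expected improvement.
        stalled = gain is not None and gain < self.min_gain
        if stalled and self.buffered >= self.min_batch:
            self.buffered = 0
            return "adapt_structure"  # e.g. increase k or search for new dependences
        return "adapt_parameters"


controller = AdaptationController(min_batch=100, min_gain=0.01)
actions = [controller.step(50, err) for err in (0.30, 0.25, 0.25, 0.26)]
print(actions)
```

In this run the structure is revisited only on the third batch, once the error has stopped improving and 150 examples have accumulated. Every other batch triggers the cheaper parameter update.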

Contact

Please use the following email addresses:

jmcarmona

gladys