Abstract
Classification is one of the most studied tasks in data mining and machine
learning, and many works in the literature address classification problems
in multiple fields of knowledge such as medicine,
biology, security, and remote sensing. Since there is no single classifier that
achieves the best results for all kinds of applications, a good alternative is
to adopt classifier fusion strategies. A key point in the success of classifier
fusion approaches is the combination of diversity and accuracy among
classifiers belonging to an ensemble. With a large number of classification
models available in the literature, one challenge is choosing the most
suitable classifiers to compose the final classification system, which
creates the need for classifier selection strategies. We address this point by
proposing a framework for classifier selection and fusion based on a four-step
protocol called CIF-E (Classifiers, Initialization, Fitness function, and
Evolutionary algorithm). We implement and evaluate 24 varied ensemble
approaches following the proposed CIF-E protocol and identify the
most accurate among them. We also perform a comparative analysis between
the best approaches and many other baselines from the literature. The
experiments show that the proposed evolutionary approach based on the Univariate
Marginal Distribution Algorithm (UMDA) can outperform state-of-the-art
approaches from the literature on many well-known UCI datasets.
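
To make the idea of UMDA-driven classifier selection concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes each candidate ensemble is encoded as a binary mask over a pool of classifiers, that initialization uses a uniform probability of 0.5 per classifier, and that fitness is the majority-vote accuracy on a validation set. The function names, the population sizes, the probability clamping, and the synthetic predictions are all hypothetical choices made for illustration.

```python
import random


def majority_vote_accuracy(mask, predictions, labels):
    """Accuracy of a majority vote over the classifiers selected by `mask`."""
    selected = [preds for bit, preds in zip(mask, predictions) if bit]
    if not selected:
        return 0.0  # an empty ensemble cannot classify anything
    correct = 0
    for i, label in enumerate(labels):
        votes = [preds[i] for preds in selected]
        # Ties are broken toward the smallest class label (arbitrary choice).
        winner = max(sorted(set(votes)), key=votes.count)
        correct += winner == label
    return correct / len(labels)


def umda_select(predictions, labels, pop_size=30, n_keep=15,
                generations=20, seed=0):
    """Search for a classifier subset with a basic UMDA loop (assumed setup)."""
    rng = random.Random(seed)
    n = len(predictions)
    probs = [0.5] * n  # initialization: each classifier equally likely
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        # Sample a population of masks from the independent marginals.
        population = [[int(rng.random() < p) for p in probs]
                      for _ in range(pop_size)]
        scored = sorted(
            ((majority_vote_accuracy(m, predictions, labels), m)
             for m in population),
            key=lambda item: item[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0]
        # Re-estimate each marginal from the fittest individuals,
        # clamped so that no bit becomes permanently fixed.
        elite = [m for _, m in scored[:n_keep]]
        probs = [min(0.95, max(0.05, sum(m[j] for m in elite) / n_keep))
                 for j in range(n)]
    return best_mask, best_fit


def make_toy_predictions(n_samples=200,
                         accuracies=(0.9, 0.85, 0.8, 0.55, 0.5, 0.5),
                         seed=1):
    """Synthetic binary predictions; stands in for real validation outputs."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    predictions = [[y if rng.random() < acc else 1 - y for y in labels]
                   for acc in accuracies]
    return predictions, labels


if __name__ == "__main__":
    preds, y = make_toy_predictions()
    mask, fitness = umda_select(preds, y)
    print("selected classifiers:", mask, "validation accuracy:", fitness)
```

In this sketch the four CIF-E steps map onto the pool of predictions (Classifiers), the uniform starting probabilities (Initialization), `majority_vote_accuracy` (Fitness function), and the sampling and re-estimation loop in `umda_select` (Evolutionary algorithm). Other fitness functions or fusion rules could be substituted without changing the loop.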