Abstract
Searching for the optimum in a mixed continuous-combinatorial space is challenging because it requires operators that can handle both the continuous and the combinatorial parts of the search domain. Instance reduction (IR), an important pre-processing technique in data science, is often performed in two separate stages: instance selection (IS) first, followed by instance generation (IG). This paper investigates a fast optimisation approach for IR that addresses both stages at once. The approach, named Accelerated Pattern Search with Variable Solution Size (APS-VSS), is characterised by a variable solution size, an accelerated objective function computation, and a single-point memetic structure designed for IG.
APS-VSS is composed of a global search crossover operator and three local searches (LS). The global operator prevents premature convergence to local optima, whilst the three LS algorithms optimise the reduced set (RS). Furthermore, by using the k-nearest neighbours algorithm as the base classifier, APS-VSS exploits the search logic of the LS to accelerate objective function computation by orders of magnitude. The experiments show that APS-VSS outperforms existing algorithms that use the single-point structure, and that it is statistically competitive with state-of-the-art IR techniques in terms of accuracy and reduction rate, while significantly reducing runtime.
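To give an intuition for why a local search can make a nearest-neighbour objective cheap to evaluate, the following is a minimal sketch and not the APS-VSS implementation. It assumes a 1-nearest-neighbour classifier, squared Euclidean distance, and a pattern-search style move that perturbs a single prototype of the reduced set; under those assumptions only one column of a cached distance matrix has to be refreshed. The function names `full_accuracy` and `incremental_accuracy`, the cached matrix `D`, and the synthetic data are all hypothetical and introduced for illustration only.

```python
import numpy as np

def full_accuracy(X, y, P, labels):
    """1-NN training accuracy of reduced set P, computed from scratch.
    Cost: O(n * m * d) for n instances, m prototypes, d features."""
    D = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)  # (n, m) distances
    acc = np.mean(labels[D.argmin(axis=1)] == y)
    return acc, D

def incremental_accuracy(X, y, P, labels, D, j):
    """Same objective after only prototype j has moved.
    Only column j of the cached distance matrix D is recomputed,
    costing O(n * d) for distances plus O(n * m) for the argmin."""
    D = D.copy()
    D[:, j] = ((X - P[j]) ** 2).sum(axis=1)
    acc = np.mean(labels[D.argmin(axis=1)] == y)
    return acc, D

# Synthetic data (illustrative only): 200 instances, 4 features, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)

# Reduced set: here simply the first 10 training instances and their labels.
P = X[:10].copy()
labels = y[:10].copy()

acc0, D = full_accuracy(X, y, P, labels)

# A pattern-search style trial move: perturb one coordinate of one prototype.
P_trial = P.copy()
P_trial[3, 0] += 0.1

acc_fast, D_fast = incremental_accuracy(X, y, P_trial, labels, D, 3)
acc_slow, D_slow = full_accuracy(X, y, P_trial, labels)
```

In this sketch the incremental evaluation returns the same accuracy as the full recomputation while touching a single column of the distance matrix. The saving grows with the size of the reduced set, since the distance computation no longer scales with the number of prototypes.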