Abstract
In offline data-driven evolutionary optimization, no real fitness evaluations are allowed during the optimization, which makes it extremely challenging to build high-quality surrogates from a limited amount of data. This is especially true for large-scale optimization problems, where a large amount of data is typically needed to construct reliable surrogate models. To overcome this data deficiency, semi-supervised learning is introduced into the offline data-driven evolutionary optimization process, where tri-training, a co-training variant, is used to update the surrogate models. In the proposed algorithm, tri-training selects candidate solutions with high-confidence fitness predictions to enrich the training data for the surrogate models. Results on benchmark problems show that the proposed algorithm is competitive with three of the most recent offline data-driven optimization algorithms on problems with up to 500 decision variables.
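To make the tri-training idea concrete, the following is a minimal sketch of one pseudo-labeling round, not the algorithm proposed in the paper. Everything in it is an illustrative assumption: the k-nearest-neighbour surrogates (`knn_predict`), the bootstrap sampling of the offline data, the agreement threshold `tol` used as a stand-in for prediction confidence, and the sphere test function. For each of the three surrogates, an unlabeled candidate is added to its training data only when the other two surrogates agree on the predicted fitness.

```python
import random


def knn_predict(train, x, k=3):
    """Predict the fitness of x as the mean fitness of its k nearest training points."""
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def tri_training_round(train_sets, unlabeled, tol=0.05, k=3):
    """One round of tri-training for regression (hypothetical sketch).

    For surrogate i, a candidate is pseudo-labeled with the mean prediction of
    the other two surrogates, but only if those two predictions differ by at
    most tol. Agreement is used here as an assumed proxy for confidence.
    """
    new_sets = [list(s) for s in train_sets]  # copies; inputs stay unchanged
    for i in range(3):
        j, l = [m for m in range(3) if m != i]
        for x in unlabeled:
            pred_j = knn_predict(train_sets[j], x, k)
            pred_l = knn_predict(train_sets[l], x, k)
            if abs(pred_j - pred_l) <= tol:
                new_sets[i].append((x, (pred_j + pred_l) / 2))
    return new_sets


def sphere(x):
    """Assumed benchmark: sum of squares."""
    return sum(v * v for v in x)


rng = random.Random(0)
dim = 5

# Offline data: the only solutions with real fitness values.
offline = []
for _ in range(60):
    point = [rng.uniform(-1, 1) for _ in range(dim)]
    offline.append((point, sphere(point)))

# Three surrogates, each trained on a bootstrap sample of the offline data.
train_sets = [[rng.choice(offline) for _ in range(len(offline))] for _ in range(3)]

# Unlabeled candidate solutions, e.g. offspring produced by the optimizer.
unlabeled = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(40)]

updated = tri_training_round(train_sets, unlabeled)
for i, (before, after) in enumerate(zip(train_sets, updated)):
    print(f"surrogate {i}: {len(before)} -> {len(after)} training points")
```

In this sketch a tighter `tol` admits fewer pseudo-labeled points with more consistent labels, while a looser one enlarges the training data at the risk of propagating prediction errors.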