Abstract
Variable selection has gained significant attention as a means of improving spectroscopic calibration performance. However, existing methods have two main limitations. First, the selection results are sensitive to the choice of training samples, which suggests that the selected variables may not be truly relevant. Second, the number of selected variables is often still too large, and modelling with too many predictors can lead to over-fitting. To address these challenges, we propose and implement a novel multiple feature-spaces ensemble (MFE) strategy based on the least absolute shrinkage and selection operator (LASSO).
The MFE strategy combines the advantages of LASSO regression and an ensemble strategy, enabling more robust identification of key variables. We evaluated the approach through extensive experiments on publicly available datasets. The results show both more consistent variable selection and improved prediction performance compared with benchmark methods.
The MFE strategy provides a comprehensive framework for variable importance analysis, leading to robust and consistent variable selection. This improved consistency in turn makes spectroscopic calibration more robust and accurate.
• A multiple feature-spaces ensemble (MFE) method for consistent variable selection.
• The importance of each variable is determined in both the sample and variable spaces.
• LASSO provides reliable regression coefficients for variable importance analysis.
• The MFE method provides consistency-enhanced variable selection.
• The proposed method improves prediction performance over benchmark methods.
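
The abstract does not spell out the MFE algorithm, so the following is only an illustrative sketch of the general idea of combining LASSO with an ensemble over sample and variable spaces, not the authors' implementation. It assumes that each ensemble member is a LASSO model fitted on a random subset of samples and a random subset of variables, and that variable importance is the fraction of members in which a variable receives a nonzero coefficient. The function names, the coordinate-descent solver, and the parameters `alpha`, `n_models`, `sample_frac`, and `var_frac` are all hypothetical choices made for this example.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, alpha, n_iter=100):
    """Plain coordinate-descent LASSO (illustrative solver, no intercept).

    Minimises (1 / (2n)) * ||y - X b||^2 + alpha * ||b||_1.
    X is a list of rows; columns and y are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j excluded).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, alpha) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def selection_frequency(X, y, alpha=1.0, n_models=40,
                        sample_frac=0.8, var_frac=0.5, seed=0):
    """Assumed ensemble scheme: resample rows (sample space) and columns
    (variable space), fit LASSO on each sub-block, and return for every
    variable the fraction of fits in which it was drawn and kept."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    kept = [0] * p
    drawn = [0] * p
    for _ in range(n_models):
        rows = rng.sample(range(n), max(2, int(sample_frac * n)))
        cols = rng.sample(range(p), max(1, int(var_frac * p)))
        # Centre the sub-block so that no intercept is needed.
        y_mean = sum(y[i] for i in rows) / len(rows)
        y_c = [y[i] - y_mean for i in rows]
        means = [sum(X[i][j] for i in rows) / len(rows) for j in cols]
        X_c = [[X[i][j] - m for j, m in zip(cols, means)] for i in rows]
        beta = lasso_cd(X_c, y_c, alpha)
        for j, b in zip(cols, beta):
            drawn[j] += 1
            if b != 0.0:
                kept[j] += 1
    return [k / d if d else 0.0 for k, d in zip(kept, drawn)]
```

Calling `selection_frequency(X, y)` on a calibration matrix `X` (rows are spectra, columns are wavelengths) and reference values `y` returns one score per variable in the range 0 to 1. Under the assumptions above, variables whose scores stay high across resampled sub-blocks are the ones least sensitive to the choice of training samples.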