Abstract
Motivation: Mass spectrometry imaging data typically contain tens of thousands of pixels, along with m/z channels that may relate to biomolecules of interest. Such high-dimensional data cannot be visualized directly, and many multivariate analyses cannot be conducted without first reducing dimensionality. Dimensionality reduction algorithms are commonly used for data visualization, feature selection, and as part of data clustering workflows when examining large mass spectrometry imaging datasets. In this work, we seek to develop methods to determine the ability of dimensionality reduction algorithms to preserve local and global structure within reduced data.

Results: We have developed a novel evaluation method, Dimensional Cophenetic Integrity, which measures the structure and pattern preservation of dimensionality reduction algorithms based on the cophenetic distance of hierarchically clustered samples. We demonstrate that Dimensional Cophenetic Integrity results are indicative of expected tissue segmentation and image quality when compared to known synthetic data. Additionally, we find that optimum dimensionality reduction embeddings derive from hyperparameter selections far outside the typical range, and show that Dimensional Cophenetic Integrity can be used as an objective criterion for Bayesian optimization. We show that optimized dimensionality reduction better preserves cluster relationships than dimensionality reduction run with default algorithm parameters.
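
As a rough illustration of the idea of comparing cophenetic distances before and after dimensionality reduction, the sketch below hierarchically clusters the same samples in the original and in the reduced space, then correlates the two sets of cophenetic distances. The abstract does not give the exact definition of Dimensional Cophenetic Integrity, so this is only one plausible reading and not the authors' implementation. The function name `cophenetic_agreement`, the average-linkage choice, the use of Pearson correlation, and the synthetic three-cluster data are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.stats import pearsonr


def cophenetic_agreement(high_dim, low_dim, method="average"):
    """Hypothetical score: correlation between the cophenetic distances of
    the same samples clustered in the original and in the reduced space.
    Assumed for illustration; not the published DCI definition."""
    # Build one hierarchical clustering tree per representation.
    tree_high = linkage(high_dim, method=method)
    tree_low = linkage(low_dim, method=method)
    # With only the linkage matrix given, cophenet returns the condensed
    # vector of cophenetic distances, ordered by sample pair.
    coph_high = cophenet(tree_high)
    coph_low = cophenet(tree_low)
    r, _ = pearsonr(coph_high, coph_low)
    return float(r)


# Synthetic stand-in for pixel spectra: 3 clusters of 20 samples, 50 channels.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 50))
high_dim = np.vstack([c + rng.normal(size=(20, 50)) for c in centers])

# A simple linear reduction (PCA via SVD) to two dimensions.
centered = high_dim - high_dim.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pca_embedding = centered @ vt[:2].T

# A structure-free embedding for comparison: pure noise.
noise_embedding = rng.normal(size=(60, 2))

identity_score = cophenetic_agreement(high_dim, high_dim)
pca_score = cophenetic_agreement(high_dim, pca_embedding)
noise_score = cophenetic_agreement(high_dim, noise_embedding)
print(f"identity={identity_score:.3f} pca={pca_score:.3f} noise={noise_score:.3f}")
```

Under this reading, an embedding that keeps the cluster hierarchy intact scores close to 1, while an embedding unrelated to the original structure scores near 0. A scalar of this kind is also the sort of quantity that could be passed to a Bayesian optimizer as an objective over dimensionality reduction hyperparameters.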