Abstract
Clustering, as an unsupervised technique, does not explain why certain objects are grouped together, which makes it difficult for decision-makers to interpret the characteristics that define each cluster. In this study, we propose a weight space clustering method based on Data Envelopment Analysis (DEA) that offers a semantic explanation for each cluster. The method uses Monte Carlo simulation to explore the feasible weight vectors that satisfy the DEA constraints, thereby characterising the weight space of each object. Pairwise similarities between objects are then computed to construct a similarity matrix, which serves as the foundation for clustering. To validate the method, we run a Monte Carlo simulation with a piecewise-linear production function. The average Rand index and F-score of the resulting clusterings both exceed 0.9, demonstrating the effectiveness of the proposed method. Furthermore, comparative experiments show that the internal validity of our clustering results surpasses that of existing DEA-based clustering methods, as well as conventional k-means and hierarchical clustering approaches. Finally, we apply the proposed method to industrial parks in Hunan Province, China, and identify three distinct clusters: the first is characterised by high water resource consumption, the second also emphasises water consumption but places greater importance on land output, and the third is primarily associated with energy consumption. These explainable clustering results provide valuable insights into enhancing production efficiency and optimising resource utilisation in the studied industrial parks.
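For readers unfamiliar with the external validity measures mentioned above, the following minimal Python sketch shows how a Rand index can be computed by comparing a known partition with a clustering result over all pairs of objects. The pair-counting F-score shown alongside it is one common definition and is an assumption here; the study may define its F-score differently. The function names and the toy label vectors are hypothetical and serve only as illustration.

```python
from itertools import combinations


def pair_counts(true_labels, pred_labels):
    """Classify every pair of objects by how the two partitions treat it."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have equal length")
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1  # together in both partitions
        elif same_pred:
            fp += 1  # together only in the clustering result
        elif same_true:
            fn += 1  # together only in the reference partition
        else:
            tn += 1  # separated in both partitions
    return tp, fp, fn, tn


def rand_index(true_labels, pred_labels):
    """Share of object pairs on which the two partitions agree."""
    tp, fp, fn, tn = pair_counts(true_labels, pred_labels)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def pairwise_f_score(true_labels, pred_labels):
    """Pair-counting F1 score (assumed definition, not taken from the study)."""
    tp, fp, fn, _ = pair_counts(true_labels, pred_labels)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy example: four objects, two reference groups, one object misassigned.
reference = [0, 0, 1, 1]
clustering = [0, 0, 0, 1]
ri = rand_index(reference, clustering)
f1 = pairwise_f_score(reference, clustering)
```

Both measures depend only on which objects share a cluster, so relabelling the clusters leaves them unchanged. Values close to 1 indicate that the clustering recovers the reference grouping almost exactly.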