Abstract
The DAIL (Dietetic Assessment and Intervention in Lung Cancer) study investigated the need for dietetic input in patients with Non-Small Cell Lung Cancer (NSCLC). The need to see a dietitian was determined using the PG-SGA (Patient-Generated Subjective Global Assessment) as the gold standard. This abstract reports a sub-study that asked whether machine learning could predict the need to see a dietitian from alternative data points collected during the study, using the PG-SGA as the reference.
Methods
Ninety-six patients with stage 3b or 4 lung cancer were recruited between April 2017 and June 2019. Of these, 20 had incomplete data, leaving 76 patients: 56 from Royal Surrey County Hospital (RSH) and 20 from Frimley Park Hospital (FPH). The PG-SGA was completed in all cases. It was compared with the other data collected in the study, which included the G8 frailty assessment, the EORTC QLQ-C30 and QLQ-LC13 quality of life assessments, hand grip strength, psoas muscle surface area, spirometry, routine blood tests, Body Mass Index (BMI) and weight change, giving 137 data points per patient. Univariate analysis was used to find the strongest single correlates of “need to see a dietitian” (NTSD) and “critical need to see a dietitian” (CNTSD). Measures with an absolute Spearman correlation coefficient above 0.4 were selected to train a Support Vector Machine (SVM) to predict NTSD and CNTSD (SVM1), and the misclassification error was calculated.
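The univariate selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names (`average_ranks`, `spearman`, `select_features`), the column names and the toy values are all hypothetical, and ties are handled with average ranks, which the abstract does not specify. The SVM training that follows the selection is not shown.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treated here as no association (assumption)
    return cov / (var_x * var_y) ** 0.5


def select_features(columns, label, threshold=0.4):
    """Keep the measures whose absolute rho with the label exceeds the threshold."""
    selected = {}
    for name, values in columns.items():
        rho = spearman(values, label)
        if abs(rho) > threshold:
            selected[name] = rho
    return selected


# Hypothetical toy data for six patients; these are not study values.
ntsd = [0, 0, 0, 1, 1, 1]
columns = {
    "weight_change": [1, 2, 3, 4, 5, 6],  # rises with the label
    "grip_strength": [6, 5, 4, 3, 2, 1],  # falls with the label
    "unrelated": [3, 1, 2, 2, 1, 3],      # no monotonic relationship
}
selected = select_features(columns, ntsd)
print(sorted(selected))
```

Selecting on the absolute value keeps both positive and negative correlates, which matches the "+/-0.4" criterion in the text.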
Results
Of the 137 measures, 18 had an absolute Spearman correlation coefficient above 0.4 for NTSD and 13 for CNTSD. SVMs trained on these measures produced misclassification errors of 3% and 7% respectively. When the SVM was trained on the RSH data and tested on the FPH data, the results were weaker, with errors of 20% or more. This is likely due to the small size of the FPH data set, which included only 20 patients.
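To make the effect of the small FPH test set concrete, the sketch below computes a misclassification error on 20 hypothetical predictions. The labels and predictions are invented for illustration and the function name `misclassification_error` is an assumption. With 20 test patients, each misclassified patient changes the error by 5 percentage points, so an error of 20% corresponds to 4 patients.

```python
def misclassification_error(y_true, y_pred):
    """Fraction of cases where the predicted label differs from the reference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


n_test = 20        # size of the FPH test set reported in the abstract
step = 1 / n_test  # change in error caused by one misclassified patient

# Hypothetical reference labels and predictions: 4 of the 20 disagree.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 8 + [1] * 2
error = misclassification_error(y_true, y_pred)
print(f"error = {error:.0%}, one patient = {step:.0%}")
```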
Conclusion
This work suggests that machine learning can be used to predict the need to see a dietitian in patients with lung cancer. The results are promising, with low misclassification rates, and the approach could potentially automate screening for the need to see a dietitian. However, the results on the FPH data using a model trained on the RSH data suggest that more work is needed before the model can be transferred between datasets from different hospitals.