Document Type : Original Paper
Department of Physics, FMIPA, Universitas Indonesia
MRCCC Semanggi Hospital, Jakarta, Indonesia
Introduction: Machine learning models have been widely used to predict dose distributions in radiotherapy planning, especially for advanced techniques such as Intensity Modulated Radiation Therapy (IMRT). One machine learning model suited to regression tasks is the random forest, which averages the outputs of all its estimators so that a small amount of biased data does not significantly affect the final result.
Method: Planning data in DICOM format (the original data format) were exported to CSV (Comma Separated Values). The data were then divided randomly into training and testing sets. A random-forest regressor was trained using 7-fold cross-validation and then evaluated on new data, i.e., data that the model had never seen before. For the target volume, the evaluated quantities were the dose parameters used to obtain the HI (Homogeneity Index); for the organs at risk (OARs), the mean and maximum doses were evaluated. Statistical tests were also carried out to assess whether the predicted and true values differed significantly.
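The workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic feature table, the 80/20 split, the number of trees and the random seeds are all assumptions made for the example, since the abstract does not specify them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic stand-in for the exported CSV table (hypothetical features and target).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(210, 4))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2 + rng.normal(0.0, 0.01, size=210)

# Random split into training data and held-out test data (split ratio is assumed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Number of trees is an assumption; the abstract does not report it.
model = RandomForestRegressor(n_estimators=100, random_state=0)

# 7-fold cross-validation on the training set, scored by negative MAE.
cv = KFold(n_splits=7, shuffle=True, random_state=0)
cv_scores = cross_val_score(
    model, X_train, y_train, cv=cv, scoring="neg_mean_absolute_error"
)
cv_mae = -cv_scores.mean()

# Refit on the full training set and evaluate on data the model has never seen.
model.fit(X_train, y_train)
test_mae = mean_absolute_error(y_test, model.predict(X_test))

print(f"cross-validated MAE: {cv_mae:.4f}")
print(f"held-out test MAE:   {test_mae:.4f}")
```

In practice the arrays X and y would be loaded from the exported CSV file rather than generated, with one target column per dose feature.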
Results: The random-forest model predicted the true values with the following mean absolute errors (MAE): for the PTV features, D2 (0.012), D50 (0.015) and D98 (0.018); for the OAR features (Dmean and Dmax), right lung (0.104 and 0.228), left lung (0.094 and 0.27), heart (0.088 and 0.267) and spinal cord (0.069 and 0.121); and for the body V95 (0.094). The statistical tests showed no significant difference between the predicted and true values.
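For illustration, the two quantities involved in this evaluation can be computed as in the short sketch below. The MAE is the standard mean of absolute differences. The HI formula shown, (D2 - D98) / D50, is the ICRU Report 83 definition and is an assumption here, since the abstract does not state which definition was used; all numeric values in the example are hypothetical.

```python
def mean_absolute_error(true_values, predicted_values):
    """Average absolute difference between paired true and predicted values."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    total = sum(abs(t - p) for t, p in zip(true_values, predicted_values))
    return total / len(true_values)


def homogeneity_index(d2, d50, d98):
    """HI per the ICRU 83 definition (assumed): (D2 - D98) / D50."""
    return (d2 - d98) / d50


# Hypothetical D50 values for three plans (true vs. predicted).
true_d50 = [1.0, 2.0, 3.0]
pred_d50 = [1.5, 2.0, 2.5]
mae_d50 = mean_absolute_error(true_d50, pred_d50)

# Hypothetical normalized PTV doses for a single plan.
hi_example = homogeneity_index(d2=1.05, d50=1.00, d98=0.95)

print(f"MAE: {mae_d50:.4f}")
print(f"HI:  {hi_example:.4f}")
```

A lower HI under this definition indicates a more homogeneous dose in the target, so small prediction errors in D2, D50 and D98 translate directly into a small error in the predicted HI.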
Conclusion: The random-forest regressor was able to predict the dose values, with the smallest errors obtained for the PTV features.