Author: Teng, Xinzhi
Title: Improving radiomic model reliability and generalizability using perturbations in head and neck carcinoma
Advisors: Cai, Jing (HTI)
Degree: Ph.D.
Year: 2023
Subject: Diagnostic imaging -- Data processing
Medical radiology
Hong Kong Polytechnic University -- Dissertations
Department: Department of Health Technology and Informatics
Pages: 163 pages : color illustrations
Language: English
Abstract: Background: Radiomic models for clinical applications need to be reliable. However, model reliability is conventionally established in prospective settings, which requires proposing and specially designing a separate study. As prospective studies are rare, the reliability of most proposed models is unknown. Facilitating the assessment of radiomic model reliability during development would help to identify the most promising models for prospective studies.
Purpose: This thesis aims to propose a framework for building reliable radiomic models using a perturbation method. The work was divided into three studies: 1) develop a perturbation-based assessment method to quantitatively evaluate the reliability of radiomic models, 2) evaluate the perturbation-based method against the test-retest method for developing reliable radiomic models, and 3) evaluate radiomic model reliability and generalizability after removing low-reliability radiomic features.
Methods and Materials: Four publicly available head-and-neck carcinoma (HNC) datasets and one breast cancer dataset, totaling 1,641 patients, were retrospectively retrieved from The Cancer Imaging Archive (TCIA). Computed tomography (CT) images, their gross tumor volume (GTV) segmentations, and distant metastasis (DM) and local/regional recurrence (LR) outcomes after definitive treatment were collected from the HNC datasets. Multi-parametric diffusion-weighted images (DWI), test-retest DWI scans, and pathological complete response (pCR) outcomes were collected from the breast cancer dataset. To develop the reliability assessment method for radiomic models, one dataset with DM as the clinical outcome was used to build a survival model. Sixty perturbed datasets were simulated by randomly translating, rotating, and adding noise to the original images and by randomizing the GTV segmentations. Perturbed features were then extracted from the perturbed datasets. The radiomic survival model was developed for DM risk prediction, and its reliability was quantified with the intraclass correlation coefficient (ICC), which measured the consistency of model predictions across the perturbed features. In addition, a sensitivity analysis was performed to verify how output prediction reliability varies with input feature reliability. Then, a new radiomic model to predict pCR from DWI-derived apparent diffusion coefficient (ADC) maps was developed, and its reliability was quantified with the ICC of model predictions on perturbed image features and on test-retest image features, respectively. Following the establishment of the perturbation-based model reliability assessment (ICC), model reliability and generalizability after removing low-reliability features (ICC thresholds of 0, 0.75 and 0.95) were evaluated under repeated stratified cross-validation with the HNC datasets. Model reliability was evaluated with the perturbation-based ICC, and model generalizability was evaluated by the average train-test difference in area under the receiver operating characteristic curve (AUC) across cross-validation. The experiment was conducted on all four HNC datasets, two clinical outcomes and five classification algorithms.
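The ICC-based reliability check and feature screening described above can be sketched as follows. This is a minimal illustration in Python, not the thesis code: the abstract does not state which ICC form was used, so the one-way random-effects ICC(1,1) is assumed here, and the names `icc_oneway` and `reliable_features` are hypothetical. Each table has one row per patient and one column per perturbation.

```python
from statistics import mean


def icc_oneway(table):
    """One-way random-effects ICC(1,1) (an assumed ICC form).

    table: list of rows, one row per patient, one value per perturbation.
    """
    n = len(table)
    k = len(table[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in table):
        raise ValueError("need >= 2 patients and >= 2 equal-length repeats")
    row_means = [mean(row) for row in table]
    grand_mean = mean(row_means)
    # Between-patient and within-patient mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(table, row_means) for x in row
    ) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    if denom == 0:
        raise ValueError("ICC undefined: all values are identical")
    return (msb - msw) / denom


def reliable_features(feature_tables, threshold):
    """Keep feature names whose ICC across perturbations meets the threshold."""
    return [
        name
        for name, table in feature_tables.items()
        if icc_oneway(table) >= threshold
    ]


# Toy data (fabricated for illustration only, not from the thesis).
tables = {
    "stable": [[1, 1, 1], [2, 2, 2], [3, 3, 3]],  # unchanged by perturbation
    "noisy": [[1, 3], [3, 1], [2, 2]],            # dominated by perturbation
}
kept = reliable_features(tables, threshold=0.75)
```

The same `icc_oneway` function applies to model outputs: replacing feature values with predicted risks per patient and perturbation gives a model-level ICC in the spirit of the assessment described in the abstract.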
Results: In developing the model reliability assessment method, the ICC was used as a reliability index to quantify the consistency of model output on features extracted from the perturbed images and segmentations. For a six-feature radiomic model, the concordance indexes (C-indexes) of the survival model were 0.742 and 0.769 for the training and testing cohorts, respectively. For the perturbed training and testing datasets, the respective mean C-indexes were 0.686 and 0.678. This yielded ICC values of 0.565 (0.518–0.615) and 0.596 (0.527–0.670) for the perturbed training and testing datasets, respectively. When only highly reliable features were used for radiomic modeling, the model’s ICC increased to 0.782 (0.759–0.815) and 0.825 (0.782–0.867) and its C-index decreased to 0.712 and 0.642 for the training and testing data, respectively. This shows that the assessment method is sensitive to the reliability of the input features. In the comparison between the perturbation-based and test-retest methods, the perturbation method yielded a radiomic model with reliability (ICC: 0.90 vs. 0.91, P-value > 0.05) and classification performance (AUC: 0.76 vs. 0.77, P-value > 0.05) comparable to those of the test-retest method. In the evaluation of model reliability and generalizability after removing low-reliability features, the average model ICC improved significantly from 0.65 to 0.78 (ICC threshold 0 vs. 0.75, P-value < 0.01) and 0.91 (ICC threshold 0 vs. 0.95, P-value < 0.01) as the reliability threshold increased. Additionally, model generalizability increased substantially: the mean train-test AUC difference was reduced from 0.21 to 0.18 (P-value < 0.01) and 0.12 (P-value < 0.01), while the testing AUCs were maintained at the same level (P-value > 0.05).
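The generalizability measure reported above, the mean train-test AUC difference across cross-validation folds, can be illustrated with a short Python sketch. The function names `auc` and `mean_auc_gap` and the fold layout are assumptions made for illustration; the AUC here is computed by pairwise comparison of positive and negative cases, which may differ from the implementation used in the thesis.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


def mean_auc_gap(folds):
    """Average (train AUC - test AUC) over cross-validation folds.

    folds: list of (train_labels, train_scores, test_labels, test_scores).
    A smaller gap suggests less overfitting, i.e. better generalizability.
    """
    gaps = [
        auc(train_y, train_s) - auc(test_y, test_s)
        for train_y, train_s, test_y, test_s in folds
    ]
    return sum(gaps) / len(gaps)


# Toy fold (fabricated for illustration only, not from the thesis).
folds = [
    ([0, 1], [0.2, 0.9], [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
]
gap = mean_auc_gap(folds)
```

In this toy fold the training AUC is 1.0 and the testing AUC is 0.75, so the gap is 0.25. Comparing such gaps before and after feature screening mirrors the comparison reported in the results.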
Conclusions: We proposed a perturbation-based framework to evaluate radiomic model reliability and to develop more reliable and generalizable radiomic models. The perturbation-based method is a practical alternative to test-retest scans for assessing radiomic model reliability. Our results also suggest that pre-screening low-reliability radiomic features before modeling is a necessary step to improve final model reliability and generalizability to unseen datasets.
Rights: All rights reserved
Access: open access

Files in This Item:
File | Description | Size | Format
6993.pdf | For All Users | 3.89 MB | Adobe PDF


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12547