| Author: | Hou, Wenjun |
| Title: | Extracting and incorporating clinical information for radiology report generation |
| Advisors: | Li, Wenjie Maggie (COMP) |
| Degree: | Ph.D. |
| Year: | 2025 |
| Department: | Department of Computing |
| Pages: | xviii, 180 pages : color illustrations |
| Language: | English |
| Abstract: | Automated interpretation of medical images is essential in modern healthcare, particularly given the ever-growing volume of medical imaging data. Among various imaging types, the chest X-ray (CXR) is one of the most widely used modalities, and a key application of its interpretation is radiology report generation (RRG), which aims to produce free-text descriptions of the relevant findings in CXR images. These findings may include anatomical structures, pathological conditions, or other significant observations. However, analyzing CXR images requires highly specialized domain knowledge to understand and interpret both the visual content and the clinical context of a medical case. Writing radiology reports is time-consuming, demanding considerable effort even from experienced radiologists. Consequently, automating RRG has garnered significant interest from the research community for its potential to alleviate radiologists' workload and expedite the diagnostic process. Existing RRG approaches typically take a CXR as input and employ an auto-regressive decoding strategy to generate reports sequentially from left to right. However, these methods often exhibit limited clinical accuracy, as they fail to adequately exploit and incorporate relevant clinical information, such as observations, disease progression, or associated attributes. Properly extracting and integrating these diverse information sources is essential to enhancing the quality and utility of automated radiology reports. In this thesis, we aim to extract and incorporate clinical information for radiology report generation, effectively utilizing different sources of information to improve the accuracy of generated reports. 
In particular, we identify three main research problems: (1) How can we improve the disease/observation accuracy of reports generated from CXR images, especially given that (large) language models can already produce highly readable and coherent clinical text? (2) How can we properly model the attributes of diseases/observations that reflect both spatial characteristics and temporal progression, given sequential CXRs? (3) How can we regulate a radiology report generation model to produce reports that are consistent at the attribute level when semantically equivalent radiological studies are provided as input? Based on the categories of work carried out, this thesis is structured into three parts. The first part (Works 1 and 2) focuses on improving observation accuracy (problem 1). Observations represent high-level clinical information in CXRs, and improving their accuracy requires effective visual understanding and domain knowledge. To achieve this, we first construct an observation-specific graph from radiology reports, comprising three levels of nodes: observations, n-grams, and tokens. We then propose an observation-guided approach, ORGAN, which first extracts observations from CXRs and then selects relevant information from the graph to enhance report generation. Building upon this, we further enhance clinical accuracy by leveraging large language models (LLMs), given their strong capabilities across various domains. However, LLMs still exhibit knowledge gaps when analyzing CXR studies, particularly complex cases. To address this, we introduce Radar, a method that first assesses and refines the knowledge already acquired by LLMs based on extracted observations, and then injects supplementary knowledge to complement the learned information. Extensive experiments demonstrate that our proposed methods significantly improve observation-level accuracy in radiology report generation. 
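The graph-guided step described above can be illustrated with a minimal sketch. Here the three-level observation graph is modeled as a plain adjacency dictionary and `retrieve_graph_context` is an illustrative name, not ORGAN's actual interface; the sketch only shows the idea of gathering graph-linked n-grams and tokens for predicted observations as extra decoder context:

```python
def retrieve_graph_context(observations, graph):
    """Given observations predicted from a CXR, collect the n-grams and
    tokens linked to them in a three-level observation graph. The graph
    is assumed to be {observation: {"ngrams": [...], "tokens": [...]}};
    this structure is illustrative, not the thesis's actual one."""
    context = []
    for obs in observations:
        node = graph.get(obs, {})          # unknown observations contribute nothing
        context.extend(node.get("ngrams", []))
        context.extend(node.get("tokens", []))
    # deduplicate while preserving first-seen order
    seen, ordered = set(), []
    for item in context:
        if item not in seen:
            seen.add(item)
            ordered.append(item)
    return ordered
```

In a full system, the returned phrases would be fed to the report decoder alongside the visual features so that generation is conditioned on observation-relevant wording.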
The second part of our work (Work 3) addresses problem 2, which involves both incorporating prior study information and effectively integrating relevant attributes to generate spatiotemporally precise reports. To achieve this, we categorize attributes from sequential radiology reports into two types: spatial and temporal. Since these attributes are closely linked to observations and disease progression, we construct a progression graph and propose a framework called RECAP. RECAP leverages prior CXR studies as additional input and reasons over the progression graph to accurately select relevant attributes, thereby enhancing radiology report generation. Extensive experiments demonstrate that our framework outperforms existing baselines in attribute modeling, highlighting its effectiveness in improving radiology report generation. In the third part (Work 4), we address problem 3 by introducing two metrics to quantify inter-report consistency and developing a lesion-aware mixup strategy for consistent radiology report generation. Building on extracted observation- and progression-aware attributes, we propose a framework called ICON, which models such consistency using regional information from CXRs. Given an X-ray, our approach extracts lesions and retrieves similar cases for mixup. The model is then trained to align the shared representations of mixed lesions with the relevant attributes, enabling ICON to effectively enhance inter-report consistency. Extensive experiments validate the effectiveness of our framework, demonstrating its ability to improve consistency in radiology report generation. In summary, this thesis presents a comprehensive study of radiology report generation, advancing factual accuracy through the integration of clinical information. Our findings demonstrate the effectiveness of the proposed approaches, highlighting their significant potential to enhance medical image interpretation and support real-world diagnostic workflows. |
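The mixup step underlying the lesion-aware strategy can be sketched with standard mixup interpolation. This is a minimal illustration only: lesion crops are assumed to be flat lists of pixel intensities, `lesion_mixup` is an illustrative name rather than ICON's actual API, and the lesion extraction and similar-case retrieval described above are assumed to happen upstream:

```python
import random

def lesion_mixup(lesion_a, lesion_b, alpha=0.4):
    """Blend two same-sized lesion crops with a Beta(alpha, alpha)-sampled
    coefficient, following standard mixup interpolation."""
    assert len(lesion_a) == len(lesion_b), "crops must be the same size"
    lam = random.betavariate(alpha, alpha)   # mixing coefficient in (0, 1)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(lesion_a, lesion_b)]
    return mixed, lam
```

In training, the model would then be encouraged to align the representation of the mixed lesion with the attributes shared by the two source cases, which is what drives the inter-report consistency described above.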
| Rights: | All rights reserved |
| Access: | open access |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/14193