Author: | Bai, Jiayuan |
Title: | Deep learning for cataract surgery phase recognition |
Degree: | M.Sc. |
Year: | 2023 |
Department: | Department of Electrical and Electronic Engineering |
Pages: | 58 pages : color illustrations |
Language: | English |
Abstract: | With the rapid growth of modern medical technology, computer-assisted surgery has entered a new era and is evolving toward intelligent systems. Surgical phase recognition is an essential topic in computer-assisted surgery because it helps to speed the development of surgical intelligence. Surgical phase recognition algorithms can give clinicians accurate reports of surgical progress during an operation and allow medical interns to assess their skills from recorded surgical videos. This research presents a strategy for efficient and precise surgical phase recognition; its main contribution is the use of a factorized encoder video vision transformer model[17] to recognize the surgical phase. In this method, a spatial transformer encoder captures the spatial information of each frame of the surgical video, and the spatial features of the frames in a sequence are fed into a temporal transformer encoder; once the temporal information within the sequence has been extracted, the phase of the sequence is predicted. Experiments show that combining spatial and temporal information is beneficial: on our dataset, the model outperforms both the vision transformer model and ResNet models of different depths. Because the surgical phase dataset used in this study is imbalanced in the number of samples per phase, we used focal loss to address the imbalance. The alpha value in focal loss is set so that the loss function focuses on phases with few samples. This approach did not improve overall accuracy but achieved better results on the smaller categories. We also tried various data processing methods and found through comparative experiments that, in the surgical phase recognition task, modifications to the orientation and colour of the image may harm model training. |
Rights: | All rights reserved |
Access: | restricted access |
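Note on the focal loss mentioned in the abstract: the following is a minimal sketch of the standard alpha-weighted focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), for a single sample. It is not taken from the thesis. The function names, the gamma default, and the inverse-frequency rule for choosing alpha are assumptions made for illustration only; the record does not state how the author set these values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, alpha, gamma=2.0):
    """Alpha-weighted focal loss for one sample.

    logits: raw scores, one per phase
    target: index of the true phase
    alpha:  per-phase weights (larger values emphasise that phase)
    gamma:  focusing parameter; gamma = 0 with alpha = 1 gives cross-entropy
    """
    p_t = softmax(logits)[target]
    return -alpha[target] * (1.0 - p_t) ** gamma * math.log(p_t)


def inverse_frequency_alpha(counts):
    """Hypothetical helper: weight each phase by inverse sample count,
    scaled so that the rarest phase gets weight 1.0 (an assumption,
    not the thesis's stated scheme)."""
    inverse = [1.0 / c for c in counts]
    largest = max(inverse)
    return [w / largest for w in inverse]


if __name__ == "__main__":
    # Made-up sample counts for three phases; the last phase is rare.
    alpha = inverse_frequency_alpha([100, 50, 10])
    logits = [2.0, 0.5, -1.0]
    for phase in range(3):
        print(phase, focal_loss(logits, phase, alpha))
```

With weights of this kind, a misclassified sample from a rare phase contributes more to the loss than an equally misclassified sample from a common phase, which is the behaviour the abstract describes for the alpha value.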
Files in This Item:
File | Description | Size | Format
---|---|---|---
8270.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 1.22 MB | Adobe PDF
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/13867