Author: Mak, Po-kong
Title: Application of artificial intelligence in diagnosis coding : automatic selection of principal diagnosis
Degree: M.Sc.
Year: 1997
Subject: Diagnosis -- Data processing
Artificial intelligence
Hong Kong Polytechnic University -- Dissertations
Department: Multi-disciplinary Studies
Pages: viii, 103 leaves : ill. ; 30 cm
Language: English
Abstract: Diagnosis coding has long been a burden in medical and health information processing. Transforming narrative diagnoses into numeric codes is a labour-intensive step, and it is unavoidable because, with current technology, computerised data processing and statistical analysis can be performed only on categorised codes, not on narrative data. A.I. researchers have therefore attempted to automate this tedious process. Because the process involves understanding narrative data, it is a complicated and challenging task for A.I. practitioners. Currently, the Center for Intelligent Information Retrieval at the University of Massachusetts is actively involved in a project on the automated assignment of ICD-9 codes to discharge summaries. The latest developments and findings of their work are reviewed and summarised in this report. Selecting the principal diagnosis from a group of coded diagnoses in a discharge summary is a major task in diagnosis coding. It requires a professional understanding of the meaning of, and the relationships among, the diagnoses stated. Although no successful automatic coding system has yet been rolled out, some computer-assisted systems have been built to facilitate coding by providing indexed lists of codes for selection. However, selecting the principal diagnosis from a list of diagnoses in the computer record still depends on the clinician's or coder's input. An automatic principal diagnosis selection system is therefore proposed as an add-on module that determines the most attributable diagnosis from the coded information already in the patient record. It can be plugged in to existing computer-assisted coding systems or, in the future, to automatic coding systems to handle principal diagnosis selection. The proposed system adopts the case-based reasoning approach in artificial intelligence.
The design draws on the idea of the Memory Organised Packet (MOP) [Schank, 1977]. One year of coded records for patients discharged from acute hospitals between 01.09.95 and 31.08.96, stored in the Medical Record Abstraction System (MRAS) of the Hospital Authority, was used to form the case base of expert knowledge on principal diagnosis selection, since the principal diagnoses of these cases had already been assigned by professional coding staff. In the case base, the relative frequency with which a code A appeared as a secondary diagnosis code or a procedure code for a particular principal diagnosis B was treated as the relative supporting strength (a form of relevance feedback) of code A for code B as the principal diagnosis. These frequencies were then summarised into MOPs: for each code, the supporting secondary diagnoses and procedures, with their supporting strengths, that favour it as the principal diagnosis over the other candidates in a coded record. The MOPs were represented as lists. Noise in a MOP arising from rare cases was eliminated by determining the convex point of the ordered cumulative frequency curve of the list. The code at this point was treated as the threshold code, and supporting codes beyond this threshold were deleted from the list. The transformation of individual cases into the knowledge representation (the supporting lists) was handled by SAS, a statistical analysis system that can efficiently analyse large volumes of data. The inference engine and the user interface were developed in Visual Basic, an object-based programming language, to facilitate future development and to allow modification, if necessary, when the module is plugged in to other coding systems. Coded discharge records from 01.09.96 to 28.02.97 in MRAS were used as test data for the system.
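The supporting lists, the noise threshold and the selection step described above can be sketched as follows. This is only an illustration in Python, not the thesis's SAS and Visual Basic implementation. Several details are assumptions because the abstract does not specify them: supporting strength is taken as a code's share of all supporting-code occurrences for a principal diagnosis, the "convex point" is taken as the point where the normalised cumulative curve lies furthest above the diagonal, and a candidate's score is the sum of the strengths of the other codes in the record. The function names and the code labels (FRACTURE, HYPERTENSION, DIABETES, PROC_FIXATION) are hypothetical stand-ins for real ICD-9 codes.

```python
from collections import Counter, defaultdict


def build_mops(cases):
    """Build one supporting list (MOP) per principal diagnosis.

    cases: iterable of (principal_code, other_codes), where other_codes holds
    the secondary diagnosis and procedure codes of one coded record.
    Returns {principal: [(supporting_code, strength), ...]}, strongest first.
    """
    counts = defaultdict(Counter)
    for principal, others in cases:
        counts[principal].update(set(others))  # count each code once per record
    mops = {}
    for principal, counter in counts.items():
        total = sum(counter.values())
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        # Assumption: strength = share of all supporting-code occurrences.
        mops[principal] = [(code, n / total) for code, n in ranked]
    return mops


def prune_at_knee(supporting):
    """Drop the low-frequency tail of a supporting list (strongest first).

    Assumption: the "convex point" is the position where the normalised
    cumulative curve lies furthest above the diagonal from (0, 0) to (1, 1).
    """
    if not supporting:
        return []
    total = sum(strength for _, strength in supporting)
    k = len(supporting)
    cumulative, best_gap, knee = 0.0, float("-inf"), 0
    for i, (_, strength) in enumerate(supporting):
        cumulative += strength
        gap = cumulative / total - (i + 1) / k
        if gap > best_gap:
            best_gap, knee = gap, i
    return supporting[: knee + 1]  # keep codes up to and including the threshold code


def select_principal(mops, diagnoses, procedures=()):
    """Return the candidate diagnosis whose MOP is best supported by the other codes."""
    def score(candidate):
        strengths = dict(mops.get(candidate, []))
        others = [c for c in list(diagnoses) + list(procedures) if c != candidate]
        return sum(strengths.get(code, 0.0) for code in others)
    return max(diagnoses, key=score)


# Tiny hypothetical case base: (principal diagnosis, other codes in the record).
cases = [
    ("FRACTURE", ["HYPERTENSION", "PROC_FIXATION"]),
    ("FRACTURE", ["PROC_FIXATION"]),
    ("FRACTURE", ["DIABETES", "PROC_FIXATION"]),
    ("HYPERTENSION", ["DIABETES"]),
    ("HYPERTENSION", ["DIABETES", "FRACTURE"]),
]
mops = {code: prune_at_knee(lst) for code, lst in build_mops(cases).items()}
print(select_principal(mops, ["HYPERTENSION", "FRACTURE"], ["PROC_FIXATION"]))  # FRACTURE
```

In this toy case base the fixation procedure accompanies every fracture case, so it dominates the fracture list and survives pruning, which mirrors the abstract's observation that orthopaedic procedures tend to be specific to the principal diagnosis.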
Cases were grouped into medicine (MED), orthopaedics (ORT), paediatrics (PAE) and surgery (SUR) according to the specialty to which they were first transferred. The accuracy of the system for all four groups decreased as the number of co-existing diagnosis codes increased. Among the four groups, the system was most accurate for ORT cases (87% for two, 69% for three and 53% for four co-existing diagnosis codes) and least accurate for PAE cases (58%, 48% and 36% respectively). The high accuracy for orthopaedic cases may be because most of them were quite specific, especially regarding the location of trauma; in addition, they usually involved procedures that are closely tied to the principal diagnosis. Although the accuracy for MED and PAE cases was not high, the MOPs for such cases provided a good basis for further study of co-morbid conditions and complications.
Rights: All rights reserved
Access: restricted access

Files in This Item:
File: b14036423.pdf
Description: For All Users (off-campus access for PolyU Staff & Students only)
Size: 3.31 MB
Format: Adobe PDF
