From the Pathology and Laboratory Medicine Service,
Veterans Affairs Maryland Health Care System, Baltimore, Maryland [1];
Department of Pathology, University of Maryland Medical System,
Baltimore, Maryland [2]; and
Department of Pathology, The Johns Hopkins Medical Institutions,
Baltimore, Maryland [3].
Send comments and correspondence to:
George.Moore4@va.gov
Moore GW, Berman JJ.
Performance analysis of manual and automated
systematized nomenclature of medicine (SNOMED) coding.
Am J Clin Pathol. 1994 Mar;101(3):253-256.
PMID: 8135178.
Full Text of Article:
http://www.netautopsy.org/autocode.htm
DISCLAIMER. United States Government Work, uncopyrighted,
public-domain, DRAFT COPY ONLY. This document does not necessarily
represent the views or policies of any United States Government agency.
This document is provided "as is", without warranty of any kind, express
or implied, including but not limited to the warranties of merchantability,
fitness for a particular purpose and non-infringement. In no event shall the
authors be liable for any claim, damages or other liability, whether in an
action of contract, tort or otherwise, arising from, out of, or in connection
with the document or the use or other dealings made with the document.
ABSTRACT.
Many pathology departments rely on the accuracy of computer-generated
diagnostic coding for surgical specimens. At present, there are no published
guidelines for assuring the quality of coding devices. To assess the
performance of SNOMED coding software, manual coding was compared with
automated coding in 9,353 consecutive surgical pathology reports at the
Baltimore VA Medical Center. Manual SNOMED coding produced 13,454 diagnostic
entries comprising 519 distinct diagnostic entities; 209 were unique
diagnoses (assigned to only one of the 9,353 reports). Automated coding
obtained 23,744 diagnostic entries comprising 498 distinct diagnostic
entities, of which 129 were unique diagnoses. There were only 44 instances
(0.5%) where automated coding missed key diagnoses on surgical case reports.
In summary, automated coding compared favorably with manual coding.
To achieve the maximum performance from software coding applications,
departments should monitor the output from automatic coders. Modifications
in reporting style, code dictionaries, and coding algorithms can lead
to improved coding performance.
key words: SNOMED, MUMPS, quality assurance,
translation, software, pathology, code
INTRODUCTION.
Coding pathology reports has become an important activity for laboratories
of anatomic pathology. Once regarded solely as a means for research-oriented
pathologists to recover interesting cases, diagnostic coding has become
a means of linking pathology services with other hospital services rendered
on a patient and billed to third parties. Inaccurate diagnostic coding may
cause a report to be uncountable, irretrievable, or unreimbursable.
Coded reports permit pathologists to complete diagnosis-specific quality
assurance activities, and compile statistical data on the types of specimens
received in the department. In the future, coded databases, stripped
of patient identifiers and collected from many contributing health care
services, may assist epidemiologists in tracking the spread of diseases,
identifying areas of special risk, and providing reliable quantitative
information for developing national health care policies.
The difficulties encountered in coding have received scant attention.[1,2]
The College of American Pathologists, copyright-holder for SNOMED
(Systematized Nomenclature of Medicine), does not address the problems
of who should code, how much time is needed to code, how often coding errors
may occur, and how to cope with coding errors.[3,4] To our knowledge,
there are only a few reports in the literature that address the problem
of manual coding inaccuracies. Hall and Lemoine,[5] in one of the few such
studies, found errors in more than 10% of cases. They divided manual coding
errors into five types:
(1) Factually correct but unhelpful codes
(e.g., coding all benign lesions as `negative for tumor');
(2) Inconsistent codes (coding `dysplasia'
on Monday and `atypia' on Tuesday);
(3) Idiosyncratic codes (using a mnemonic for a lesion,
often inscrutable to other people);
(4) Entry errors (e.g., entering `lipoma'
when one intends to enter `lymphoma');
(5) Incomplete coding due to impatience or laziness.
Who or what should code? Certainly, coding by a clerk
saves the pathologist's time, but does it accomplish the job adequately?
Some hospitals employ
professional coders trained to list diagnoses in a manner that supports
linkage to reimbursable diagnosis related groups (DRGs). Professional coders
may generate revenue for the hospital, but they command a high salary, and
they may not code lesions in a manner that allows the pathologist to retrieve
specimens of academic or clinical interest. In England, the Korner committee
recommended that the National Health Service's reliance on coding by lay
personnel should be abandoned, and that physicians do their own coding.[6,7]
No one is more familiar with a report than the pathologist who signs it.
The question remains, can pathologists be expected to thoroughly code all
their reports on a daily basis?
Considering the problems with human coding, the incentives for accurate
automated diagnostic coding are obvious, and a variety of software systems
that perform automated coding (`autocoders') are commercially available.
In science, business, and many areas of medicine, the public has come to
accept computer-generated results as reliable, often more reliable than
results generated by humans. In the field of medicine, small errors
in the way that computers handle data can result in catastrophe.[8]
This is particularly true in areas that depend heavily on contextual
interpretation of language, such as diagnostic coding.
Many pathology departments do not wish to become entangled in the problem
of validating the software they purchase. When hundreds of thousands
of dollars are spent on a laboratory information system, the departments
expect product validation to be completed by the vendor and approved
by the Food and Drug Administration (FDA), the government agency responsible
for medical devices. Unfortunately, the FDA, under the Safe Medical Devices
Act of 1990, has shifted much of its oversight activities from the software
vendor to the software buyer (i.e., from premarket approval to postmarket
surveillance).[9] Health care facilities using software devices are required
to report product defects to the FDA, which can rapidly suspend approval
of devices that went to market with minimal agency oversight.[10]
Is it realistic to expect commercial vendors to perform any quality assurance
on their automated coders, other than to assure that the autocoders yield
coded diagnoses without causing system crashes, and that the diagnoses
should be retrievable by code number or by diagnostic terms that match code
numbers? The software vendor cannot really test whether the autocoder
is operating accurately at any given institution, because reports
at that institution may be written in an idiosyncratic manner that makes
reliable coding impossible. As an example, some pathologists may wish
to abbreviate diagnoses in their report. The autocoder would not necessarily
provide a code for CLL, TCC, BCC, etc., unless it has a dictionary of the
abbreviations commonly used in that department. Since abbreviations are not
included in the SNOMED dictionary, automatic coders would perform poorly
in departments that use abbreviated diagnoses. To correct that problem,
the abbreviation would have to be added to the electronic dictionary
that links diagnostic terms with SNOMED codes. Similar problems might arise
in departments where the reports are not scrutinized carefully for spelling
errors, or that use grammatically challenging sentence structures.
Consider the problems faced by a computer program in coding the following
sentence: `Neither metastatic squamous cell carcinoma nor primary
infiltrative processes can be ruled out, as well as the seborrheic keratosis,
which is present.'
It is in the interest of every department that uses an autocoder to evaluate
performance on their own reports, and to devise a program to enhance
performance by expanding the diagnostic dictionary, or by changing the
standard word, phrase, or sentence format (syntax) of their reports.
In addition, departments should have a way of determining whether the changes
they make actually improve the autocoder's performance. In the present
study, we compare the results of automated coding with the results
of coding performed by anatomic pathologists at the Baltimore VA
Medical Center. Based on these results, we recommend guidelines
for writing reports and enhancing the content of the code dictionaries
to improve performance of automatic coding software.
MATERIAL AND METHODS.
Materials. All surgical pathology reports accessioned consecutively
between October 1, 1989, and June 30, 1992, at the Baltimore VA
Medical Center were examined.
Manual Coding. Manual coding was performed by three board-certified
anatomic pathologists at the Baltimore VA Medical Center.
These pathologists were acquainted with the SNOMED system,
including the categorization of code information into the six fields
of topography, morphology, etiology, function, procedure, and disease.
A seventh field, `occupation', is not included in the VA SNOMED package.
Manual coding was performed with the assistance of an on-line dictionary
of SNOMED codes licensed to the Department of Veterans Affairs,
and included in the standard VA anatomic pathology information system
package, version 4.1. On a daily basis, during the computer session
in which the pathologist electronically signs, or `releases' reports
for general hospital access, the pathologist enters terms into the various
SNOMED fields. Although all six SNOMED fields are accessible
to the pathologist, only topography and morphology fields are default
selected by the computer system, and the pathologist must request special
access to the fields for etiology, function, procedure, and disease,
through a cumbersome user interface. Nearly all cases signed out
in our department have only topography and morphology codes.
When the pathologist enters a term at the prompt, the computer selects
a match and displays the match term and its corresponding SNOMED code.
The pathologist is given an opportunity to delete the SNOMED code,
if desired.
Hardware. The computer used for the present study was
an IBM PC/AT-compatible computer
(COMTEX, 80386 microprocessor, 25 MHz, 330 MB Priam hard disk),
programmed with American National Standard MUMPS (MGlobal, Inc.,
Houston, TX), and the public-domain File Manager (FileMan) database
management system of the United States Department of Veterans Affairs,[11]
used routinely in 169 VA medical centers.
Input Data. Reports were obtained as a raw global ASCII file
downloaded from the mainframe computer at the Baltimore VA Medical Center,
and containing the complete text of all consecutive surgical pathology
reports obtained between October 1, 1989, and June 30, 1992. The entire
contents of each report, including patient demographics, date and time
of accessioning and signout, specimen source, gross description,
final microscopic diagnosis, pathologist's identification,
and manually-entered SNOMED codes, were passed into the ASCII file,
a total of 21,168,261 bytes. The full text of the `specimen source'
and `final microscopic diagnosis' for each case served as source text
for the SNOMED autocoder. All numbers and punctuation marks were removed
from the source-text-stream, as well as all letter-strings shorter than
3 letters, except for: `no', `os' (=`bone' or `left eye'), `od'
(=`right eye'), `eg' (=`esophago-gastric'), and `ge' (=`gastro-esophageal').
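The text-cleaning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the MUMPS code used in the study. The function name `clean_source_text` is hypothetical. Lowercasing and replacing removed characters with spaces (rather than deleting them outright) are assumptions.

```python
import re

# Short letter-strings that the text says are kept despite the 3-letter minimum.
KEPT_SHORT_WORDS = {"no", "os", "od", "eg", "ge"}

def clean_source_text(text):
    """Return the lowercase words that survive the filtering rules."""
    # Assumption: numbers and punctuation are replaced by spaces, so that
    # only runs of letters remain as separate words.
    letters_only = re.sub(r"[^A-Za-z]+", " ", text)
    words = letters_only.lower().split()
    # Drop letter-strings shorter than 3 letters unless they are exceptions.
    return [w for w in words if len(w) >= 3 or w in KEPT_SHORT_WORDS]

print(clean_source_text("Skin, L. arm, 2 cm: no evidence of malignancy."))
# ['skin', 'arm', 'no', 'evidence', 'malignancy']
```

Note that `of` is dropped by the length rule while `no` survives, which preserves negation for later matching.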
Software. Automated coding of free-text diagnoses
into SNOMED codes was performed on TRANSOFT, a table-driven
public-domain computer translation shell, written in MUMPS or HyperPAD.[12,13]
The MUMPS-version of TRANSOFT,
used in the present investigation, employs the file structure of FileMan.
The key elements of TRANSOFT, including algorithms, parsing rules,
general applications, and specific application as an automatic SNOMED coder,
have been discussed elsewhere.[12,13]
Topography and morphology codes (SNOMED dictionary) were downloaded
from the VA-licensed subset of SNOMED into an external file serving
as TRANSOFT's dictionary. For each SNOMED-code in the VA subset,
there is a main term and any number of synonyms. For example, the topography
code `TX1000' has `CEREBROSPINAL FLUID' as its main term and `SPINAL FLUID',
`CSF', and `FLUID, SPINAL' as synonyms. Two sentence-parsing models
were used: simple coding and enhanced coding.
Simple Coding Model. In the simple coding model, a single word
in the source-text-stream finds a SNOMED-match if the word is present
among the words of the main term or synonyms for a particular SNOMED-code
in the VA-subset. In case of multiple matches, a single match
is selected arbitrarily. For example, `cerebrospinal' in the
source-text-stream has exactly one match,
namely `TX1000 CEREBROSPINAL FLUID'. Two consecutive words
in the source-text-stream find a SNOMED-match if both words are present
among the words of the main term or synonyms for a particular SNOMED-code.
A two-word match always supersedes a one-word match. Three-word,
four-word,... matches are attempted, with a longer word-match always
superseding a shorter word-match. Thus in the simple coding model,
the consecutive words `cerebrospinal fluid' in the source text stream
would obtain a unique match to the topography code, TX1000.
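The simple coding model can be illustrated with the following Python sketch. It is not TRANSOFT itself. The names `DICTIONARY`, `words_by_code`, and `simple_code` are hypothetical, the dictionary holds only two entries quoted in this article, and the left-to-right scan that consumes matched words is an assumption about how overlapping matches are resolved.

```python
import re

# Tiny illustrative dictionary: code -> main term followed by synonyms.
# Both entries are quoted in the article; the real VA subset is far larger.
DICTIONARY = {
    "TX1000": ["CEREBROSPINAL FLUID", "SPINAL FLUID", "CSF", "FLUID, SPINAL"],
    "M14070": ["WOUND, BIOPSY"],
}

def words_by_code(dictionary):
    """Map each code to the set of words in its main term and synonyms."""
    return {
        code: {w for term in terms for w in re.findall(r"[a-z]+", term.lower())}
        for code, terms in dictionary.items()
    }

def simple_code(source_words, dictionary):
    """Match runs of consecutive source words, longest run first."""
    vocab = words_by_code(dictionary)
    matches, i = [], 0
    while i < len(source_words):
        best = None
        # Try the longest run first so a longer match supersedes a shorter one.
        for n in range(len(source_words) - i, 0, -1):
            run = source_words[i:i + n]
            hits = sorted(c for c, ws in vocab.items() if all(w in ws for w in run))
            if hits:
                best = (hits[0], run)  # ties broken arbitrarily (lowest code here)
                break
        if best:
            matches.append(best)
            i += len(best[1])  # assumption: matched words are consumed
        else:
            i += 1
    return matches

print(simple_code(["cerebrospinal", "fluid", "biopsy"], DICTIONARY))
# [('TX1000', ['cerebrospinal', 'fluid']), ('M14070', ['biopsy'])]
```

The second match shows how a lone source word such as `biopsy` can be assigned to an unrelated code merely because the word occurs in one of that code's terms.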
Enhanced Coding Model. The disadvantage of the
simple coding model is its inability to capture the local language
usage for a particular group of pathologists. For example,
the topography code for `PERITONEAL FLUID'
has only `ASCITIC FLUID' and `FLUID, ASCITIC' as synonyms in the VA-licensed
subset of SNOMED, whereas pathologists in our department are as likely
to use the term `ASCITES FLUID' in our free text. There is no occurrence
of the word `ASCITES' in the VA-subset of SNOMED, so that such a case
would fail to be coded by the simple coding model. In the enhanced coding
model, we obtained a list of all the diagnostic terms used in our department
over the 33-month period of study. This is accomplished by creating a list
of all one-word, two-word, three-word,... terms bounded on either side
by punctuation marks, numerals, or `barrier words' (i.e., prepositions,
conjunctions, articles, etc.).[14] These diagnostic terms are then pointed
to one or more appropriate SNOMED codes. For example, the diagnostic term,
`basal cell carcinoma' points both to M80903 (=`BASAL CELL CARCINOMA')
and to T01000 (=`SKIN'). As will be shown below (RESULTS and DISCUSSION),
a more sophisticated parsing model than this phrase-match model does not
appear to be warranted.
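A minimal Python sketch of the phrase-match idea follows. The barrier-word list is a short, hypothetical sample, and the function names `split_into_terms` and `enhanced_code` are illustrative. The sketch also assumes that only maximal runs of words between boundaries are looked up. The two phrase entries use codes quoted in this article.

```python
import re

# Hypothetical, abbreviated barrier-word list (prepositions, conjunctions, articles).
BARRIER_WORDS = {"a", "an", "the", "of", "in", "on", "with", "and", "or", "nor", "from"}

# Phrase -> SNOMED codes. A department would maintain a much larger table.
PHRASE_TO_CODES = {
    "basal cell carcinoma": ["M80903", "T01000"],
    "skin": ["T01000"],
}

def split_into_terms(text):
    """Split free text into terms bounded by punctuation, numerals, or barrier words."""
    terms = []
    # Punctuation marks and numerals act as hard boundaries between chunks.
    for chunk in re.split(r"[^A-Za-z\s]+", text.lower()):
        current = []
        for word in chunk.split():
            if word in BARRIER_WORDS:
                if current:
                    terms.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            terms.append(" ".join(current))
    return terms

def enhanced_code(text, phrase_to_codes):
    """Return the codes pointed to by each recognized term, without repeats."""
    codes = []
    for term in split_into_terms(text):
        for code in phrase_to_codes.get(term, []):
            if code not in codes:
                codes.append(code)
    return codes

print(split_into_terms("Skin of nose, biopsy: basal cell carcinoma."))
# ['skin', 'nose', 'biopsy', 'basal cell carcinoma']
print(enhanced_code("Skin of nose, biopsy: basal cell carcinoma.", PHRASE_TO_CODES))
# ['T01000', 'M80903']
```

Terms that find no entry (here `nose` and `biopsy`) are simply left uncoded, rather than being forced onto a partially matching SNOMED term.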
False-negative and False-positive rates.
A `false-negative case'
is one to which a correct code for a major diagnosis has not been assigned.
A `false-positive case' is one to which an incorrect code
for a major diagnosis has been assigned. The `false-negative rate'
is the proportion of false-negative cases among all cases.
The `false-positive rate' is the proportion of false-positive cases among
all cases. In principle, false-negative and false-positive rates
may be obtained both for manual coding as well as for the various methods
of autocoding. Unfortunately, obtaining these rates requires that each case
be examined by a human coding expert, and the correct codes determined
for that case. From this set of `true positive' codes, a computer program
can determine whether a particular case has been correctly assigned
by manual or various automated methods. Most pathology laboratories
cannot devote the human resources necessary to determine the exact set
of true-positive codes for their caseloads.
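Given a set of expert-assigned codes, the two rates defined above could be computed as in the Python sketch below. The function name, the case identifiers, and the toy data are hypothetical. Only the definitions (proportion of cases with a missing correct code, proportion of cases with an incorrect code) come from the text.

```python
def coding_error_rates(true_codes, assigned_codes):
    """Return (false_negative_rate, false_positive_rate) over all cases.

    Both arguments map a case identifier to a set of major-diagnosis codes.
    """
    cases = list(true_codes)
    # False-negative case: at least one correct code was not assigned.
    fn_cases = sum(1 for c in cases if true_codes[c] - assigned_codes.get(c, set()))
    # False-positive case: at least one assigned code is not a correct code.
    fp_cases = sum(1 for c in cases if assigned_codes.get(c, set()) - true_codes[c])
    return fn_cases / len(cases), fp_cases / len(cases)

# Hypothetical four-case example using codes quoted in this article.
truth = {"S-1": {"M80903"}, "S-2": {"M43000"}, "S-3": {"M72000"}, "S-4": {"M81400"}}
auto = {"S-1": {"M80903"}, "S-2": set(), "S-3": {"M72000", "M14070"}, "S-4": {"M81400"}}
print(coding_error_rates(truth, auto))
# (0.25, 0.25)
```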
For retrieval problems, the most important information
is the false-negative rate for the autocoder. This is the proportion
of cases in which the autocoder fails to assign a correct code needed for
retrieval. If the autocoder has, say, a 10% false-negative rate,
this means that, on average, 10% of cases desired in a particular retrieval
request will not be recovered. The false-positive rate,
namely the proportion of unwanted cases that will be recovered,
can be regarded as a nuisance-factor, which only becomes important
if it is very large. For example, when one performs a MEDLINE literature
search, one typically detects numerous unwanted citations; but these can
easily be bypassed at a glance. The desired citations that are not detected
(false-negatives) are the more vexing aspect of a literature search.
For the present investigation, we assumed initially that the manual coding
for each case contained no false-negatives for major diagnoses. That is,
we assumed that the major sense of the case was always captured manually.
We then reviewed every case in which a major diagnosis from manual coding had
been missed by the enhanced autocoder. The list of `major missed diagnoses'
was obtained as follows: First, we assembled a list of `minor diagnoses',
such as `M09450 NO EVIDENCE OF MALIGNANCY', `M00100 NORMAL TISSUE MORPHOLOGY,
NOS', as well as non-specific inflammation, such as `M41000 INFLAMMATION,
ACUTE, NOS', `M43000 INFLAMMATION, CHRONIC, NOS', etc. A minor diagnosis
in the manual coding was not required to find a match in the autocoder
diagnoses. Second, a list of near-synonyms was assembled, such as
`M81400 ADENOMA' as a near-synonym for `M82110 TUBULAR ADENOMA'.
A major diagnosis in the manual coding was considered matched
if its near-synonym appeared in the autocoder diagnoses.
Finally, a match was only required in the first three digits
of the SNOMED-code (where the first digit is either `M' or `T').
Thus, `M72000 HYPERPLASIA' was considered a match for
`M72400 HYPERPLASIA, GLANDULAR AND STROMAL'.
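The three matching rules above can be combined as in the following Python sketch. The names `is_matched` and `missed_major_diagnoses` are hypothetical, and the tables hold only the examples quoted in the text. The sketch assumes that the near-synonym rule runs from the manual code to the autocoder code, and that the three-character comparison is also applied to near-synonyms.

```python
# Minor diagnoses: never required to find a match (examples from the text).
MINOR_DIAGNOSES = {"M09450", "M00100", "M41000", "M43000"}

# Manual code -> autocoder codes accepted as near-synonyms (example from the text).
NEAR_SYNONYMS = {"M82110": {"M81400"}}

def is_matched(manual_code, autocoder_codes):
    """Apply the minor-diagnosis, near-synonym, and three-character rules."""
    if manual_code in MINOR_DIAGNOSES:
        return True
    accepted = {manual_code} | NEAR_SYNONYMS.get(manual_code, set())
    # Only the first three characters (axis letter plus two digits) must agree.
    prefixes = {code[:3] for code in accepted}
    return any(code[:3] in prefixes for code in autocoder_codes)

def missed_major_diagnoses(manual_codes, autocoder_codes):
    """List the manual codes that the autocoder output fails to match."""
    return [m for m in manual_codes if not is_matched(m, autocoder_codes)]

print(missed_major_diagnoses(["M72400", "M43000", "M82110", "M80903"],
                             ["M72000", "M81400"]))
# ['M80903']
```

In this toy case, `M72400` is matched by prefix, `M43000` is exempt as a minor diagnosis, `M82110` is matched through its near-synonym, and only `M80903` is reported as a missed major diagnosis.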
RESULTS.
A total of 9,353 cases was examined over the 33-month duration of the study.
In the first pass of the enhanced autocoder, 463 (5%) discrepant cases
were detected, in which a major diagnosis in the manual coding had been
missed by the enhanced autocoder. These cases were reviewed by an
experienced human coder, who assigned true-positive codes for each case,
based solely upon the information available in the source-text-stream
available to the autocoder. In many of the initially discrepant cases,
manually-entered codes were based on clinical information not present in the
`specimen source' or `final microscopic diagnosis' sections of the report,
and thus were inaccessible to the autocoder. In some cases, manually-entered
codes were based on misspelled words in the source or diagnosis sections.
Again, these manually-entered codes could not reasonably be detected by the
autocoder, and were removed from the list of true-positive manual codes.
In rare cases, the manually-entered codes were simply wrong. The final set
of true-positive diagnoses assigned to the initially discrepant cases,
was passed through the enhanced autocoder again. In this second pass
through the autocoder, there was a missing, major, true-positive diagnosis
in only 44 (0.5%) cases. This result suggests that a well-maintained
autocoder can determine the major diagnoses in 99.5% of cases with
no data-entry errors, but in our service, an additional 4.5% of cases
had major, missed diagnoses due to data entry errors in the free text fields
scanned by the autocoder.
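The percentages in this paragraph can be checked with a short calculation. The reading that the "additional 4.5%" corresponds to the 463 first-pass discrepancies minus the 44 second-pass misses is an interpretation, not something the text states explicitly.

```python
total_cases = 9353
first_pass_discrepant = 463   # cases with a missed major diagnosis, first pass
second_pass_missed = 44       # cases still missed after review, second pass

def percent(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

print(percent(first_pass_discrepant, total_cases))                        # 5.0
print(percent(second_pass_missed, total_cases))                           # 0.5
print(percent(first_pass_discrepant - second_pass_missed, total_cases))   # 4.5
```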
Table 1
shows a distribution of the 25 most common, distinct morphology codes
obtained by manual coding, ranked in descending frequency of occurrence,
and accounting for 9,156 (68.1%) of all diagnoses made in the period
of study. The most common manual diagnosis was `M43000 INFLAMMATION,
CHRONIC', present in 778 (8.3%) of cases. The 25 most common diagnoses
are characteristic of our patient population, consisting predominantly
of middle-aged men. Table 2 shows
a distribution of the 25 most common, distinct morphology codes
obtained by the enhanced autocoder, ranked in descending frequency
of occurrence, and accounting for 14,498 (61.1%)
of all enhanced autocoder diagnoses in the period of study.
The most common enhanced autocoder diagnosis was `M09450 NO EVIDENCE
OF MALIGNANCY', present in 1,564 (16.7%) of all cases. The other common
diagnoses obtained by the autocoder are similar to manual codes, except that
the autocoder appears to be more complete in assigning minor diagnoses.
Table 3 and Table 4
summarize the behavior of manual coding, the simple autocoder,
and the enhanced autocoder, for morphology and topography codes.
In both cases, it is apparent that the simple autocoder obtains
a poor result compared to the enhanced autocoder, whereas the enhanced
autocoder has behavior quite similar to manual coding. For example,
the simple autocoder obtains almost three times as many morphology codes
per case as the enhanced autocoder, because the simple autocoder assigns
many words in the specimen source or final microscopic diagnosis to nonsense
SNOMED codes. The most common morphology code assigned (erroneously)
by the simple autocoder was `M14070 WOUND, BIOPSY', because the word `biopsy'
appears in many specimen source texts. The simple autocoder does not have
the one-word term, `biopsy', in its dictionary, and thus takes the two-word
term, `WOUND, BIOPSY', which includes the word `biopsy'.
In a surgical pathology service with a stable patient population,
a few diagnoses and a few specimen sites should account for a majority
of the specimens seen. As shown in Table 3, the `median morphology code'
(i.e., the 50-percentile morphology code representing the halfway point
in the morphology code ranking) for manual coding occurs at rank 14.
This means that at least 50% of all manual morphology codes are covered by
the 14 most frequent (i.e., highest-ranking) diagnoses. The `80-percentile
morphology code' for manual coding occurs at rank 42. This means that
at least 80% of all manual morphology codes are covered by the 42 most
frequent diagnoses. Finally, at least 90% of all manual morphology codes
are covered by the 89 most frequent diagnoses. A similar distribution of
percentiles is seen for morphology codes assigned by the enhanced autocoder,
but a much more heterogeneous percentile-ranking is obtained by the simple
autocoder. Analogously, topography coding is fairly narrow for manual coding
and the enhanced autocoder, but more heterogeneous for the simple autocoder
(Table 4).
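The percentile ranks used above (the rank at which the most frequent codes cover a given share of all code entries) can be computed as in this Python sketch. The function name `percentile_rank` and the toy data are hypothetical. The study's own figures come from Tables 3 and 4, which are not reproduced here.

```python
from collections import Counter

def percentile_rank(codes, fraction):
    """Smallest rank r such that the r most frequent codes cover
    at least `fraction` of all code entries."""
    counts = sorted(Counter(codes).values(), reverse=True)
    total = sum(counts)
    covered = 0
    for rank, count in enumerate(counts, start=1):
        covered += count
        if covered / total >= fraction:
            return rank
    return len(counts)

# Toy data: 10 entries over 3 distinct codes (frequencies 6, 3, 1).
toy_codes = ["M43000"] * 6 + ["M09450"] * 3 + ["M72000"]
print(percentile_rank(toy_codes, 0.5))    # 1
print(percentile_rank(toy_codes, 0.8))    # 2
print(percentile_rank(toy_codes, 0.95))   # 3
```

A coder that scatters entries over many spurious codes pushes these ranks upward, which is the heterogeneity reported for the simple autocoder.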
DISCUSSION.
The nomenclature for automatic coding is somewhat vague. The term
`computer-assisted coding' has been used to refer to a variety of distinctly
different activities. Our impression is that the term `computer-assisted
coding' describes a system where the person entering data is prompted by the
computer to enter the name of a topographic site or morphologic entity.
The computer then points to a matching entry, if any, in the SNOMED file.
If there is a match, then the computer reports the code number assigned
to the matching file entry. If there is no match, then the user is prompted
to enter another morphologic diagnosis or topography. Such a system
is currently used in Veterans Affairs Medical Centers. It is our experience
that most pathologists regard this form of coding as `manual' coding,
since the pathologist must manually re-enter the specimen source
and final microscopic diagnoses for every specimen. This system is faster
than searching for diagnoses in the SNOMED books, but is not as fast as
having the computer extract codes from the free text report. Another problem
with coding based on searching a computer dictionary is that there is seldom
a `browse' mode that permits the user to search for an optimal diagnostic
term. After a few input terms are returned unmatched, all but the most
devoted coder will settle for a `generic' diagnosis that broadly includes
the lesion of interest. For example, the pathologist may yield to the
temptation of diagnosing every non-neoplastic skin condition under the term
`inflammation'. In our opinion, `computer-assisted coding' would also include
systems where the user must enter simplified terminology for diagnosis
or topography into specified data fields.
We use the term `automated coding' to describe systems in which the computer
does all of the work of coding, with no user interaction. In these systems,
the pathology report is written with no special regard for the coding process
that will follow. The computer scans the entire report or that portion
of the report designated to contain diagnostic information. Sentences are
`parsed' by context-sensitive grammatical rules into phrases. These phrases
are matched against entries in an electronic dictionary that may or may not
be enhanced from the raw dictionary supplied by the coding system
(e.g., SNOMED,[3] ICD,[15] MeSH,[16] Read,[17] etc.).
In the current study, an automated coder (`autocoder') read and coded
9,353 surgical pathology reports that had previously been coded using the
standard Veterans Affairs computer-assisted SNOMED coding package.
Automatic coding was performed by two different methods: simple coding,
in which the coder simply reads the consecutive words of the report and
searches for match-words in the coding dictionary; and enhanced coding,
in which the coder reads through the report, parses the text into phrases,
and matches phrases against a dictionary that had been enhanced to include
not only SNOMED terms, but related terms pointing to SNOMED terms.
For instance, `vulva' would point to `vulvar', so that either
`vulvar carcinoma' or `carcinoma of the vulva' would match
the same SNOMED code.
Measuring the quality of coding is a difficult task, and doubtless
the complexities have contributed to the lack of scientific literature
available in this area. To a large extent, the quality of coding
is determined by the intended purpose of the coding database. At present
there are four popular coding systems available to pathology departments
for indexing and retrieving reports by diagnostic and topographic content.
Three are in wide use in the USA: SNOMED,[3] ICD-9,[15] and MeSH.[16]
A fourth coding system, the Read system,[17] is used primarily
in Great Britain.
SNOMED provides codes for seven dimensions of report descriptors,
including topography, morphology, etiology, function, disease, procedure,
and occupation. Our experience has been that most pathology departments
typically code under Morphology and Topography and ignore the other
descriptors. In theory, SNOMED is a six-digit hierarchical system,
with the most general terms described by the first two digits
and more specific information carried by the succeeding three digits.
The problem with hierarchical systems is that one person's concept
of topographic or morphologic hierarchy may not fit another person's concept.
A single disease entity such as a decubitus ulcer of the thigh may be coded under
a number of different morphology codes, including `decubitus' (M10540),
`ulcer' (M38000), `inflammation' (M40000), `inflammation, chronic ulcerative'
(M43030), `inflammation, necrotizing' (M40700), or `inflammation,
ulcerative' (M40030). The topography codes for a decubitus ulcer
might include skin (T01000), skin of thigh (T02810), skin of posterior
surface of thigh (T02812). These morphology codes exemplify the partially
non-hierarchical character of SNOMED. If the pathologist codes the case
as decubitus, a search under the term for ulcer or for chronic ulcerative
inflammation would not recover the case. Furthermore, a hierarchical search
under the three-digit leader either for decubitus (M10), ulcer (M38),
or chronic ulcerative inflammation (M43), would fail to recover the case
coded under either of the alternate morphology listings. The same is true
of the topography code, as a code under the leading 3-digit string
for skin (T01) would fail to recover cases listed for the leading string
of skin of thigh (T02). In order to assure recovery of the case,
the pathologist would need to code under all applicable morphology
and topography codes, a prodigious undertaking. An additional drawback
of SNOMED is its strictly proprietary nature. Because SNOMED is a commercial
product owned by the College of American Pathologists, all SNOMED users
must purchase licensed copies of the code dictionary. This makes
it difficult for software developers to market their automatic coders
as a complete package including the SNOMED dictionary, especially
if they wish to expand the dictionary with synonym and misspelling pointers.
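The retrieval problem described above for the decubitus ulcer example can be made concrete with a small Python sketch. The case identifiers and the function name `search_by_prefix` are hypothetical. The codes are those quoted in the text.

```python
# Two reports of the same kind of lesion, coded differently by the pathologist.
CASES = {
    "case-1": {"M10540", "T02810"},   # coded as decubitus, skin of thigh
    "case-2": {"M38000", "T01000"},   # coded as ulcer, skin
}

def search_by_prefix(cases, prefix):
    """Return the cases having at least one code that starts with `prefix`."""
    return sorted(
        case_id for case_id, codes in cases.items()
        if any(code.startswith(prefix) for code in codes)
    )

print(search_by_prefix(CASES, "M10"))   # ['case-1']
print(search_by_prefix(CASES, "M38"))   # ['case-2']
print(search_by_prefix(CASES, "T01"))   # ['case-2']
```

No single three-character search recovers both cases, so complete retrieval depends on either coding every applicable term or searching under every alternative leader.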
Our confusion with these aspects of SNOMED coding is reflected
in the complex strategy that we finally settled upon for comparing
manual coding to results of the enhanced autocoder. First, we assembled
a list of `minor diagnoses', such as `M09450 NO EVIDENCE OF MALIGNANCY',
which were not required to find a match among the autocoder diagnoses.
Second, a list of near-synonyms was assembled, such as `M81400 ADENOMA'
as a near-synonym for `M82110 TUBULAR ADENOMA', for which the manual coding
was considered matched if its near-synonym appeared among the autocoder
diagnoses. Third, a match was only required in the first three digits
of the SNOMED-code, so that, say, `M72000 HYPERPLASIA' was considered
a match for `M72400 HYPERPLASIA, GLANDULAR AND STROMAL'. Finally, we found
it necessary to have a `dictionary policeman', who reviewed all new
encounters with previously unused phrases occurring in our natural language
text file, and pointed these phrases to appropriate SNOMED codes.
By contrast, a `simple autocoder', which employs a direct word-match between
the source text and the SNOMED dictionary, performed quite poorly. As shown
in Table 3, the simple autocoder obtained a heterogeneous distribution
of codes. Many of these code-assignments were nonsense, because the simple
autocoder assigns many words in the specimen site or final microscopic
diagnosis to SNOMED codes which fortuitously happen to contain those words
(e.g., BIOPSY pointed to WOUND, BIOPSY); and the simple autocoder fails
to assign codes for slight word variations (e.g., ASCITES not pointed
to ASCITIC FLUID).
Remarkably, the enhanced automated SNOMED coding strategy resulted in only 0.5% missed major SNOMED codes by the autocoder as compared to the spell-corrected manual codes. These missed major codes were the result of complex syntax in the source text stream, which would require a sophisticated parsing algorithm.
[13]
This result suggests that perfect orthography in the source text and vigilant dictionary maintenance are sufficient to achieve highly accurate coding. Complex parsing algorithms, available in computer translators such as TRANSOFT, could not be expected to increase coding accuracy to an appreciable extent.
The Medical Subject Headings (MeSH) codes of the United States National Library of Medicine have been used as a universal language and code dictionary for medical text.
[16]
Moore et al matched MeSH terms to pathology text words and phrases from narrative text of 4,591 autopsy reports from The Johns Hopkins Hospital.
[14]
This matching permits computerized searches through the autopsy database by MeSH term. The MeSH term code dictionary has three important advantages over SNOMED. First, all MeSH terms are keyed to the National Library of Medicine on-line databases, assuring that coded items from the departmental database will be acceptable Medline search topics. Secondly, MeSH permits single entities to be coded under more than one hierarchy, and compensates for redundancies by adding pointers between redundant codes. For instance, `cystic fibrosis' can be regarded as a neonatal disease (C16.614.213), as a pulmonary disease (C8.381.187), or as a pancreatic disease (C6.689.202). In the MeSH system, redundancies are connected by dictionary pointers, so that a search for `cystic fibrosis' under any of the three codes will point to the alternate codes. Thirdly, the National Library of Medicine permits software developers to use MeSH freely in indexing applications. This means that commercial coding applications may encapsulate the MeSH dictionary in their distributed products. The major disadvantage of MeSH is that its nomenclature lacks the detail and scope of SNOMED.
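The pointer arrangement for cystic fibrosis can be sketched as a pair of lookup tables in Python. The three tree numbers are those quoted above; the table layout and the function name are assumptions made for illustration and do not reflect the National Library of Medicine's own file format.

```python
# Term -> every hierarchy position under which it is filed.
# Tree numbers are the ones quoted in the text; the layout is assumed.
TERM_TO_TREE_NUMBERS = {
    "cystic fibrosis": ["C16.614.213", "C8.381.187", "C6.689.202"],
}

# Reverse index built from the table above: tree number -> term.
TREE_NUMBER_TO_TERM = {
    number: term
    for term, numbers in TERM_TO_TREE_NUMBERS.items()
    for number in numbers
}


def alternate_codes(tree_number):
    """Return the other tree numbers that point to the same term."""
    term = TREE_NUMBER_TO_TERM[tree_number]
    return [n for n in TERM_TO_TREE_NUMBERS[term] if n != tree_number]


# Entering through the pulmonary hierarchy reveals the other two positions.
print(alternate_codes("C8.381.187"))
```

A search that starts from any one of the three positions therefore recovers the other two, which is the behavior the dictionary pointers are meant to provide.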
The International Statistical Classification of Diseases, Injuries,
and Causes of Death, ninth edition (ICD-9) was constructed primarily
to support statistical studies of the diseases occurring in health care
regions.
[15]
More recently, ICD-9 codes have been linked to Diagnosis Related Groups
(DRGs). The relative value of a coding language depends
upon the intended purpose of the coded database.
Pathologists may code with the intention of optimizing their chances
of recovering the case at some later time. A pathologist may choose to code
a single case of vocal cord dysplasia under multiple related morphologic
or topographic terms to ensure the success of some future search
(e.g., cytologic atypia, precancer, dysplasia, carcinoma in situ,
squamous carcinoma, vocal cord, larynx, neck). An epidemiologist trying
to determine the respective incidences of cord dysplasia and cord carcinoma
may be perplexed by the many code listings for a single biopsy specimen.
We find it interesting that no specific strategy for coding has been offered
to pathologists or to vendors of coding software telling us whether we should
be choosing a single `best fit' diagnosis for a lesion or whether
we should assure inclusivity of coding with multiple related terms.
This question will have greater relevance when administrators
and epidemiologists attempt to use collected code databases.
In summary, our findings support the following conclusions:
(1) fully automatic SNOMED coding is a practical alternative
to manual SNOMED coding;
(2) automated SNOMED coding of 9,353 surgical pathology reports
at the Baltimore VA Medical Center was superior to manual coding
in several measurable categories, including the overall number of codes
generated and the number of distinct code entities provided;
(3) departments can improve automated SNOMED coding by writing reports
in a clear and unambiguous style; by enforcing correct orthography;
and by expanding the (electronic) code dictionary with terms (synonyms)
used in the department but not contained in the formal SNOMED nomenclature;
and (4) departments may monitor automated coding as a regular
quality assurance activity leading to improved patient care.
Table 5 summarizes suggested guidelines for a QA monitor that pathology
departments may use to evaluate and improve autocoder performance.
The overall evaluation of coding activities requires a clear understanding
of the purposes of coding. Currently, coding in pathology departments
is done primarily so that reports of a certain lesion or location
can be recovered by the pathologist. In the near future, coding activities
may relate more closely to broader questions of regional, national,
and international importance. Once uses of coded reports become prioritized,
an optimal coding dictionary can be chosen. Additionally,
coding algorithms can be designed to minimize errors
based on the intended uses of the codes.
TABLE 1.
SNOMED MORPHOLOGY CODES FOR
9,353 CONSECUTIVE PATHOLOGY REPORTS:
25 MOST FREQUENT MANUAL CODES.
RANK  NUMBER OF CASES  CUMULATIVE NUMBER  SNOMED CODE  DESCRIPTION
1     778              778                M43000       Inflammation, chronic, NOS
2     776              1554               M41000       Inflammation, acute, NOS
3     742              2296               M72000       Hyperplasia
4     619              2915               M00100       Normal tissue morphology, NOS
5     615              3530               M81403       Adenocarcinoma, NOS
6     571              4101               M09450       No evidence of malignancy
7     492              4593               M80903       Basal cell carcinoma, NOS
8     396              4989               M80703       Squamous cell carcinoma, NOS
9     376              5365               M40000       Inflammation
10    356              5721               M82110       Tubular adenoma, NOS
11    321              6042               M54000       Necrosis
12    318              6360               M72400       Hyperplasia, glandular and stromal
13    285              6645               M51100       Cataract, NOS
14    282              6927               M38000       Ulcer
15    269              7196               M72040       Hyperplasia, polypoid
16    258              7454               M72600       Hyperkeratosis, NOS
17    251              7705               M49000       Fibrosis
18    209              7914               M45020       Granulation tissue, NOS
19    208              8122               M72750       Keratosis, seborrheic
20    192              8314               M72850       Keratosis, actinic
21    187              8501               M33410       Cyst, epithelial inclusion
22    186              8687               M09460       Negative for tumor cells
23    165              8852               M31680       Hernia sac
24    165              9017               M58000       Atrophy
25    139              9156               M30000       Calculus
TABLE 2.
SNOMED MORPHOLOGY CODES FOR
9,353 CONSECUTIVE PATHOLOGY REPORTS:
25 MOST FREQUENT ENHANCED AUTOCODER CODES.
RANK  NUMBER OF CASES  CUMULATIVE NUMBER  SNOMED CODE  DESCRIPTION
1     1564             1564               M09450       No evidence of malignancy
2     1397             2961               M00100       Normal tissue morphology, NOS
3     915              3876               M72400       Hyperplasia, glandular and stromal
4     856              4732               M43000       Inflammation, chronic, NOS
5     817              5549               M09010       Tissue insufficient for diagnosis
6     761              6310               M41000       Inflammation, acute, NOS
7     686              6996               M76800       Polyp
8     601              7597               M81403       Adenocarcinoma, NOS
9     540              8137               M40000       Inflammation
10    527              8664               M38000       Ulcer
11    525              9189               M54000       Necrosis
12    522              9711               M49000       Fibrosis
13    509              10220              M80903       Basal cell carcinoma, NOS
14    460              10680              M01100       Lesion, NOS
15    460              11140              M42100       Acute and chronic inflammation
16    457              11597              M82110       Tubular adenoma, NOS
17    454              12051              M80703       Squamous cell carcinoma, NOS
18    361              12412              M72040       Hyperplasia, polypoid
19    360              12772              M69700       Atypia
20    345              13117              M72600       Hyperkeratosis, NOS
21    331              13448              M45020       Granulation tissue, NOS
22    304              13752              M58000       Atrophy
23    290              14042              M51100       Cataract, NOS
24    232              14274              M72020       Hyperplasia, secondary
25    224              14498              M33410       Cyst, epithelial inclusion
TABLE 3.
SNOMED MORPHOLOGY CODING OF
9,353 CONSECUTIVE PATHOLOGY REPORTS:
MANUAL CODING, SIMPLE AUTOCODER,
AND ENHANCED AUTOCODER.
MEASURE                                              MANUAL CODING  SIMPLE AUTOCODER  ENHANCED AUTOCODER
# of morphology codes                                13,454         66,865            23,744
average # of morphology codes per specimen           1.4            7.1               2.5
# of morphologic entities                            519            1,130             498
# of unique morphologic entities                     209            248               129
# (%) of specimens with most common morphology code  778 (8.3%)     5,689 (60.8%)     1,564 (16.7%)
rank of the 50-percentile morphology code            14             29                17
rank of the 80-percentile morphology code            42             127               58
rank of the 90-percentile morphology code            89             238               102
Rank is determined by the frequency of occurrence of the code.
For example, a rank for the 80-percentile code of 42 means that
the 42 most common codes accounted for at least 80% of all the
code entries.
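The rank statistic defined in this footnote can be computed as follows. This is a minimal Python sketch of the definition as stated, not the program used to produce the tables; the function name and the toy list of code entries are assumptions.

```python
from collections import Counter


def percentile_rank(code_entries, percent):
    """Smallest number of most-frequent codes that together account
    for at least `percent` percent of all code entries."""
    counts = sorted(Counter(code_entries).values(), reverse=True)
    total = sum(counts)
    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        # Integer comparison avoids floating-point rounding at the boundary.
        if running * 100 >= percent * total:
            return rank
    raise ValueError("no rank reaches the requested percentage")


# Toy data (hypothetical): 10 entries spread over three codes.
entries = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
print(percentile_rank(entries, 50))
print(percentile_rank(entries, 80))
print(percentile_rank(entries, 90))
```

The definition can be checked against the published figures: 50% of the 13,454 manual morphology codes is 6,727, and the cumulative column of Table 1 first reaches that value at rank 14 (6,927), in agreement with the 50-percentile rank reported for manual coding.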
TABLE 4.
SNOMED TOPOGRAPHY CODING
OF 9,353 CONSECUTIVE PATHOLOGY REPORTS:
MANUAL CODING, SIMPLE AUTOCODER,
AND ENHANCED AUTOCODER.
MEASURE                                              MANUAL CODING  SIMPLE AUTOCODER  ENHANCED AUTOCODER
# of topography codes                                10,235         16,409            24,328
average # of topography codes per specimen           1.1            1.8               2.6
# of topographic entities                            404            949               602
# of unique topographic entities                     142            284               196
# (%) of specimens with most common topography code  2,023 (21.6%)  941 (10.1%)       2,386 (25.5%)
rank of the 50-percentile topography code            6              35                16
rank of the 80-percentile topography code            38             154               75
rank of the 90-percentile topography code            86             285               136
Rank is determined by the frequency of occurrence of the code.
For example, a rank for the 80-percentile code of 38 means that
the 38 most common codes accounted for at least 80% of all the
code entries.
TABLE 5.
SUGGESTED GUIDELINES FOR QUALITY ASSURANCE
OF A DEPARTMENTAL SNOMED AUTOCODER.
1. Assemble a representative subset of cases with manual and autocoder SNOMED diagnoses.
2. Compare the coding results and compile:
a. a list of diagnostic terms used in your department
that were missed by the autocoder.
b. a list of `minor diagnoses' (negatives, non-specific inflammation),
which the autocoder is not required to detect.
c. a list of synonyms, which should be regarded as equivalent
between manual coding and autocoder.
d. a list of discrepant cases, in which a major site or diagnosis in the initial manual coding has no match (in the first three digits) and no synonym among the autocoder sites and diagnoses.
3. Update the autocoder dictionary, based on any deficiencies
detected in step #2.
4. Suggest changes in report syntax based on findings in step #2.
5. Repeat coding monitor until improvement plateaus.
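Step 2d of these guidelines can be sketched in Python as a comparison of manual and autocoder codes for each case. The sketch rests on several assumptions: the case identifiers and data layout are hypothetical, the near-synonym table is read as a mapping from a manual code to the autocoder codes accepted in its place, and the `first three digits' rule is implemented as agreement in the first three characters of the code (axis letter plus two digits), because the example pair M72000 and M72400 agrees only that far.

```python
# Codes taken from the examples in the text; the tables are deliberately tiny.
MINOR_DIAGNOSES = {"M09450"}            # NO EVIDENCE OF MALIGNANCY
NEAR_SYNONYMS = {"M82110": {"M81400"}}  # TUBULAR ADENOMA ~ ADENOMA


def is_matched(manual_code, auto_codes):
    """Apply the three matching rules to one manual code."""
    if manual_code in MINOR_DIAGNOSES:
        return True  # minor diagnoses need not be found by the autocoder
    if NEAR_SYNONYMS.get(manual_code, set()) & set(auto_codes):
        return True  # a listed near-synonym was assigned
    # Assumed reading of the prefix rule: axis letter plus two digits.
    prefix = manual_code[:3]
    return any(code.startswith(prefix) for code in auto_codes)


def discrepant_cases(cases):
    """Return (case_id, unmatched manual codes) for each discrepant case.

    `cases` is an iterable of (case_id, manual_codes, auto_codes).
    """
    report = []
    for case_id, manual_codes, auto_codes in cases:
        missed = [c for c in manual_codes if not is_matched(c, auto_codes)]
        if missed:
            report.append((case_id, missed))
    return report


# Hypothetical cases: only S3 has a major code the autocoder missed.
sample = [
    ("S1", ["M72000", "M09450"], ["M72400"]),
    ("S2", ["M82110"], ["M81400"]),
    ("S3", ["M81403"], ["M00100"]),
]
print(discrepant_cases(sample))
```

The list returned for a review sample is the input to steps 3 and 4: each discrepant case suggests either a missing dictionary pointer or a report phrasing that the autocoder cannot follow.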
REFERENCES.
1. Erlander D.
Computer data processing of medical diagnoses in pathology.
Am J Clin Pathol. 1975; 63:538-544.
2. Coles EC, Slavin G.
An evaluation of automatic coding
of surgical pathology reports.
J Clin Pathol 1976;29:621-625.
3. College of American Pathologists.
Systematized nomenclature of medicine (SNOMED).
Skokie, IL: College of American Pathologists, 1976.
4. Cote RA, Robboy S.
Progress in Medical Information Management:
systematized nomenclature of medicine (SNOMED).
JAMA. 1980;243:756-762.
5. Hall PA, Lemoine NR.
Comparison of manual data coding errors in two hospitals.
J Clin Pathol. 1986;39:622-626.
6. Earlam R.
Korner, nomenclature and SNOMED.
Brit Med J. 1988;296:903-905.
7. Dodd W.
Korner, nomenclature and SNOMED (letter).
Brit Med J. 1988;296:1198-1199.
8. Sorace JM, Carnahan GW, Moore GW, Berman JJ.
Automated review of blood donor screening test patterns
at a regional blood center.
Am J Clin Pathol 1992; 98:334-344.
9. Furfine CS.
The FDA's policy on the regulation of computerized medical devices.
M.D. Computing 1992;9:97-100.
10. Brannigan VM.
Software quality regulation under the Safe Medical Devices
Act of 1990: Hospitals are now the canaries in the software mine.
Proc Annu Symp Comput Appl Med Care 1991;15:238-242.
11. Davis RG.
FileMan: A User Manual.
Bethesda, MD: National Association of VA Physicians, 1987.
12. Moore GW, Berman JJ.
Object-oriented English-to-Snomed translator
using TRANSOFT+HyperPAD.
Symposium on Computer Applications in Medical Care. 1991;15:973-975.
13. Moore GW, Wakai I, Satomura Y, Giere W.
TRANSOFT: Medical translation expert system.
Artif Intell Med. 1989;1:149-157.
14. Moore GW, Miller RE, Hutchins GM.
Indexing by MeSH titles of natural language pathology phrases
identified on first encounter using the barrier word method.
In: Computerized Natural Medical Language Processing
for Knowledge Engineering. Scherrer JR, Cote RA, Mandil SH (eds.),
Elsevier Science Publishers, North Holland. 1989:29-45.
15. The International Classification of Diseases, 9th Revision: ICD-9CM,
Second Edition.
U.S. Department of Health and Human Services, Public Health
Service, Health Care Financing Administration, U.S. Government Printing
Office, 1980.
16. National Library of Medicine UMLS Knowledge Source.
US Department of Health and Human Services,
National Institutes of Health, National Library of Medicine, 1991.
17. Read JD, Benson TJR.
Comprehensive coding.
Brit J Health Care Computing 1986; 3:22-25.
Last updated: 1/26/2008, by
G. William Moore, MD, PhD.