PERFORMANCE ANALYSIS OF MANUAL AND AUTOMATED
SYSTEMATIZED NOMENCLATURE OF MEDICINE (SNOMED) CODING.
G. William Moore, MD, PhD. [1,2,3]
Jules J. Berman, PhD, MD. [1,2]
7/10/2009.
http://www.netautopsy.org/autocode.htm



From the Pathology and Laboratory Medicine Service, Veterans Affairs Maryland Health Care System, Baltimore, Maryland [1]; Department of Pathology, University of Maryland Medical System, Baltimore, Maryland [2]; and Department of Pathology, The Johns Hopkins Medical Institutions, Baltimore, Maryland [3].

Moore GW, Berman JJ.
Performance analysis of manual and automated systematized nomenclature of medicine (SNOMED) coding.
Am J Clin Pathol. 1994 Mar;101(3):253-256.
PMID: 8135178.
Full Text: http://www.netautopsy.org/autocode.htm

Send comments and correspondence to: George.Moore4@va.gov


Related Publications:
Anatomic Pathology Data Mining: http://www.netautopsy.org/apdmchap.htm
Automated Edge Detection, Pathology Images: http://www.netautopsy.org/ascpedge.htm
Fractal Dimensions in Pathology: http://www.netautopsy.org/ascpfrac.htm
Image Segmentation, Analysis: http://www.netautopsy.org/ascpisap.htm
Automated SNOMED Coding: http://www.netautopsy.org/autocode.htm
Anatomic Pathology Procedure Manual: http://www.netautopsy.org/axsop/axsop.htm
Basal Cell Carcinoma, Histologic Discontinuities: http://www.netautopsy.org/basalcel.htm
DNA Analysis, Cardiac Myxoma: http://www.netautopsy.org/camyxoma.htm
Cell Death, Preneoplasia: http://www.netautopsy.org/celdeath.htm
Clear Cell Dysplasia, Bladder: http://www.netautopsy.org/clearcel.htm
Maintaining Patient Confidentiality: http://www.netautopsy.org/confiden.htm
Elevated PSA, African-American Males (Lancet): http://www.netautopsy.org/epsalanc.htm
Elevated PSA, African-American Males (Mod Pathol): http://www.netautopsy.org/epsamopa.htm
Bibliography, Staged Human Embryos: http://www.netautopsy.org/embrbibl.htm
Image Segmentation, Analysis, Pathology (ISAP): http://www.netautopsy.org/isapwlcm.htm
Johns Hopkins Autopsy Resource: http://www.netautopsy.org
Bibliography, Johns Hopkins Autopsy Resource: http://www.netautopsy.org/jharpubl.htm
Autopsy Report Words, Johns Hopkins Autopsy Resource: http://www.netautopsy.org/jharaurw.htm
Zipf Distribution, Johns Hopkins Autopsy Resource: http://www.netautopsy.org/jharzipf.htm
DNA Flow Cytometry, Keratoacanthoma: http://www.netautopsy.org/keratflw.htm
Dysplasia, Atypical Liver Nodule: http://www.netautopsy.org/lvrdyspl.htm
Cell Simulation, Polyclonal Tumors: http://www.netautopsy.org/monoclon.htm
SNOMED-Encoded Surgical Pathology Databases: http://www.netautopsy.org/snomedsp.htm
Pathology Natural Language Processing: http://www.netautopsy.org/natlngpr.htm
Practice Guidelines, Autopsy Pathology: http://www.netautopsy.org/pracguid.htm
Internet Autopsy Database: http://www.netautopsy.org/protoiad.htm
Internet-based Quality Improvement: http://www.netautopsy.org/qimpmopa.htm
Unfunded Research, Pathologists, Internists, Surgeons: http://www.netautopsy.org/unfunded.htm
Uniqueness, Medical Data Mining: http://www.netautopsy.org/uniqmddm.htm
Linguistic Inventory, Johns Hopkins Surgical Pathology: http://www.netautopsy.org/vhpsapsx.htm
Developmental Neoplasm Lineage: http://www.julesberman.info/devclass.htm
Biomedical Informatics: http://www.jbpub.com/catalog/9780763741358/
Neoplasms, Development, Diversity: http://www.jbpub.com/catalog/9780763755706/
Precancer: http://www.jbpub.com/catalog/9780763777845/

Last tested: July 9, 2009.

0. DISCLAIMER.



DISCLAIMER. United States Government Work, uncopyrighted, public-domain, DRAFT COPY ONLY. This document does not necessarily represent the views or policies of any United States Government agency. This document is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the authors be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of, or in connection with the document or the use or other dealings made with the document.



1. ABSTRACT.


Many pathology departments rely on the accuracy of computer-generated diagnostic coding for surgical specimens. At present, there are no published guidelines for assuring the quality of coding devices. To assess the performance of SNOMED coding software, manual coding was compared with automated coding in 9,353 consecutive surgical pathology reports at the Baltimore VA Medical Center. Manual SNOMED coding produced 13,454 diagnostic entries comprising 519 distinct diagnostic entities; 209 were unique diagnoses (assigned to only one of the 9,353 reports). Automated coding obtained 23,744 diagnostic entries comprising 498 distinct diagnostic entities, of which 129 were unique diagnoses. There were only 44 instances (0.5%) where automated coding missed key diagnoses on surgical case reports. In summary, automated coding compared favorably with manual coding. To achieve the maximum performance from software coding applications, departments should monitor the output from automatic coders. Modifications in reporting style, code dictionaries, and coding algorithms can lead to improved coding performance.
key words: SNOMED, MUMPS, quality assurance, translation, software, pathology, code



2. INTRODUCTION.


Coding pathology reports has become an important activity for laboratories of anatomic pathology. Once regarded solely as a means for research-oriented pathologists to recover interesting cases, diagnostic coding has become a means of linking pathology services with other hospital services rendered on a patient and billed to third parties. Inaccurate diagnostic coding may cause a report to be uncountable, irretrievable, or unreimbursable. Coded reports permit pathologists to complete diagnosis-specific quality assurance activities, and compile statistical data on the types of specimens received in the department. In the future, coded databases, stripped of patient identifiers and collected from many contributing health care services, may assist epidemiologists in tracking the spread of diseases, identifying areas of special risk, and providing reliable quantitative information for developing national health care policies.


The difficulties encountered in coding have received scant attention. [1, 2] The College of American Pathologists, copyright-holder for SNOMED (Systematized Nomenclature of Medicine), does not address the problems of who should code, how much time is needed to code, how often coding errors may occur, and how to cope with coding errors. [3, 4] To our knowledge, only a few reports in the literature address the problem of manual coding inaccuracies. Hall and Lemoine, [5] in one such study, found errors in more than 10% of cases. They divided manual coding errors into five types:
(1) Factually correct but unhelpful codes (e.g., coding all benign lesions as `negative for tumor');

(2) Inconsistent codes (coding `dysplasia' on Monday and `atypia' on Tuesday);

(3) Idiosyncratic codes (using a mnemonic for a lesion, often inscrutable to other people);

(4) Entry errors (e.g., entering `lipoma' when one intends to enter `lymphoma');

(5) Incomplete coding due to impatience or laziness.


Who or what should code? Certainly, coding by a clerk saves the pathologist's time, but does it accomplish the job adequately? Some hospitals employ professional coders trained to list diagnoses in a manner that supports linkage to reimbursable diagnosis related groups (DRGs). Professional coders may generate revenue for the hospital, but they command a high salary, and they may not code lesions in a manner that allows the pathologist to retrieve specimens of academic or clinical interest. In England, the Korner committee recommended that the National Health Service's reliance on coding by lay personnel should be abandoned, and that physicians do their own coding. [6, 7] No one is more familiar with a report than the pathologist who signs it. The question remains, can pathologists be expected to thoroughly code all their reports on a daily basis?

Considering the problems with human coding, the incentives for accurate automated diagnostic coding are obvious, and a variety of software systems that perform automated coding (`autocoders') are commercially available. In science, business, and many areas of medicine, the public has come to accept computer-generated results as reliable, often more reliable than results generated by humans. In the field of medicine, however, small errors in the way that computers handle data can result in catastrophe. [8] This is particularly true in areas that depend heavily on contextual interpretation of language, such as diagnostic coding.

Many pathology departments do not wish to become entangled in the problem of validating the software they purchase. When hundreds of thousands of dollars are spent on a laboratory information system, the departments expect product validation to be completed by the vendor and approved by the Food and Drug Administration (FDA), the government agency responsible for medical devices. Unfortunately, the FDA, under the Safe Medical Devices Act of 1990, has shifted much of its oversight activities from the software vendor to the software buyer (i.e., from premarket approval to postmarket surveillance). [9] Health care facilities using software devices are required to report product defects to the FDA, which can rapidly suspend approval of devices that went to market with minimal agency oversight. [10]

Is it realistic to expect commercial vendors to perform any quality assurance on their automated coders, other than to assure that the autocoders yield coded diagnoses without causing system crashes, and that the diagnoses should be retrievable by code number or by diagnostic terms that match code numbers? The software vendor cannot really test whether the autocoder is operating accurately at any given institution, because reports at that institution may be written in an idiosyncratic manner that makes reliable coding impossible. As an example, some pathologists may wish to abbreviate diagnoses in their report. The autocoder would not necessarily provide a code for CLL, TCC, BCC, etc., unless it has a dictionary of the abbreviations commonly used in that department. Since abbreviations are not included in the SNOMED dictionary, automatic coders would perform poorly in departments that use abbreviated diagnoses. To correct that problem, the abbreviation would have to be added to the electronic dictionary that links diagnostic terms with SNOMED codes. Similar problems might arise in departments where the reports are not scrutinized carefully for spelling errors, or that use grammatically challenging sentence structures. Consider the problems faced by a computer program in coding the following sentence: `Neither metastatic squamous cell carcinoma nor primary infiltrative processes can be ruled out, as well as the seborrheic keratosis, which is present.'

It is in the interest of every department that uses an autocoder to evaluate performance on their own reports, and to devise a program to enhance performance by expanding the diagnostic dictionary, or by changing the standard word, phrase, or sentence format (syntax) of their reports. In addition, departments should have a way of determining whether the changes they make actually improve the autocoder's performance. In the present study, we compare the results of automated coding with the results of coding performed by anatomic pathologists at the Baltimore VA Medical Center. Based on these results, we recommend guidelines for writing reports and enhancing the content of the code dictionaries to improve performance of automatic coding software.


3. MATERIAL AND METHODS.


Materials. All surgical pathology reports accessioned consecutively between October 1, 1989, and June 30, 1992, at the Baltimore VA Medical Center were examined.
      Manual Coding. Manual coding was performed by three board-certified anatomic pathologists at the Baltimore VA Medical Center. These pathologists were acquainted with the SNOMED system, including the categorization of code information into the six fields of topography, morphology, etiology, function, procedure, and disease. A seventh field, `occupation', is not included in the VA SNOMED package. Manual coding was performed with the assistance of an on-line dictionary of SNOMED codes licensed to the Department of Veterans Affairs, and included in the standard VA anatomic pathology information system package, version 4.1. On a daily basis, during the computer session in which the pathologist electronically signs, or `releases' reports for general hospital access, the pathologist enters terms into the various SNOMED fields. Although all six SNOMED fields are accessible to the pathologist, only topography and morphology fields are default selected by the computer system, and the pathologist must request special access to the fields for etiology, function, procedure, and disease, through a cumbersome user interface. Nearly all cases signed out in our department have only topography and morphology codes. When the pathologist enters a term at the prompt, the computer selects a match and displays the match term and its corresponding SNOMED code. The pathologist is given an opportunity to delete the SNOMED code, if desired.

Hardware. The computer used for the present study was an IBM PC/AT-compatible computer (COMTEX, 30368 microprocessor, 25MHz, 330 Mb Priam hard disk), programmed with American National Standard MUMPS (MGlobal, Inc., Houston, TX), and the public-domain File Manager (FileMan) database management system of the United States Department of Veterans Affairs, [11] used routinely in 169 VA medical centers.

Input Data. Reports were obtained as a raw global ASCII file downloaded from the mainframe computer at the Baltimore VA Medical Center, and containing the complete text of all consecutive surgical pathology reports obtained between October 1, 1989, and June 30, 1992. The entire contents of each report, including patient demographics, date and time of accessioning and signout, specimen source, gross description, final microscopic diagnosis, pathologist's identification, and manually-entered SNOMED codes, were passed into the ASCII file, a total of 21,168,261 bytes. The full text of the `specimen source' and `final microscopic diagnosis' for each case served as source text for the SNOMED autocoder. All numbers and punctuation marks were removed from the source-text-stream, as well as all letter-strings shorter than 3 letters, except for: `no', `os' (=`bone' or `left eye'), `od' (=`right eye'), `eg' (=`esophago-gastric'), and `ge' (=`gastro-esophageal').
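
The filtering of the source-text-stream described above can be sketched in a few lines of Python. This is only an illustration, not the MUMPS code used in the study. The function name `clean_source_text`, the conversion to lower case, and the choice to replace numerals and punctuation with spaces (rather than delete them outright) are assumptions made for the example.

```python
import re

# Letter-strings shorter than 3 letters that the text says were retained.
KEPT_SHORT_WORDS = {"no", "os", "od", "eg", "ge"}

def clean_source_text(text):
    """Return the words that survive the filtering described in the text.

    Assumption: numerals and punctuation are replaced by spaces and the
    text is lower-cased; the original implementation may have differed.
    """
    letters_only = re.sub(r"[^A-Za-z]+", " ", text)
    words = letters_only.lower().split()
    return [w for w in words if len(w) >= 3 or w in KEPT_SHORT_WORDS]

print(clean_source_text("Skin, L. arm, 2 cm: no evidence of malignancy."))
# ['skin', 'arm', 'no', 'evidence', 'malignancy']
```

Note that the short word `no' is kept while `of' and `cm' are dropped, as the exception list requires.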

Software. Automated coding of free-text diagnoses into SNOMED codes was performed on TRANSOFT, a table-driven public-domain computer translation shell, written in MUMPS or HyperPAD. [12, 13] The MUMPS-version of TRANSOFT, used in the present investigation, employs the file structure of FileMan. The key elements of TRANSOFT, including algorithms, parsing rules, general applications, and specific application as an automatic SNOMED coder, have been discussed elsewhere. [12, 13]


      Topography and morphology codes (SNOMED dictionary) were downloaded from the VA-licensed subset of SNOMED into an external file serving as TRANSOFT's dictionary. For each SNOMED-code in the VA subset, there is a main term and any number of synonyms. For example, the topography code `TX1000' has `CEREBROSPINAL FLUID' as its main term and `SPINAL FLUID', `CSF', and `FLUID, SPINAL' as synonyms. Two sentence-parsing models were used: simple coding and enhanced coding.


      Simple Coding Model. In the simple coding model, a single word in the source-text-stream finds a SNOMED-match if the word is present among the words of the main term or synonyms for that SNOMED-code in the VA-subset. In case of multiple matches, a single match is selected arbitrarily. For example, `cerebrospinal' in the source-text-stream has exactly one match, namely `TX1000 CEREBROSPINAL FLUID'. Two consecutive words in the source-text-stream find a SNOMED-match if both words are present among the words of the main term or synonyms for a particular SNOMED-code. A two-word match always supersedes a one-word match. Three-word, four-word,... matches are attempted, with a longer word-match always superseding a shorter word-match. Thus in the simple coding model, the consecutive words `cerebrospinal fluid' in the source text stream would obtain a unique match to the topography code, TX1000.
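
A minimal Python sketch of this word-match rule follows. It is not TRANSOFT. The two-entry dictionary, the left-to-right greedy scan, and the tie-break (the alphabetically first code stands in for the `arbitrary' selection) are assumptions made for illustration. The codes and terms themselves are taken from this paper.

```python
import re

# Tiny illustrative dictionary: code -> main term followed by synonyms.
SNOMED_TERMS = {
    "TX1000": ["CEREBROSPINAL FLUID", "SPINAL FLUID", "CSF", "FLUID, SPINAL"],
    "M14070": ["WOUND, BIOPSY"],
}

# For each code, the set of all words found in its main term and synonyms.
CODE_WORDS = {
    code: {w for term in terms for w in re.findall(r"[a-z]+", term.lower())}
    for code, terms in SNOMED_TERMS.items()
}

def simple_code(words):
    """Code a list of lower-case words; longer runs supersede shorter ones."""
    codes, i = [], 0
    while i < len(words):
        matched = 0
        for n in range(len(words) - i, 0, -1):      # try the longest run first
            run = set(words[i:i + n])
            hits = sorted(c for c, ws in CODE_WORDS.items() if run <= ws)
            if hits:
                codes.append(hits[0])               # assumed tie-break
                matched = n
                break
        i += matched if matched else 1              # skip unmatched words
    return codes

print(simple_code(["cerebrospinal", "fluid"]))  # ['TX1000']
print(simple_code(["skin", "biopsy"]))          # ['M14070']
```

The second call shows the weakness of the model: a lone `biopsy' is assigned to `WOUND, BIOPSY' merely because the word occurs in that term.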

Enhanced Coding Model. The disadvantage of the simple coding model is its inability to capture the local language usage for a particular group of pathologists. For example, the topography code for `PERITONEAL FLUID' has only `ASCITIC FLUID' and `FLUID, ASCITIC' as synonyms in the VA-licensed subset of SNOMED, whereas pathologists in our department are as likely to use the term `ASCITES FLUID' in our free text. There is no occurrence of the word `ASCITES' in the VA-subset of SNOMED, so that such a case would fail to be coded by the simple coding model. In the enhanced coding model, we obtained a list of all the diagnostic terms used in our department over the 33-month period of study. This is accomplished by creating a list of all one-word, two-word, three-word,... terms bounded on either side by punctuation marks, numerals, or `barrier words' (i.e., prepositions, conjunctions, articles, etc.).[14] These diagnostic terms are then pointed to one or more appropriate SNOMED codes. For example, the diagnostic term, `basal cell carcinoma' points both to M80903 (=`BASAL CELL CARCINOMA') and to T01000 (=`SKIN'). As will be shown below (RESULTS and DISCUSSION), a more sophisticated parsing model than this phrase-match model does not appear to be warranted.
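
The phrase-match idea can be sketched as follows, again in Python rather than MUMPS. The barrier-word list is a small hypothetical sample, and the sketch treats only each complete barrier-bounded run as a candidate term, which may be narrower than the term list built in the study. The phrase table holds only the `basal cell carcinoma' and `skin' entries named in this paper.

```python
import re

# Hypothetical sample of barrier words (prepositions, conjunctions, articles).
BARRIER_WORDS = {"a", "an", "and", "for", "in", "of", "or", "the", "with"}

# Local diagnostic terms pointed to one or more SNOMED codes.
PHRASE_TO_CODES = {
    "basal cell carcinoma": ["M80903", "T01000"],
    "skin": ["T01000"],
}

def extract_phrases(text):
    """Split text into word runs bounded by punctuation, numerals, or barrier words."""
    phrases, current = [], []
    for token in re.findall(r"[A-Za-z]+|[^A-Za-z\s]+", text.lower()):
        if token.isalpha() and token not in BARRIER_WORDS:
            current.append(token)
        elif current:                       # a boundary closes the current run
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

def enhanced_code(text):
    """Look up each extracted phrase and collect its codes without duplicates."""
    codes = []
    for phrase in extract_phrases(text):
        for code in PHRASE_TO_CODES.get(phrase, []):
            if code not in codes:
                codes.append(code)
    return codes

print(extract_phrases("Basal cell carcinoma of the skin, 2 lesions."))
# ['basal cell carcinoma', 'skin', 'lesions']
print(enhanced_code("Basal cell carcinoma of the skin, 2 lesions."))
# ['M80903', 'T01000']
```

Phrases that find no entry (here `lesions') are the ones a dictionary maintainer would review and point to appropriate codes.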

False-negative and False-positive rates. A `false-negative case' is one to which a correct code for a major diagnosis has not been assigned. A `false-positive case' is one to which an incorrect code for a major diagnosis has been assigned. The `false-negative rate' is the proportion of false-negative cases among all cases. The `false-positive rate' is the proportion of false-positive cases among all cases. In principle, false-negative and false-positive rates may be obtained both for manual coding as well as for the various methods of autocoding. Unfortunately, obtaining these rates requires that each case be examined by a human coding expert, and the correct codes determined for that case. From this set of `true positive' codes, a computer program can determine whether a particular case has been correctly assigned by manual or various automated methods. Most pathology laboratories cannot devote the human resources necessary to determine the exact set of true-positive codes for their caseloads.
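
Given a set of expert-assigned true-positive codes per case, the two rates defined above reduce to simple counting. The sketch below assumes each case is represented as a set of codes keyed by a case identifier. The function name, the data layout, and the four toy cases are hypothetical.

```python
def error_rates(true_codes, assigned_codes):
    """Return (false_negative_rate, false_positive_rate) over all cases.

    true_codes and assigned_codes map a case identifier to a set of codes.
    A case is false-negative if any true code is missing from the assigned
    codes, and false-positive if any assigned code is not a true code.
    """
    total = len(true_codes)
    false_neg = false_pos = 0
    for case_id, truth in true_codes.items():
        assigned = assigned_codes.get(case_id, set())
        if truth - assigned:
            false_neg += 1
        if assigned - truth:
            false_pos += 1
    return false_neg / total, false_pos / total

truth = {1: {"M80903"}, 2: {"M43000"}, 3: {"M72000"}, 4: {"M81400"}}
coded = {1: {"M80903"}, 2: {"M43000", "M14070"}, 3: set(), 4: {"M81400"}}
print(error_rates(truth, coded))  # (0.25, 0.25)
```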

For retrieval problems, the most important information is the false-negative rate for the autocoder. This is the proportion of cases in which the autocoder fails to assign a correct code needed for retrieval. If the autocoder has, say, a 10% false-negative rate, this means that, on average, 10% of cases desired in a particular retrieval request will not be recovered. The false-positive rate, namely the proportion of unwanted cases that will be recovered, can be regarded as a nuisance-factor, which only becomes important if it is very large. For example, when one performs a MEDLINE literature search, one typically detects numerous unwanted citations; but these can easily be bypassed at a glance. The desired citations that are not detected (false-negatives) are the more vexing aspect of a literature search.

For the present investigation, we assumed initially that the manual coding for each case contained no false-negatives for major diagnoses. That is, we assumed that the major sense of the case was always captured manually. We then reviewed every case in which a major diagnosis from manual coding had been missed by the enhanced autocoder. The list of `major missed diagnoses' was obtained as follows: First, we assembled a list of `minor diagnoses', such as `M09450 NO EVIDENCE OF MALIGNANCY', `M00100 NORMAL TISSUE MORPHOLOGY, NOS', as well as non-specific inflammation, such as `M41000 INFLAMMATION, ACUTE, NOS', `M43000 INFLAMMATION, CHRONIC, NOS', etc. A minor diagnosis in the manual coding was not required to find a match in the autocoder diagnoses. Second, a list of near-synonyms was assembled, such as `M81400 ADENOMA' as a near-synonym for `M82110 TUBULAR ADENOMA'. A major diagnosis in the manual coding was considered matched if its near-synonym appeared in the autocoder diagnoses. Finally, a match was only required in the first three digits of the SNOMED-code (where the first digit is either `M' or `T'). Thus, `M72000 HYPERPLASIA' was considered a match for `M72400 HYPERPLASIA, GLANDULAR AND STROMAL'.
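
The three matching rules can be combined into one comparison routine, sketched below in Python. The minor-diagnosis and near-synonym lists contain only the examples quoted in this paper, and treating near-synonym pairs as symmetric is an assumption. The function name `missed_major` is hypothetical.

```python
# Minor diagnoses named in the text; the study's full list was longer.
MINOR_DIAGNOSES = {"M09450", "M00100", "M41000", "M43000"}

# Near-synonym pairs, assumed here to apply in both directions.
NEAR_SYNONYM_PAIRS = [("M81400", "M82110")]

def near_synonyms(code):
    """Return the codes listed as near-synonyms of the given code."""
    out = set()
    for a, b in NEAR_SYNONYM_PAIRS:
        if code == a:
            out.add(b)
        elif code == b:
            out.add(a)
    return out

def missed_major(manual_codes, auto_codes):
    """Return manual codes that count as major diagnoses missed by the autocoder."""
    auto = set(auto_codes)
    auto_leaders = {c[:3] for c in auto}        # e.g. 'M72' from 'M72000'
    missed = []
    for code in manual_codes:
        if code in MINOR_DIAGNOSES:
            continue                            # rule 1: minor, no match needed
        if near_synonyms(code) & auto:
            continue                            # rule 2: near-synonym present
        if code[:3] in auto_leaders:
            continue                            # rule 3: first three characters agree
        missed.append(code)
    return missed

print(missed_major(["M72400", "M09450"], ["M72000"]))  # []
print(missed_major(["M82110"], ["M81400"]))            # []
print(missed_major(["M80903"], ["M43000"]))            # ['M80903']
```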


4. RESULTS.


A total of 9,353 cases was examined over the 33-month duration of the study. In the first pass of the enhanced autocoder, 463 (5%) discrepant cases were detected, in which a major diagnosis in the manual coding had been missed by the enhanced autocoder. These cases were reviewed by an experienced human coder, who assigned true-positive codes for each case, based solely upon the information available in the source-text-stream available to the autocoder. In many of the initially discrepant cases, manually-entered codes were based on clinical information not present in the `specimen source' or `final microscopic diagnosis' sections of the report, and thus were inaccessible to the autocoder. In some cases, manually-entered codes were based on misspelled words in the source or diagnosis sections. Again, these manually-entered codes could not reasonably be detected by the autocoder, and were removed from the list of true-positive manual codes. In rare cases, the manually-entered codes were simply wrong. The final set of true-positive diagnoses assigned to the initially discrepant cases, was passed through the enhanced autocoder again. In this second pass through the autocoder, there was a missing, major, true-positive diagnosis in only 44 (0.5%) cases. This result suggests that a well-maintained autocoder can determine the major diagnoses in 99.5% of cases with no data-entry errors, but in our service, an additional 4.5% of cases had major, missed diagnoses due to data entry errors in the free text fields scanned by the autocoder.
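
The percentages quoted above follow directly from the case counts. The short calculation below reproduces them, on the assumption that the `additional 4.5%' is the difference between the first-pass and second-pass counts (463 - 44 = 419 cases). The variable and function names are illustrative.

```python
TOTAL_CASES = 9353
FIRST_PASS_DISCREPANT = 463   # major manual diagnosis missed on first pass
SECOND_PASS_MISSED = 44       # still missed after review of true-positive codes

def percent(count, total=TOTAL_CASES):
    """Percentage of all cases, rounded to one decimal place."""
    return round(100 * count / total, 1)

first_pass = percent(FIRST_PASS_DISCREPANT)                      # 5.0
second_pass = percent(SECOND_PASS_MISSED)                        # 0.5
resolved = percent(FIRST_PASS_DISCREPANT - SECOND_PASS_MISSED)   # 4.5
print(first_pass, second_pass, resolved)
```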

Table 1 shows a distribution of the 25 most common, distinct morphology codes obtained by manual coding, ranked in descending frequency of occurrence, and accounting for 9,156 (68.1%) of all diagnoses made in the period of study. The most common manual diagnosis was `M43000 INFLAMMATION, CHRONIC', present in 778 (8.3%) of cases. The 25 most common diagnoses are characteristic of our patient population, consisting predominantly of middle-aged men. Table 2 shows a distribution of the 25 most common, distinct morphology codes obtained by the enhanced autocoder, ranked in descending frequency of occurrence, and accounting for 14,498 (61.1%) of all enhanced autocoder diagnoses in the period of study. The most common enhanced autocoder diagnosis was `M09450 NO EVIDENCE OF MALIGNANCY', present in 1,564 (16.7%) of all cases. The other common diagnoses obtained by the autocoder are similar to manual codes, except that the autocoder appears to be more complete in assigning minor diagnoses.

Table 3 and Table 4 summarize the behavior of manual coding, the simple autocoder, and the enhanced autocoder, for morphology and topography codes. In both cases, it is apparent that the simple autocoder obtains a poor result compared to the enhanced autocoder, whereas the enhanced autocoder has behavior quite similar to manual coding. For example, the simple autocoder obtains almost three times as many morphology codes per case as the enhanced autocoder, because the simple autocoder assigns many words in the specimen source or final microscopic diagnosis to nonsense SNOMED codes. The most common morphology code assigned (erroneously) by the simple autocoder was `M14070 WOUND, BIOPSY', because the word `biopsy' appears in many specimen source texts. The simple autocoder does not have the one-word term, `biopsy', in its dictionary, and thus takes the two-word term, `WOUND, BIOPSY', which includes the word `biopsy'.

In a surgical pathology service with a stable patient population, a few diagnoses and a few specimen sites should account for a majority of the specimens seen. As shown in Table 3, the `median morphology code' (i.e., the 50-percentile morphology code representing the halfway point in the morphology code ranking) for manual coding occurs at rank 14. This means that at least 50% of all manual morphology codes are covered by the 14 most frequent (i.e., highest-ranking) diagnoses. The `80-percentile morphology code' for manual coding occurs at rank 42. This means that at least 80% of all manual morphology codes are covered by the 42 most frequent diagnoses. Finally, at least 90% of all manual morphology codes are covered by the 89 most frequent diagnoses. A similar distribution of percentiles is seen for morphology codes assigned by the enhanced autocoder, but a much more heterogeneous percentile-ranking is obtained by the simple autocoder. Analogously, topography coding is fairly narrow for manual coding and the enhanced autocoder, but more heterogeneous for the simple autocoder (Table 4).
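
A department wishing to compute the same percentile ranks for its own codes could do so as sketched below. The function name and the toy list of codes are hypothetical. The percentile is passed as a whole number so that the comparison uses integer arithmetic only.

```python
from collections import Counter
from itertools import accumulate

def percentile_rank(codes, percentile):
    """Smallest rank r such that the r most frequent codes cover at least
    `percentile` percent of all code assignments."""
    counts = sorted(Counter(codes).values(), reverse=True)
    total = sum(counts)
    for rank, running in enumerate(accumulate(counts), start=1):
        if running * 100 >= percentile * total:
            return rank
    return len(counts)

# Toy data: four distinct codes assigned 5, 3, 1, and 1 times.
toy_codes = ["A"] * 5 + ["B"] * 3 + ["C"] + ["D"]
print(percentile_rank(toy_codes, 50))  # 1
print(percentile_rank(toy_codes, 80))  # 2
print(percentile_rank(toy_codes, 90))  # 3
```

A low rank at the 50th, 80th, and 90th percentiles indicates a narrow code distribution; a high rank indicates the heterogeneous pattern described for the simple autocoder.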


5. DISCUSSION.


The nomenclature for automatic coding is somewhat vague. The term `computer-assisted coding' has been used to refer to a variety of distinctly different activities. Our impression is that the term `computer-assisted coding' describes a system where the person entering data is prompted by the computer to enter the name of a topographic site or morphologic entity. The computer then points to a matching entry, if any, in the SNOMED file. If there is a match, then the computer reports the code number assigned to the matching file entry. If there is no match, then the user is prompted to enter another morphologic diagnosis or topography. Such a system is currently used in Veterans Affairs Medical Centers. It is our experience that most pathologists regard this form of coding as `manual' coding, since the pathologist must manually re-enter the specimen source and final microscopic diagnoses for every specimen. This system is faster than searching for diagnoses in the SNOMED books, but is not as fast as having the computer extract codes from the free text report. Another problem with coding based on searching a computer dictionary is that there is seldom a `browse' mode that permits the user to search for an optimal diagnostic term. After a few input terms are returned unmatched, all but the most devoted coder will settle for a `generic' diagnosis that broadly includes the lesion of interest. For example, the pathologist may yield to the temptation of diagnosing every non-neoplastic skin condition under the term `inflammation'. In our opinion, `computer-assisted coding' would also include systems where the user must enter simplified terminology for diagnosis or topography into specified data fields.

We use the term `automated coding' to describe systems in which the computer does all of the work of coding, with no user interaction. In these systems, the pathology report is written with no special regard for the coding process that will follow. The computer scans the entire report or that portion of the report designated to contain diagnostic information. Sentences are `parsed' by context-sensitive grammatical rules into phrases. These phrases are matched against entries in an electronic dictionary that may or may not be enhanced from the raw dictionary supplied by the coding system (e.g., SNOMED, [3] ICD, [15] MeSH, [16] Read, [17] etc.).

In the current study, an automated coder (`autocoder') read and coded 9,353 surgical pathology reports that had previously been coded using the standard Veterans Affairs computer-assisted SNOMED coding package. Automatic coding was performed by two different methods: simple coding, in which the coder simply reads the consecutive words of the report and searches for match-words in the coding dictionary; and enhanced coding, in which the coder reads through the report, parses the text into phrases, and matches phrases against a dictionary that had been enhanced to include not only SNOMED terms, but related terms pointing to SNOMED terms. For instance, `vulva' would point to `vulvar', so that either `vulvar carcinoma' or `carcinoma of the vulva' would match the same SNOMED code.
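
The `vulva' example can be illustrated with a small Python sketch in which related words are redirected through a pointer table before lookup. The order-insensitive key, the barrier-word list, and the names used are assumptions made for the example, not a description of TRANSOFT's dictionary structure.

```python
# Related-term pointers: a word on the left is replaced by the word on the right.
WORD_POINTERS = {"vulva": "vulvar"}

# Hypothetical sample of barrier words ignored when forming the key.
BARRIER_WORDS = {"of", "the"}

def phrase_key(text):
    """Reduce a phrase to an order-insensitive set of pointed-to words."""
    words = text.lower().split()
    return frozenset(WORD_POINTERS.get(w, w) for w in words if w not in BARRIER_WORDS)

print(phrase_key("vulvar carcinoma") == phrase_key("carcinoma of the vulva"))  # True
```

Because both phrasings reduce to the same key, a single dictionary entry for that key would serve both.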

Measuring the quality of coding is a difficult task, and doubtless the complexities have contributed to the lack of scientific literature available in this area. To a large extent, the quality of coding is determined by the intended purpose of the coding database. At present there are four popular coding systems available to pathology departments for indexing and retrieving reports by diagnostic and topographic content. Three are currently in wide use in the USA: SNOMED [3], ICD-9 [15], and MeSH [16]. A fourth coding system, the Read system, [17] is used primarily in Great Britain.

SNOMED provides codes for seven dimensions of report descriptors, including topography, morphology, etiology, function, disease, procedure, and occupation. Our experience has been that most pathology departments typically code under Morphology and Topography and ignore the other descriptors. In theory, SNOMED is a six-digit hierarchical system, with the most general terms described by the first two digits and more specific information carried by the succeeding three digits. The problem with hierarchical systems is that one person's concept of topographic or morphologic hierarchy may not fit another person's concept. A single disease entity such as a decubitus ulcer of the thigh may be coded under a number of different morphology codes, including `decubitus' (M10540), `ulcer' (M38000), `inflammation' (M40000), `inflammation, chronic ulcerative' (M43030), `inflammation, necrotizing' (M40700), or `inflammation, ulcerative' (M40030). The topography codes for a decubitus ulcer might include skin (T01000), skin of thigh (T02810), skin of posterior surface of thigh (T02812). These morphology codes exemplify the partially non-hierarchical character of SNOMED. If the pathologist codes the case as decubitus, a search under the term for ulcer or for chronic ulcerative inflammation would not recover the case. Furthermore, a hierarchical search under the three-digit leader either for decubitus (M10), ulcer (M38), or chronic ulcerative inflammation (M43), would fail to recover the case coded under either of the alternate morphology listings. The same is true of the topography code, as a code under the leading 3-digit string for skin (T01) would fail to recover cases listed for the leading string of skin of thigh (T02). In order to assure recovery of the case, the pathologist would need to code under all applicable morphology and topography codes, a prodigious undertaking. An additional drawback of SNOMED is its strictly proprietary nature. Because SNOMED is a commercial product owned by the College of American Pathologists, all users must purchase licensed copies of the code dictionary. This makes it difficult for software developers to market their automatic coders as a complete package including the SNOMED dictionary, especially if they wish to expand the dictionary with synonym and misspelling pointers.
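
The retrieval failure described for the decubitus ulcer example can be demonstrated with a few lines of Python. The three case identifiers are hypothetical; the codes are those quoted above. Each case describes the same lesion but was coded under a different, equally defensible, morphology and topography code.

```python
# Hypothetical cases, each a decubitus ulcer coded differently.
CASES = {
    "case-1": ["M10540", "T02812"],   # decubitus; skin of posterior surface of thigh
    "case-2": ["M38000", "T01000"],   # ulcer; skin
    "case-3": ["M43030", "T02810"],   # inflammation, chronic ulcerative; skin of thigh
}

def leader_search(cases, leader):
    """Return the cases having any code that begins with the given leader."""
    return [cid for cid, codes in cases.items()
            if any(code.startswith(leader) for code in codes)]

print(leader_search(CASES, "M38"))  # ['case-2']
print(leader_search(CASES, "T01"))  # ['case-2']
print(leader_search(CASES, "T02"))  # ['case-1', 'case-3']
```

No single three-character leader recovers all three cases, which is why complete recovery would require coding each case under every applicable code.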

Our confusion with these aspects of SNOMED coding is reflected in the complex strategy that we finally settled upon for comparing manual coding to results of the enhanced autocoder. First, we assembled a list of `minor diagnoses', such as `M09450 NO EVIDENCE OF MALIGNANCY', which were not required to find a match among the autocoder diagnoses. Second, a list of near-synonyms was assembled, such as `M81400 ADENOMA' as a near-synonym for `M82110 TUBULAR ADENOMA', in which the manual coding was considered matched if its near-synonym appeared among the autocoder diagnoses. Third, a match was only required in the first three digits of the SNOMED-code, so that, say, `M72000 HYPERPLASIA' was considered a match for `M72400 HYPERPLASIA, GLANDULAR AND STROMAL'. Finally, we found it necessary to have a `dictionary policeman', who reviewed all new encounters with previously unused phrases occurring in our natural language text file, and pointed these phrases to appropriate SNOMED codes. By contrast, a `simple autocoder', which employs a direct word-match between the source text and the SNOMED dictionary, performed quite poorly. As shown in Table 3, the simple autocoder obtained a heterogeneous distribution of codes. Many of these code-assignments were nonsense, because the simple autocoder assigns many words in the specimen site or final microscopic diagnosis to SNOMED codes which fortuitously happen to contain those words (e.g., BIOPSY pointed to WOUND, BIOPSY); and the simple autocoder fails to assign codes for slight word variations (e.g., ASCITES not pointed to ASCITIC FLUID).

Remarkably, the enhanced automated SNOMED coding strategy resulted in only 0.5% missed major SNOMED codes by the autocoder as compared to the spell-corrected manual codes. These missed major codes were the result of complex syntax in the source text stream, which would require a sophisticated parsing algorithm. [13] This result suggests that perfect orthography in the source text and vigilant dictionary maintenance are sufficient to achieve highly accurate coding. Complex parsing algorithms, available in computer translators such as TRANSOFT, could not be expected to increase coding accuracy to an appreciable extent.
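
The effect of dictionary maintenance can be illustrated with a small sketch. The structure shown here (a term table plus a table of local pointers) is an assumption made for illustration and is not a description of the enhanced autocoder itself. The three codes are quoted from this paper; the pointer entries, including the misspelling, are hypothetical examples of what a department might add.

```python
# Formal terms and codes quoted elsewhere in this paper.
SNOMED_TERMS = {
    "ULCER": "M38000",
    "TUBULAR ADENOMA": "M82110",
    "HYPERPLASIA": "M72000",
}

# Hypothetical local additions: word variants and misspellings that point
# to a formal term rather than carrying a code of their own.
LOCAL_POINTERS = {
    "ULCERATION": "ULCER",
    "TUBULAR ADENOMAS": "TUBULAR ADENOMA",
    "HYPERPLAISA": "HYPERPLASIA",
}


def lookup(phrase, pointers=None):
    """Return the code for `phrase`, or None if the dictionary has no entry.

    Without `pointers` this is a direct match against the formal terms;
    with `pointers`, a variant is first redirected to its formal term.
    """
    key = " ".join(phrase.upper().split())  # normalize case and spacing
    if pointers:
        key = pointers.get(key, key)
    return SNOMED_TERMS.get(key)


print(lookup("ulceration"))                  # direct match only
print(lookup("ulceration", LOCAL_POINTERS))  # with local pointers
```

The direct match fails on the variant, while the pointer table resolves it to the formal term. Each phrase reviewed on first encounter would add one entry to the pointer table.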

The Medical Subject Headings (MeSH) codes of the United States National Library of Medicine have been used as a universal language and code dictionary for medical text. [16] Moore et al matched MeSH terms to pathology text words and phrases from narrative text of 4,591 autopsy reports from The Johns Hopkins Hospital. [14] This matching permits computerized searches through the autopsy database by MeSH term. The MeSH term code dictionary has three important advantages over SNOMED. First, all MeSH terms are keyed to the National Library of Medicine on-line databases, assuring that coded items from the departmental database will be acceptable Medline search topics. Second, MeSH terms permit single entities to be coded under more than one hierarchy, and compensate for redundancies by adding pointers between redundant codes. For instance, `cystic fibrosis' can be regarded as a neonatal disease (C16.614.213), as a pulmonary disease (C8.381.187), or as a pancreatic disease (C6.689.202). In the MeSH system, redundancies are connected by dictionary pointers, so that a search for `cystic fibrosis' under any of the three codes will point to the other code alternates. Third, the National Library of Medicine permits software developers to use MeSH freely in indexing applications. This means that commercial coding applications may encapsulate the MeSH dictionary in their distributed products. The major disadvantage of MeSH is that its nomenclature lacks the detail and scope of SNOMED.
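
The pointer mechanism described for `cystic fibrosis' can be sketched with two small lookup tables. The table and function names are hypothetical and do not reflect the actual MeSH file format; the three tree numbers are those quoted above.

```python
# Tree numbers for 'cystic fibrosis' as quoted in the text.
MESH_TREE_NUMBERS = {
    "cystic fibrosis": ["C16.614.213", "C8.381.187", "C6.689.202"],
}

# Reverse pointer: tree number -> term.
TERM_BY_TREE_NUMBER = {
    number: term
    for term, numbers in MESH_TREE_NUMBERS.items()
    for number in numbers
}


def alternates(tree_number):
    """Return the other tree numbers filed under the same term."""
    term = TERM_BY_TREE_NUMBER[tree_number]
    return [n for n in MESH_TREE_NUMBERS[term] if n != tree_number]


# A search that starts in the pulmonary hierarchy reaches the other two.
print(alternates("C8.381.187"))
```

Starting from any one of the three numbers, the reverse pointer leads back to the term and from there to the remaining alternates, which is the property that a single-hierarchy leader search lacks.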

The International Statistical Classification of Diseases, Injuries, and Causes of Death, ninth edition (ICD-9) was constructed primarily to support statistical studies of the diseases occurring in health care regions. [15] More recently, ICD-9 codes have been linked to DRG (Diagnosis Related Groups). The relative value of a coding language depends upon the intended purpose of the coded database.

Pathologists may code with the intention of optimizing their chances of recovering the case at some later time. A pathologist may choose to code a single case of vocal cord dysplasia under multiple related morphologic or topographic terms to ensure the success of some future search (e.g., cytologic atypia, precancer, dysplasia, carcinoma in situ, squamous carcinoma, vocal cord, larynx, neck). An epidemiologist trying to determine the respective incidences of cord dysplasia and cord carcinoma may be perplexed by the many code listings for a single biopsy specimen. We find it interesting that no specific strategy for coding has been offered to pathologists or to vendors of coding software telling us whether we should be choosing a single `best fit' diagnosis for a lesion or whether we should assure inclusivity of coding with multiple related terms. This question will have greater relevance when administrators and epidemiologists attempt to use collected code databases.

In summary, our findings support the following conclusions:
           (1) fully automatic SNOMED coding is a practical alternative to manual SNOMED coding;
           (2) automated SNOMED coding of 9,353 surgical pathology reports at the Baltimore VA Medical Center was superior to manual coding in several measurable categories, including the overall number of codes generated and the number of distinct code entities provided;
           (3) departments can improve automated SNOMED coding by writing reports in a clear and unambiguous style; by enforcing correct orthography; and by expanding the (electronic) code dictionary with terms (synonyms) used in the department but not contained in the formal SNOMED nomenclature;
           and (4) departments may monitor automated coding as a regular quality assurance activity leading to improved patient care.
Table 5 summarizes suggested guidelines for a QA monitor pathology departments may use to evaluate and improve autocoder performance.

The overall evaluation of coding activities requires a clear understanding of the purposes of coding. Currently, coding in pathology departments is done primarily so that reports of a certain lesion or location can be recovered by the pathologist. In the near future, coding activities may relate more closely to broader questions of regional, national, and international importance. Once uses of coded reports become prioritized, an optimal coding dictionary can be chosen. Additionally, coding algorithms can be designed to minimize errors based on the intended uses of the codes.


TABLE 1. SNOMED MORPHOLOGY CODES FOR
9,353 CONSECUTIVE PATHOLOGY REPORTS:
25 MOST FREQUENT MANUAL CODES.


           NUMBER   CUMULATIVE  SNOMED   DESCRIPTION
          OF CASES    NUMBER     CODE
   1          778       778     M43000   Inflammation, chronic, NOS
   2          776      1554     M41000   Inflammation, acute, NOS
   3          742      2296     M72000   Hyperplasia
   4          619      2915     M00100   Normal tissue morphology, NOS
   5          615      3530     M81403   Adenocarcinoma, NOS
   6          571      4101     M09450   No evidence of malignancy
   7          492      4593     M80903   Basal cell carcinoma, NOS
   8          396      4989     M80703   Squamous cell carcinoma, NOS
   9          376      5365     M40000   Inflammation
  10          356      5721     M82110   Tubular adenoma, NOS
  11          321      6042     M54000   Necrosis
  12          318      6360     M72400   Hyperplasia, glandular and stromal
  13          285      6645     M51100   Cataract, NOS
  14          282      6927     M38000   Ulcer
  15          269      7196     M72040   Hyperplasia, polypoid
  16          258      7454     M72600   Hyperkeratosis, NOS
  17          251      7705     M49000   Fibrosis
  18          209      7914     M45020   Granulation tissue, NOS
  19          208      8122     M72750   Keratosis, seborrheic
  20          192      8314     M72850   Keratosis, actinic
  21          187      8501     M33410   Cyst, epithelial inclusion
  22          186      8687     M09460   Negative for tumor cells
  23          165      8852     M31680   Hernia sac
  24          165      9017     M58000   Atrophy
  25          139      9156     M30000   Calculus



TABLE 2. SNOMED MORPHOLOGY CODES FOR
9,353 CONSECUTIVE PATHOLOGY REPORTS:
25 MOST FREQUENT ENHANCED AUTOCODER CODES.


           NUMBER   CUMULATIVE  SNOMED   DESCRIPTION
          OF CASES    NUMBER     CODE
   1         1564      1564     M09450   No evidence of malignancy
   2         1397      2961     M00100   Normal tissue morphology, NOS
   3          915      3876     M72400   Hyperplasia, glandular and stromal
   4          856      4732     M43000   Inflammation, chronic, NOS
   5          817      5549     M09010   Tissue insufficient for diagnosis
   6          761      6310     M41000   Inflammation, acute, NOS
   7          686      6996     M76800   Polyp
   8          601      7597     M81403   Adenocarcinoma, NOS
   9          540      8137     M40000   Inflammation
  10          527      8664     M38000   Ulcer
  11          525      9189     M54000   Necrosis
  12          522      9711     M49000   Fibrosis
  13          509     10220     M80903   Basal cell carcinoma, NOS
  14          460     10680     M01100   Lesion, NOS
  15          460     11140     M42100   Acute and chronic inflammation
  16          457     11597     M82110   Tubular adenoma, NOS
  17          454     12051     M80703   Squamous cell carcinoma, NOS
  18          361     12412     M72040   Hyperplasia, polypoid
  19          360     12772     M69700   Atypia
  20          345     13117     M72600   Hyperkeratosis, NOS
  21          331     13448     M45020   Granulation tissue, NOS
  22          304     13752     M58000   Atrophy
  23          290     14042     M51100   Cataract, NOS
  24          232     14274     M72020   Hyperplasia, secondary
  25          224     14498     M33410   Cyst, epithelial inclusion



TABLE 3. SNOMED MORPHOLOGY CODES FOR
9,353 CONSECUTIVE PATHOLOGY REPORTS:
MANUAL CODING, SIMPLE AUTOCODER, AND ENHANCED AUTOCODER.


                         MANUAL          SIMPLE            ENHANCED
                         CODING          AUTOCODER         AUTOCODER
                               
# of morphology          13,454           66,865             23,744
codes
                                 
average # of              1.4               7.1                2.5
morphology codes
per specimen
                           
# of morphologic          519              1,130               498
entities
                               
# of unique
morphologic entities      209                248               129
                        
# (%) of specimens
with most common       778 (8.3%)       5,689 (60.8%)       1,564 (16.7%)
morphology code
                                 
rank of the 50-percentile
morphology code            14                29                  17
                          
rank of the 80-percentile
morphology code            42               127                  58
                                       
rank of the 90-percentile
morphology code            89               238                 102
Rank is determined by the frequency of occurrence of the code. For example, a rank for the 80-percentile code of 42 means that the 42 most common codes accounted for at least 80% of all the code entries.
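
The percentile ranks can be checked against the frequency tables with a short calculation. The sketch below is a hedged illustration: the function name is hypothetical, and it assumes that the NUMBER OF CASES column in Tables 1 and 2 counts code entries, so that the running totals can be compared with the total number of morphology codes in Table 3.

```python
from itertools import accumulate


def percentile_rank(counts, total, fraction):
    """Smallest rank whose cumulative count reaches `fraction` of `total`.

    `counts` must be ordered from most to least frequent. Returns None if
    the listed counts never reach the threshold.
    """
    threshold = fraction * total
    for rank, running in enumerate(accumulate(counts), start=1):
        if running >= threshold:
            return rank
    return None


# Counts for the 14 most frequent manual codes (Table 1).
manual_top = [778, 776, 742, 619, 615, 571, 492, 396, 376, 356,
              321, 318, 285, 282]

# Counts for the 17 most frequent enhanced autocoder codes (Table 2).
enhanced_top = [1564, 1397, 915, 856, 817, 761, 686, 601, 540, 527,
                525, 522, 509, 460, 460, 457, 454]

# Totals are the "# of morphology codes" entries of Table 3.
print(percentile_rank(manual_top, 13454, 0.50))    # manual coding
print(percentile_rank(enhanced_top, 23744, 0.50))  # enhanced autocoder
```

Under that assumption the calculation gives ranks 14 and 17, in agreement with the 50-percentile row of Table 3.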



TABLE 4. SNOMED TOPOGRAPHY CODES FOR
9,353 CONSECUTIVE PATHOLOGY REPORTS:
MANUAL CODING, SIMPLE AUTOCODER, AND ENHANCED AUTOCODER.


                         MANUAL          SIMPLE            ENHANCED
                         CODING          AUTOCODER         AUTOCODER
                        
# of topography          10,235           16,409             24,328
codes
                      
average # of              1.1              1.8                 2.6
topography codes
per specimen
                      
# of topographic          404              949                 602
entities
                    
# of unique
topographic entities      142              284                 196

# (%) of specimens
with most common     2,023 (21.6%)       941 (10.1%)       2,386 (25.5%)
topography code

rank of the 50-percentile
topography code              6               35                  16

rank of the 80-percentile
topography code             38              154                  75

rank of the 90-percentile
topography code             86              285                 136


Rank is determined by the frequency of occurrence of the code. For example, a rank for the 80-percentile code of 38 means that the 38 most common codes accounted for at least 80% of all the code entries.
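
The derived rows of Table 4 follow from the raw counts and the 9,353 specimens. A minimal sketch of the arithmetic, with a hypothetical helper name, is:

```python
SPECIMENS = 9353

# (total topography codes, specimens with the most common code), Table 4.
topography = {
    "manual": (10235, 2023),
    "simple": (16409, 941),
    "enhanced": (24328, 2386),
}


def summarize(total_codes, most_common_cases, specimens=SPECIMENS):
    """Return (average codes per specimen, % of specimens with top code)."""
    average = round(total_codes / specimens, 1)
    percent = round(100 * most_common_cases / specimens, 1)
    return average, percent


for method, (total, top) in topography.items():
    print(method, summarize(total, top))
```

The rounded results reproduce the averages (1.1, 1.8, 2.6) and percentages (21.6%, 10.1%, 25.5%) printed in the table.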



TABLE 5. SUGGESTED GUIDELINES FOR QUALITY ASSURANCE
OF A DEPARTMENTAL SNOMED AUTOCODER.

1. Assemble a representative subset of cases with manual and autocoder SNOMED diagnoses.

2. Compare the coding results and compile:
a. a list of diagnostic terms used in your department that were missed by the autocoder.

b. a list of `minor diagnoses' (negatives, non-specific inflammation), which the autocoder is not required to detect.

c. a list of synonyms, which should be regarded as equivalent between manual coding and autocoder.

d. a list of discrepant cases, in which a major site or diagnosis in the initial manual coding has no match (in the first three digits) and no synonym among the autocoder sites and diagnoses.


3. Update the autocoder dictionary, based on any deficiencies detected in step #2.

4. Suggest changes in report syntax based on findings in step #2.

5. Repeat coding monitor until improvement plateaus.
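
Step 2 of the guidelines above can be sketched as a small tallying routine. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function names and case identifiers are hypothetical, the default matching rule is agreement in the first three characters only, and a department would pass in its own rule covering minor diagnoses and synonyms.

```python
from collections import Counter


def leader_match(manual_code, autocoder_codes):
    """Default rule: agreement in the first three characters (e.g. 'M72')."""
    return any(code[:3] == manual_code[:3] for code in autocoder_codes)


def qa_monitor(sample, is_match=leader_match):
    """Compile step 2 lists from (case_id, manual_codes, autocoder_codes)."""
    missed = Counter()  # manual codes with no autocoder match (list a)
    discrepant = []     # cases with at least one unmatched code (list d)
    for case_id, manual_codes, autocoder_codes in sample:
        unmatched = [c for c in manual_codes
                     if not is_match(c, autocoder_codes)]
        if unmatched:
            discrepant.append(case_id)
            missed.update(unmatched)
    return discrepant, missed


# Hypothetical sample of three cases, using codes that appear in the tables.
sample = [
    ("S-001", ["M72000"], ["M72400", "M09450"]),
    ("S-002", ["M38000", "M43000"], ["M43000"]),
    ("S-003", ["M80903"], ["M80903"]),
]

discrepant, missed = qa_monitor(sample)
print(discrepant)
print(missed.most_common())
```

In this sample only case S-002 is discrepant, with M38000 as the missed code. The tally of missed codes indicates which dictionary entries to add in step 3, and rerunning the monitor after each update shows whether improvement has reached a plateau.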


11. REFERENCES.



1. Erlander D.
Computer data processing of medical diagnoses in pathology.
Am J Clin Pathol. 1975;63:538-544.

2. Coles EC, Slavin G.
An evaluation of automatic coding of surgical pathology reports.
J Clin Pathol 1976;29:621-625.

3. College of American Pathologists.
Systematized nomenclature of medicine (SNOMED).
Skokie, IL: College of American Pathologists, 1976.

4. Cote RA, Robboy S.
Progress in Medical Information Management: systematized nomenclature of medicine (SNOMED).
JAMA. 1980;243:756-762.

5. Hall PA, Lemoine NR.
Comparison of manual data coding errors in two hospitals.
J Clin Pathol. 1986;39:622-626.

6. Earlam R.
Korner, nomenclature and SNOMED.
Brit Med J. 1988;296:903-905.

7. Dodd W.
Korner, nomenclature and SNOMED (letter).
Brit Med J. 1988;296:1198-1199.

8. Sorace JM, Carnahan GW, Moore GW, Berman JJ.
Automated review of blood donor screening test patterns at a regional blood center.
Am J Clin Pathol. 1992;98:334-344.

9. Furfine CS.
The FDA's policy on the regulation of computerized medical devices.
M.D. Computing 1992;9:97-100.

10. Brannigan VM.
Software quality regulation under the Safe Medical Devices Act of 1990: Hospitals are now the canaries in the software mine.
Proc Annu Symp Comput Appl Med Care. 1991;15:238-242.

11. Davis RG.
FileMan: A User Manual.
Bethesda, MD: National Association of VA Physicians, 1987.

12. Moore GW, Berman JJ.
Object-oriented English-to-Snomed translator using TRANSOFT+HyperPAD.
Symposium on Computer Applications in Medical Care. 1991;15:973-975.

13. Moore GW, Wakai I, Satomura Y, Giere W.
TRANSOFT: Medical translation expert system.
Artif Intell Med. 1989;1:149-157.

14. Moore GW, Miller RE, Hutchins GM.
Indexing by MeSH titles of natural language pathology phrases identified on first encounter using the barrier word method.
In: Computerized Natural Medical Language Processing for Knowledge Engineering. Scherrer JR, Cote RA, Mandil SH (eds.), Elsevier Science Publishers, North Holland. 1989:29-45.

15. The International Classification of Diseases, 9th Revision: ICD-9CM, Second Edition.
U.S. Department of Health and Human Services, Public Health Service, Health Care Financing Administration, U.S. Government Printing Office, 1980.

16. National Library of Medicine UMLS Knowledge Source.
US Department of Health and Human Services,
National Institutes of Health, National Library of Medicine, 1991.

17. Read JD, Benson TJR.
Comprehensive coding.
Brit J Health Care Computing. 1986;3:22-25.

12. ADDITIONAL READINGS.

1. Coxeter HSM, Greitzer SL.
Geometry Revisited.
New Mathematical Library.
Washington, DC: Math Assn America. 1967;:.
ISBN: 0883856190, 207 pages.

2. Honsberger R.
Episodes in Nineteenth and Twentieth Century Euclidean Geometry.
New Mathematical Library. Washington DC: Math Assn America. 1996. Second printing. 2005 ;:.
ISBN: 0883856395, 174 pages.

3. Coxeter HSM.
Introduction to Geometry.
New York: John Wiley & Sons, Inc. 1961;:.
Library Congress Catalogue # 72-93903.
SBN: 471-18283.


4. Moore GW, Berman JJ.
Cell growth simulations predicting polyclonal origins for 'monoclonal' tumors.
Cancer Lett. 1991 Nov;60(2):113-119.
PMID: 1933835.
PubMed Entry
Full Text: http://www.netautopsy.org/monoclon.htm
Public-domain open-source code: http://www.netautopsy.org/monoclon.htm#table1
Last tested: July 10, 2009.

5. Berman JJ, Moore GW.
Spontaneous regression of residual tumour burden: prediction by Monte Carlo simulation.
Anal Cell Pathol. 1992 Sep;4(5):359-368.
PMID: 1445794.
PubMed Entry
Full Text: http://www.netautopsy.org/sponregr.htm
Last tested: July 10, 2009.

6. Berman JJ, Moore GW.
The role of cell death in the growth of preneoplastic lesions: a Monte Carlo simulation model.
Cell Prolif. 1992 Nov;25(6):549-557.
PMID: 1457604.
PubMed Entry
Full Text: http://www.netautopsy.org/celdeath.htm
Last tested: July 10, 2009.

8. Moore GW, Berman JJ.
Anatomic Pathology Data Mining.
In: Cios KJ, ed. Medical Data Mining and Knowledge Discovery.
2001. XVIII, 502 pp. 98 figs., 98 tabs. Hardcover.
ISBN: 3-7908-1340-0.
Copyright Springer-Verlag: Berlin/Heidelberg 1999.
Full Text: http://www.netautopsy.org/apdmchap.htm
Last tested: July 10, 2009.

9. Berman JJ.
Tumor classification: molecular analysis meets Aristotle.
BMC Cancer. 2004 Mar 17;4:10.
PMID: 15113444
PubMed Entry
Aristotle (384-322 BCE), Greek philosopher.
This article is among the all-time most-viewed articles in BMC Cancer, and, as of September 2008, has been downloaded about 15,000 times from BiomedCentral.
Last tested: July 10, 2009.

10. Berman JJ.
Tumor taxonomy for the developmental lineage classification of neoplasms.
BMC Cancer. 2004 Nov 30;4(1):88.
PMID: 15571625.
PubMed Entry
Last tested: July 10, 2009.

11. Berman JJ.
Modern classification of neoplasms: reconciling differences between morphologic and molecular approaches.
BMC Cancer 2005, 5:100.
PMID: 16092965
PubMed Entry
Last tested: July 10, 2009.

12. Berman JJ.
Developmental Lineage Classification and Taxonomy of Neoplasms.
http://www.julesberman.info/devclass.htm
Last tested: July 10, 2009.

13. Berman JJ.
Doublet method for very fast autocoding.
BMC Med Inform Decis Mak. 2004 Sep 15;4:16.
PMID: 15369595
PubMed Entry
Last tested: July 10, 2009.

14. Berman JJ.
Resource page.
http://www.julesberman.info/resource.htm
Last tested: July 10, 2009.

15. Berman JJ, Moore GW.
Implementing an RDF schema for pathology images.
http://www.julesberman.info/spec2img.htm
Last tested: July 10, 2009.

16. Berman JJ.
Chronology of Earth.
http://www.julesberman.info/chronos.htm
Last tested: July 10, 2009.

17. Berman JJ.
Biomedical Informatics.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers; 1 edition (October 18, 2006)
ISBN-10: 0763741353, 459 pages.
ISBN-13: 978-0763741358, 459 pages.
http://www.jbpub.com/catalog/9780763741358/
http://www.julesberman.info/
Last tested: July 10, 2009.

18. Berman JJ.
Perl Pogramming for Medicine and Biology.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers; 1 edition (April 6, 2007)
ISBN-10: 076374333X, 407 pages.
ISBN-13: 978-0763743338, 407 pages.
http://www.jbpub.com/catalog/9780763743338/
http://www.julesberman.info/
Last tested: July 10, 2009.

19. Berman JJ.
Perl: The Programming Language.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers. 2009;:.
ISBN: 9780763757588, 52 pages.
http://www.jbpub.com/catalog/9780763757588/
http://www.julesberman.info/
Last tested: July 10, 2009.

20. Berman JJ.
Ruby Programming for Medicine and Biology.
Boston, Toronto, London, Singapore: Jones & Bartlett Pub; 1 edition (September 13, 2007)
ISBN-10: 0763750905, 378 pages.
ISBN-13: 978-0763750909, 378 pages.
http://www.jbpub.com/catalog/9780763750909/
http://www.julesberman.info/
Last tested: July 10, 2009.

21. Berman JJ.
Ruby: The Programming Language.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers. 2009;:.
ISBN: 9780763757571, 46 pages.
http://www.jbpub.com/catalog/9780763757571/
Last tested: July 10, 2009.

22. Berman JJ.
Neoplasms: Principles of Development and Diversity.
Boston, Toronto, London, Singapore: Jones & Bartlett Publishers. 2008 Oct 1.
ISBN: 9780763755706, 464 pages.
http://www.jbpub.com/catalog/9780763755706/
Last tested: July 10, 2009.

23. Berman JJ, with Moore GW.
Precancer: The Beginning and the End of Cancer.
Boston, Toronto, London, Singapore: Jones and Bartlett. 2009 Aug 11;:.
ISBN 9780763777845, 200 pages.
http://www.jbpub.com/catalog/9780763777845/
Last tested: July 10, 2009.

24. Berman JJ.
Web site: http://www.julesberman.info/
Last tested: July 10, 2009.

25. Berman JJ.
Blog site: http://julesberman.blogspot.com/
Last tested: July 10, 2009.

26. Hanahan D, Weinberg RA.
The hallmarks of cancer.
Cell 2000;100:57-70.

27. Kansal AR, Torquato S, Harsh GR IV, Chiocca EA, Deisboeck TS.
Simulated brain tumor growth dynamics using a three-dimensional cellular automaton.
J Theor Biol. 2000 Apr 21;203(4):367-382.
PMID: 10736214.
PubMed Entry
Last tested: July 10, 2009.

28. Kansal AR, Torquato S, Chiocca EA, Deisboeck TS.
Emergence of a subpopulation in a computational model of tumor growth.
J Theor Biol. 2000 Dec 7;207(3):431-441.
PMID: 11082311
PubMed Entry
Last tested: July 10, 2009.

29. Kansal AR, Torquato S.
Globally and locally minimal weight spanning tree networks.
Physica A. 2001;301:601-619.

30. Kansal AR, Trimmer J.
Application of predictive biosimulation within pharmaceutical clinical development: examples of significance for translational medicine and clinical trial design.
Syst Biol (Stevenage). 2005 Dec;152(4):214-220.
PMID: 16986263
PubMed Entry
Last tested: July 10, 2009.

31. Kansal AR.
Modeling approaches to type 2 diabetes.
Diabetes Technol Ther. 2004 Feb;6(1):39-47. Review.
PMID: 15000768.
PubMed Entry
Last tested: July 10, 2009.

32. Kansal AR, Torquato S, Stillinger FH.
Diversity of order and densities in jammed hard-particle packings.
Phys Rev E Stat Nonlin Soft Matter Phys. 2002 Oct;66(4 Pt 1):041109. Epub 2002 Oct 24.
PMID: 12443179.
PubMed Entry
Last tested: July 10, 2009.

33. Deisboeck TS, Berens ME, Kansal AR, Torquato S, Stemmer-Rachamimov AO, Chiocca EA.
Pattern of self-organization in tumour systems: complex growth dynamics in a novel brain tumour spheroid model.
Cell Prolif. 2001 Apr;34(2):115-134.
PMID: 11348426
PubMed Entry
Last tested: July 10, 2009.

34. Kansal AR, Torquato S, Harsh IV GR, Chiocca EA, Deisboeck TS.
Cellular automaton of idealized brain tumor growth dynamics.
Biosystems. 2000 Feb;55(1-3):119-127.
PMID: 10745115
PubMed Entry
Last tested: July 10, 2009.

35. Schmitz JE, et al.
A cellular automaton model of brain tumor treatment and resistance.
J Theor Medicine. 2002(4):223-239.

36. Holash J, et al.
Vessel cooption, regression, and growth in tumors mediated by angiopoietins and VEGF.
Science 1999;284: 1994-1998.

37. Helmlinger G, et al.
Solid stress inhibits the growth of multicellular tumor spheroids.
Nature Biotech. 1997;15:778-783.

38. Kitano H.
Cancer as a robust system: implications for anticancer therapy.
Nat Rev Cancer. 2004 Mar;4(3):227-235. Review.
PMID: 14993904.
PubMed Entry
Last tested: July 10, 2009.

39. Kitano H, Oda K, Kimura T, Matsuoka Y, Csete M, Doyle J, Muramatsu M.
Metabolic syndrome and robustness tradeoffs.
Diabetes. 2004 Dec;53 Suppl 3:S6-S15. Review.
PMID: 15561923.
PubMed Entry
Last tested: July 10, 2009.

40. Kitano H.
Biological robustness.
Nat Rev Genet. 2004 Nov;5(11):826-37. Review.
PMID: 15520792.
PubMed Entry
Last tested: July 10, 2009.

41. Kyoda K, Baba K, Onami S, Kitano H.
DBRF-MEGN method: an algorithm for deducing minimum equivalent gene networks from large-scale gene expression profiles of gene deletion mutants.
Bioinformatics. 2004 Nov 1;20(16):2662-75. Epub 2004 May 27.
PMID: 15166016.
PubMed Entry
Last tested: July 10, 2009.

42. Kitano H.
Cancer robustness: tumour tactics.
Nature. 2003 Nov 13;426(6963):125.
PMID: 14614483.
PubMed Entry
Last tested: July 10, 2009.

43. Gevertz JL, Gillies GT, Torquato S.
Simulating tumor growth in confined heterogeneous environments.
Phys Biol. 2008 Sep 29;5(3):36010.
PMID: 18824788.
PubMed Entry
Last tested: July 10, 2009.

44. Gevertz JL, Torquato S.
A novel three-phase model of brain tissue microstructure.
PLoS Comput Biol. 2008 Aug 15;4(8):e1000152.
PMID: 18704170.
PubMed Entry
Last tested: July 10, 2009.

45. Gevertz JL, Torquato S.
Modeling the effects of vasculature evolution on early brain tumor growth.
J Theor Biol. 2006 Dec 21;243(4):517-531. Epub 2006 Jul 15.
PMID: 16938311.
PubMed Entry
Last tested: July 10, 2009.

46. Conway JH, Torquato S.
Packing, tiling, and covering with tetrahedra.
Proc Natl Acad Sci U S A. 2006 Jul 11;103(28):10612-10617. Epub 2006 Jul 3.
PMID: 16818891.
PubMed Entry
Last tested: July 10, 2009.

47. Deisboeck TS, Berens ME, Kansal AR, Torquato S, Stemmer-Rachamimov AO, Chiocca EA.
Pattern of self-organization in tumour systems: complex growth dynamics in a novel brain tumour spheroid model.
Cell Prolif. 2001 Apr;34(2):115-134.
PMID: 11348426.
PubMed Entry
Last tested: July 10, 2009.

48. Wilimas JA, Dow LW, Douglass EC, Jenkins JJ 3rd, Jacobson RJ, Moohr J, Fialkow PJ.
Evidence for clonal development of Wilms' tumor.
Am J Pediatr Hematol Oncol. 1991 Spring;13(1):26-28.
PMID: 1851399.
PubMed Entry
Last tested: July 10, 2009.

49. Reddy AL, Fialkow PJ.
Evidence that weak promotion of carcinogen-initiated cells prevents their progression to malignancy.
Carcinogenesis. 1990 Dec;11(12):2123-2126.
PMID: 2124950.
PubMed Entry
Last tested: July 10, 2009.

50. Fialkow PJ.
Stem cell origin of human myeloid blood cell neoplasms.
Verh Dtsch Ges Pathol. 1990;74:43-47. Review.
PMID: 1708632.
PubMed Entry
Last tested: July 10, 2009.

51. Fialkow PJ, Singer JW, Raskind WH, Adamson JW, Jacobson RJ, Bernstein ID, Dow LW, Najfeld V, Veith R.
Clonal development, stem-cell differentiation, and clinical remissions in acute nonlymphocytic leukemia.
N Engl J Med. 1987 Aug 20;317(8):468-473.
PMID: 3614291.
PubMed Entry
Last tested: July 10, 2009.

52. Jacobson RJ, Temple MJ, Singer JW, Raskind W, Powell J, Fialkow PJ.
A clonal complete remission in a patient with acute nonlymphocytic leukemia originating in a multipotent stem cell.
N Engl J Med. 1984 Jun 7;310(23):1513-1517.
PMID: 6717542.
PubMed Entry
Last tested: July 10, 2009.

53. Moulton-Levy P, Jackson CE, Levy HG, Fialkow PJ.
Multiple cell origin of traumatically induced keloids.
J Am Acad Dermatol. 1984 Jun;10(6):986-8.
PMID: 6736343.
PubMed Entry
Last tested: July 10, 2009.

54. Reddy AL, Fialkow PJ.
Papillomas induced by initiation-promotion differ from those induced by carcinogen alone.
Nature. 1983 Jul 7-13;304(5921):69-71.
PMID: 6408484.
PubMed Entry
Last tested: July 10, 2009.

55. Fialkow PJ, Singer JW, Adamson JW, Berkow RL, Friedman JM, Jacobson RJ, Moohr JW.
Acute nonlymphocytic leukemia: expression in cells restricted to granulocytic and monocytic differentiation.
N Engl J Med. 1979 Jul 5;301(1):1-5.
PMID: 286882.
PubMed Entry
Last tested: July 10, 2009.

56. Fialkow PJ.
Clonal origin of human tumors.
Annu Rev Med. 1979;30:135-143. Review.
PMID: 400484.
PubMed Entry
Last tested: July 10, 2009.

57. Fialkow PJ, Najfeld V, Reddy AL, Singer J, Steinmann L.
Chronic lymphocytic leukaemia: Clonal origin in a committed B-lymphocyte progenitor.
Lancet. 1978 Aug 26;2(8087):444-6.
PMID: 79806.
PubMed Entry
Last tested: July 10, 2009.

58. Adamson JW, Fialkow PJ.
The pathogenesis of myeloproliferative syndromes.
Br J Haematol. 1978 Mar;38(3):299-303. Review.
PMID: 346048.
PubMed Entry
Last tested: July 10, 2009.

59. Fialkow PJ, Jackson CE, Block MA, Greenawald KA.
Multicellular origin of parathyroid "adenomas".
N Engl J Med. 1977 Sep 29;297(13):696-698.
PMID: 895789.
PubMed Entry
Last tested: July 10, 2009.

60. Fialkow PJ, Jacobson RJ, Papayannopoulou T.
Chronic myelocytic leukemia: clonal origin in a stem cell common to the granulocyte, erythrocyte, platelet and monocyte/macrophage.
Am J Med. 1977 Jul;63(1):125-130.
PMID: 267431.
PubMed Entry
Last tested: July 10, 2009.

61. Adamson JW, Fialkow PJ, Murphy S, Prchal JF, Steinmann L.
Polycythemia vera: stem-cell and probable clonal origin of the disease.
N Engl J Med. 1976 Oct 21;295(17):913-916.
PMID: 967201.
PubMed Entry
Last tested: July 10, 2009.

62. Barr RD, Fialkow PJ.
Clonal origin of chronic myelocytic leukemia.
N Engl J Med. 1973 Aug 9;289(6):307-309.
PMID: 4515677.
PubMed Entry
Last tested: July 10, 2009.

63. Fialkow PJ, Klein G, Clifford P.
Second malignant clone underlying a Burkitt-tumor exacerbation.
Lancet. 1972 Sep 23;2(7778):629-631.
PMID: 4116779.
PubMed Entry
Last tested: July 10, 2009.

64. Fialkow PJ.
Single or multiple cell origin for tumors?
N Engl J Med. 1971 Nov 18;285(21):1198-1199.
PMID: 5096643.
PubMed Entry
Last tested: July 10, 2009.

65. Fialkow PJ.
Is lyonisation total in man?
Lancet. 1970 Aug 8;2(7667):315.
PMID: 4194398.
PubMed Entry
Last tested: July 10, 2009.

66. Fialkow PJ, Klein G, Gartler SM, Clifford P.
Clonal origin for individual Burkitt tumours.
Lancet. 1970 Feb 21;1(7643):384-386.
PMID: 4189689.
PubMed Entry
Last tested: July 10, 2009.

67. Carter JR.
The office of decedent affairs.
JAMA. 1992 Jan 8;267(2):235-236.
PMID: 1727517
PubMed Entry
Last tested: July 10, 2009.

68. Carter JR.
The problematic death certificate.
N Engl J Med. 1985 Nov 14;313(20):1285-1286.
PMID: 4058510
PubMed Entry
Last tested: July 10, 2009.

69. Kircher T, Carter JR, Sinton E.
The National Autopsy Data Bank.
Pathologist. 1985 Nov;39(11):22-26.
PMID: 10274305
PubMed Entry
Last tested: July 10, 2009.

70. Carter JR, Nash NP, Cechner RL, Platt RD.
Proposal for a national autopsy data bank: a potential major contribution of pathologists to the health care of the nation.
Am J Clin Pathol. 1981 Oct;76(4 Suppl):597-617.
PMID: 7282646
PubMed Entry
Last tested: July 10, 2009.

71. Carter JR.
National autopsy data bank: potentially useful to so many, for so little.
Pathologist. 1981 Oct;35(10):548-553.
PMID: 10253253
PubMed Entry
Last tested: July 10, 2009.

72. Carter JR.
A renascence role of anatomic pathology in modern medicine.
Hum Pathol. 1977 May;8(3):237-241.
PMID: 323134
PubMed Entry
Last tested: July 10, 2009.

73. Cechner RL, Carter JR.
Storage and retrieval of SNOP-coded pathologic diagnoses using offsite computing and optical character recognizing systems.
Am J Clin Pathol. 1976 May;65(5):654-61.
PMID: 16535807
PubMed Entry
Last tested: July 10, 2009.

74. Peery TM.
The Autopsy Data Bank: a proposal for pathologists to contribute to the health care of the nation.
Am J Clin Pathol. 1978 Feb;69(2 Suppl):258-259.
PMID: 626172
PubMed Entry
Last tested: July 10, 2009.

75. Williams MJ, Peery TM.
The autopsy, a beginning, not an end.
Am J Clin Pathol. 1978 Feb;69(2 Suppl):215-216.
PMID: 626160
PubMed Entry
Last tested: July 10, 2009.

76. Moore GW, Hutchins GM.
The persistent importance of autopsies.
Mayo Clin Proc. 2000 Jun;75(6):557-558.
PMID: 10852414
PubMed Entry
Last tested: July 10, 2009.

77. Hutchins GM, Berman JJ, Moore GW, Hanzlick R.
Practice guidelines for autopsy pathology: autopsy reporting. Autopsy Committee of the College of American Pathologists.
Arch Pathol Lab Med. 1999 Nov;123(11):1085-92.
PMID: 10539932
PubMed Entry
Last tested: July 10, 2009.

78. Berman JJ, Moore GW, Hutchins GM.
Internet autopsy database.
Hum Pathol. 1997 Apr;28(4):393-394.
PMID: 9104935
PubMed Entry
Last tested: July 10, 2009.

79. Moore GW, Berman JJ, Hanzlick RL, Buchino JJ, Hutchins GM.
A prototype Internet autopsy database. 1625 consecutive fetal and neonatal autopsy facesheets spanning 20 years.
Arch Pathol Lab Med. 1996 Aug;120(8):782-785.
PMID: 8718907
PubMed Entry
Last tested: July 10, 2009.

80. Berman JJ, Moore GW, Hutchins GM.
Maintaining patient confidentiality in the public domain Internet Autopsy Database (IAD).
Proc AMIA Annu Fall Symp. 1996:328-332.
PMID: 8947682
PubMed Entry
Last tested: July 10, 2009.

81. Baumann RP, Moore GW.
[Comparison of 2 series of autopsies observed at Johns-Hopkins Medical Center, Baltimore (JHMI) and at the Neuchâtel Institute of Pathology (INAP)]
Schweiz Med Wochenschr. 1990 Dec 8;120(49):1876-1879. [French].
PMID: 2263930
PubMed Entry
Last tested: July 10, 2009.

82. Moore GW, Miller RE, Hutchins GM.
Determining cause of death in 45,564 autopsy reports.
Theor Med. 1988 Jun;9(2):179-186.
PMID: 3413706
PubMed Entry
Last tested: July 10, 2009.

83. Moore GW, Boitnott JK, Miller RE, Eggleston JC, Hutchins GM.
Integrated pathology reporting, indexing, and retrieval system using natural language diagnoses.
Mod Pathol. 1988 Jan;1(1):44-50.
PMID: 3070549
PubMed Entry
Last tested: July 10, 2009.

84. Moore GW, Hutchins GM, Miller RE.
Strategies for searching medical natural language text. Distribution of words in the anatomic diagnoses of 7000 autopsy subjects.
Am J Pathol. 1984 Apr;115(1):36-41.
PMID: 6546837
PubMed Entry
Last tested: July 10, 2009.

85. Kleiner DE, Emmert-Buck MR, Liotta LA.
Necropsy as a research method in the age of molecular pathology.
Lancet. 1995 Oct 7;346(8980):945-948. Review.
PMID: 7564732
PubMed Entry
Last tested: July 10, 2009.

86. Hall PA, Lemoine NR.
Comparison of manual data coding errors in two hospitals.
J Clin Pathol. 1986 Jun;39(6):622-626.
PMID: 3722414
PubMed Entry
Last tested: July 10, 2009.

87. Coles EC, Slavin G.
An evaluation of automatic coding of surgical pathology reports.
J Clin Pathol. 1976 Jul;29(7):621-625.
PMID: 977772
PubMed Entry
Last tested: July 10, 2009.

88. Earlam R.
Körner, nomenclature, and SNOMED.
Br Med J (Clin Res Ed). 1988 Mar 26;296(6626):903-905.
PMID: 3129068
PubMed Entry
Last tested: July 10, 2009.

89. Dodd W.
Körner, nomenclature, and SNOMED.
Br Med J (Clin Res Ed). 1988 Apr 23;296(6630):1198-1199.
PMID: 3132268
PubMed Entry
Last tested: July 10, 2009.

90. Earlam R.
Surgical audit in a district general hospital: a stimulus for improving patient care.
Ann R Coll Surg Engl. 1987 Sep;69(5):251-252.
PMID: 3674693
PubMed Entry
Last tested: July 10, 2009.

91. Brannigan VM.
Software quality regulation under the Safe Medical Devices Act of 1990: hospitals are now the canaries in the software mine.
Proc Annu Symp Comput Appl Med Care. 1991:238-242.
PMID: 1807596
PubMed Entry
Last tested: July 10, 2009.

92. Moore GW, Berman JJ.
Object-oriented controlled-vocabulary translator using TRANSOFT + HyperPAD.
Proc Annu Symp Comput Appl Med Care. 1991:973-975.
PMID: 1807773
PubMed Entry
Last tested: July 10, 2009.

93. Stuart-Buttle CD, Brown PJ, Price C, O'Neil M, Read JD.
The Read Thesaurus--creation and beyond.
Stud Health Technol Inform. 1997;43 Pt A:416-420.
PMID: 10184896
PubMed Entry
Last tested: July 10, 2009.

94. Stuart-Buttle CD, Read JD, Sanderson HF, Sutton YM.
A language of health in action: Read Codes, classifications and groupings.
Proc AMIA Annu Fall Symp. 1996:75-79.
PMID: 8947631
PubMed Entry
Last tested: July 10, 2009.

95. Read JD, Sanderson HF, Drennan YM.
Terming, encoding, and grouping.
Medinfo. 1995;8 Pt 1:56-59.
PMID: 8591263
PubMed Entry
Last tested: July 10, 2009.



Last updated: 7/10/2009, by G. William Moore, MD, PhD.