Vol. 136 No. 4, April 2000
Accuracy, Concordance, and Reproducibility of Histologic Diagnosis in Cutaneous T-Cell Lymphoma

An EORTC Cutaneous Lymphoma Project Group Study

Marco Santucci, MD; Annibale Biggeri, MD; Alfred C. Feller, MD; Günter Burg, MD; for the European Organization for Research and Treatment of Cancer (EORTC) Cutaneous Lymphoma Project Group

Arch Dermatol. 2000;136:497-502.

ABSTRACT

Objective  To assess the level of observer variability in the histologic identification of cutaneous T-cell lymphoma (CTCL) and its discrimination from diseases with similar histologic features.

Design  Cutaneous T-cell lymphoma specimens and randomly mixed controls were evaluated twice by 3 examiners.

Settings  The European Organization for Research and Treatment of Cancer (EORTC) Cutaneous Lymphoma Project Group.

Patients  The study was conducted with histologic specimens from 32 patients with mycosis fungoides (MF). In addition, 13 specimens of spongiotic, lichenoid, or psoriasiform simulators of MF were blindly and randomly mixed with the CTCL specimens as controls.

Main Outcome Measures  Accuracy of histologic diagnoses, concordance among raters, and individual reproducibility of each rater.

Results  Overall, the concordance among raters was fair to moderate (range, 0.283-0.562; weighted overall κ, 0.412). Individual reproducibility of examiners ranged from moderate to almost perfect (range, 0.473-0.896; weighted overall κ, 0.709) and was not significantly different for the definite lymphoma (range, 0.551-0.921; overall κ, 0.802) and nonlymphoma (range, 0.368-0.950; overall κ, 0.793) categories. Accuracy was similarly variable among raters: sensitivity ranged from 49.3% to 78.1% (overall κ, 0.654), and specificity (control series) ranged from 46.2% to 69.2% (overall κ, 0.595). Adding the diagnoses of probable lymphoma to those of definite lymphoma, sensitivity ranged between 73.5% and 84.9%. Although for each examiner there was a trend toward a lower sensitivity in the detection of early lesions compared with later lesions, the difference in sensitivity between the 2 groups was not statistically significant.

Conclusions  The levels of concordance and reproducibility found in this investigation were similar to those obtained with comparable studies in the most varied fields of pathology, confirming that the identification of CTCL did not cause particular problems for our observers. Our findings also revealed that pitfalls in CTCL identification are not limited to early lymphomatous lesions, as commonly postulated.



INTRODUCTION

Researchers in many fields have become increasingly aware of the observer (rater or interviewer) as an important source of measurement error. Of all the sources of data that are analyzed in medicine, human observation is the least standardized. Although the observations made by clinicians, radiologists, or pathologists provide critical information for the diagnosis and treatment of sick people, the observers are seldom subjected to the type of scientific testing that is imposed on inanimate equipment. Only during the past few decades have reliability studies been conducted in experimental or survey situations to assess the level of observer variability in the measurement procedures used in data acquisition, namely when physicians inspect roentgenograms, perform physical examinations, take medical histories, or interpret cytologic and histologic specimens.1

The correct identification of cutaneous T-cell lymphomas (CTCLs) and their proper differentiation from both inflammatory dermatoses and reactive lymphoid hyperplasias often pose vexing challenges, especially when dealing with the initial phases of the lymphomatous process.2-6 Even advanced and experimental diagnostic techniques, such as immunophenotyping, quantitative DNA cytophotometry, and molecular genetic analysis, have proven to be unsuitable in solving the problem. Thus, light microscopy remains the criterion standard.7

The research reported herein was prompted by the discovery, during a European cooperative study on CTCL,8 that many discrepancies had emerged when a series of histologic specimens was classified by a group of dermatopathologists and histopathologists. These discrepancies had at least 2 potential sources: (1) Dermatopathologists and histopathologists may have used different histologic criteria. (2) Regardless of the particular criteria that were used, the different raters may have used the criteria inconsistently.

The object of this study was to explore these possible sources of disagreement. The divergence in criteria (interobserver variability) can be ascertained by having the same specimen interpreted by different observers (concordance), and the inconsistency of single observers (intraobserver variability) can be ascertained by having them interpret the same specimen repeatedly (reproducibility).

This study is unique in many respects. (1) The specimens (slides) used were from patients with complete follow-up data (from the onset of the disease till death), thus leaving no doubt as to the diagnosis. (2) Randomly mixed controls with adequate follow-up data (specimens of eczematous, spongiotic, or psoriasiform simulators of CTCL) were used in the study to exclude possible CTCL in the initial stages of the disease. (3) Interpretation of slides was performed by raters with different professional backgrounds (dermatopathology, hematopathology, or surgical pathology). (4) Raters were blinded (ie, provided with no clinical information or follow-up data) in order to test the actual reliability of histopathologic readings. (5) We assessed the accuracy, concordance, and reproducibility of histologic diagnoses. (6) Statistical analyses were used to correct for chance agreement.9


PATIENTS AND METHODS

STUDY POPULATION

The specimens for this study were collected by the European Organization for Research and Treatment of Cancer (EORTC) Cutaneous Lymphoma Project Group. The referring physicians contributed specimens for which histologic material was available from the beginning of the disease. The specimens were taken from a series of 32 patients (total slides, 73; mean biopsy specimens per patient, 2.3; range, 1-3 biopsy specimens), all of whom had complete clinical information (ie, age of lesions, staging, treatment, and duration of the disease) and follow-up data. For all of these patients, the diagnosis of lymphoma was unequivocally established by clinical events, namely, later development of plaques, nodules, or tumors, and/or death caused by lymphoma. In addition, 13 specimens (slides) of spongiotic, lichenoid, or psoriasiform simulators of mycosis fungoides (MF) were blindly and randomly added to the CTCL specimens to serve as controls. This was done by a person who did not take part in the histologic evaluation.

Controls were obtained from the files of the Department of Dermatology of the University of Würzburg, Germany, and selected according to the following 3 criteria: (1) The histologic features of the specimens were highly suspicious for or indistinguishable from early MF. (2) The clinical differential diagnosis did not include MF in any of these cases. (3) Long-term follow-up documented the absence of progressive disease, including the development of lesions suspicious for MF either clinically or histologically. The final diagnoses of the control specimens were as follows: allergic contact dermatitis (4 cases), drug eruption (4 cases), lichen striatus (2 cases), erythema multiforme (2 cases), and psoriasis (1 case). All specimens were large wedge biopsy samples; the tissue fragments were fixed in buffered formalin, routinely processed, and stained with hematoxylin-eosin and Giemsa stain.

HISTOPATHOLOGIC EXAMINATION

The test panel included 3 raters who were well trained in the histopathology of lymphoproliferative disorders; each rater had a different professional background (a dermatologist with expertise in dermatopathology with special reference to cutaneous lymphomas [G.B.], a pathologist with expertise in hematopathology [A.C.F.], and a surgical pathologist with experience in dermatopathology [M.S.]).

In order to assess the reliability and the consistency of the diagnostic criteria used by each investigator as well as the interobserver and intraobserver variability, all participants independently studied the specimens twice with an interval longer than 9 months between the 2 sessions. Additionally, none of the investigators knew the original diagnoses or the other investigators' findings or had access to clinical and follow-up data. For the first reading, the slides containing the specimens were randomly numbered from 1 to 86. For the second interpretation, the slides were renumbered with figures chosen from a table of randomly coupled numbers. The labeling of slides was performed by a person who did not take part in the histologic evaluation.
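The blinded relabeling described above can be sketched in a few lines. This is only an illustration: the slide identifiers are hypothetical, and a seeded pseudorandom shuffle stands in for the table of random numbers used in the study.

```python
import random

def assign_blind_labels(slide_ids, rng):
    """Map each slide to a unique label in 1..N, in random order."""
    labels = list(range(1, len(slide_ids) + 1))
    rng.shuffle(labels)
    return dict(zip(slide_ids, labels))

# Hypothetical identifiers: 73 CTCL slides plus 13 control slides.
slides = [f"CTCL-{i}" for i in range(1, 74)] + [f"control-{i}" for i in range(1, 14)]

# Independent labelings for the first and second reading sessions.
first_reading = assign_blind_labels(slides, random.Random(1))
second_reading = assign_blind_labels(slides, random.Random(2))

print(len(first_reading), "slides labeled for each session")
```

Keeping the mapping with a person who does not read the slides, as in the study, is what preserves the blinding; the code only generates the labels.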

In order to determine the accuracy of the histologic diagnoses and taking into account the degree of variation linked to the presence of initial lymphomatous lesions in the present series, sensitivity and false-negative rates were calculated both for the whole series and separately for early and later lymphomatous lesions; specificity was evaluated for the control series. For this purpose, specimens representative of the initial phases of the lymphomatous process were those obtained from patients with stage IA disease (namely, limited plaques, papules, or eczematous patches covering less than 10% of the body surface) at least 5 years before any progression of the disease towards more advanced stages. Twenty-four specimens from 18 patients fulfilled these criteria. The remaining 49 specimens represented later stages of disease.

The 3 investigators did not meet individually or collectively to discuss the histopathologic criteria or definitions, nor did they meet to agree on any approach to histologic evaluation in preparation for the study. Each investigator reviewed the specimens with his own experience and understanding of CTCL, including criteria crucial for the identification of early lesions and their proper differentiation from both inflammatory dermatoses and reactive lymphoid hyperplasias.

Diagnoses were identified as follows:

  • Definite lymphoma: lymphoma without any doubt;
  • Probable lymphoma: the histologic features are consistent with CTCL, but a diagnosis of lymphoma cannot be confidently made;
  • Possible lymphoma: the histologic features are not consistent with CTCL, but a diagnosis of lymphoma cannot be confidently excluded; and
  • Nonlymphoma: nonlymphoma without any doubt.

STATISTICAL ANALYSIS

The data obtained were statistically analyzed using the SPSS-X program (SPSS Inc, Chicago, Ill) for preparation of frequency tables and cross-tables.10

Interrater agreement was assessed by cross-tabulating the whole set of paired observations of the first reading into a symmetric square contingency table11 and producing separate tables for each single pair. The data layout for the analysis of intraobserver agreement consisted of contingency tables reporting the number of slides assigned by each rater to the different diagnostic categories on the first and second reading.
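As a minimal sketch of this layout, the following builds a symmetric square table by counting every ordered pair of raters for each specimen. The category names and the example readings are assumptions made for illustration, not study data.

```python
from itertools import permutations

# Assumed category labels, in diagnostic order plus the unratable category.
CATEGORIES = ["definite", "probable", "possible", "nonlymphoma", "not evaluable"]

def symmetric_pair_table(readings, categories=CATEGORIES):
    """readings: one tuple per specimen, holding one category label per rater.

    Every ordered pair of raters contributes one count, so the resulting
    table is symmetric and sums to n_specimens * n_raters * (n_raters - 1).
    """
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    table = [[0] * k for _ in range(k)]
    for specimen in readings:
        for a, b in permutations(specimen, 2):
            table[index[a]][index[b]] += 1
    return table

# Hypothetical first readings of 3 specimens by 3 raters.
demo = [
    ("definite", "definite", "probable"),
    ("nonlymphoma", "possible", "nonlymphoma"),
    ("definite", "definite", "definite"),
]
table = symmetric_pair_table(demo)
total_pairs = sum(map(sum, table))
print(total_pairs)
```

With 86 specimens and 3 raters, the same counting gives 86 × 3 × 2 = 516 paired ratings.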

For the present study, we used the κ statistic of Cohen.12 This measure incorporates a correction for chance and therefore indicates the degree of agreement over and above that which would be expected by chance alone. For example, κ values that are greater than 0.61 may be taken to represent substantial to perfect agreement beyond chance. Values below 0.21 represent slight to poor agreement beyond chance, and values between 0.21 and 0.61 represent fair to moderate agreement beyond chance. Negative values denote less than chance agreement.13
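A short sketch of the unweighted κ computation for a square contingency table may make the chance correction concrete. The example table is invented, and the label boundaries follow the usual Landis and Koch bands, which is an assumption about exactly where the cut points fall.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table of counts (rows: rating 1, columns: rating 2)."""
    k = len(table)
    n = sum(map(sum, table))
    p_obs = sum(table[i][i] for i in range(k)) / n
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(row[i] * col[i] for i in range(k))  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)

def landis_koch_label(kappa):
    """Verbal label for a kappa value (assumed Landis and Koch cut points)."""
    if kappa < 0:
        return "less than chance"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical 2 x 2 table: observed agreement 0.70, chance agreement 0.50.
demo = [[20, 5], [10, 15]]
print(round(cohen_kappa(demo), 3))
print(landis_koch_label(0.284), "/", landis_koch_label(0.896))
```

In the demo table the raters agree on 70% of items, but 50% agreement would be expected from the marginal totals alone, so κ is (0.70 - 0.50) / (1 - 0.50) = 0.40.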

Specific κ values were calculated for each category. The original table was collapsed into a 2 × 2 table according to presence or absence of the specific categories analyzed. Overall κ values and χ² tests for homogeneity were used when appropriate. Ninety-five percent confidence intervals were computed according to the method of Fleiss.9
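The collapsing step can be illustrated as follows. This sketch assumes a square table of counts and uses an invented 3-category example; it does not reproduce the confidence interval or homogeneity calculations.

```python
def collapse_to_2x2(table, category):
    """Collapse a square table to presence/absence of one category."""
    k = len(table)
    n = sum(map(sum, table))
    both = table[category][category]
    row_total = sum(table[category])
    col_total = sum(table[i][category] for i in range(k))
    return [
        [both, row_total - both],
        [col_total - both, n - row_total - col_total + both],
    ]

def kappa_2x2(t):
    """Cohen's kappa for a 2 x 2 table of counts."""
    n = t[0][0] + t[0][1] + t[1][0] + t[1][1]
    p_obs = (t[0][0] + t[1][1]) / n
    p_exp = ((t[0][0] + t[0][1]) * (t[0][0] + t[1][0])
             + (t[1][0] + t[1][1]) * (t[0][1] + t[1][1])) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical 3-category table; category 0 is the one of interest.
demo = [[10, 2, 1], [3, 8, 2], [1, 2, 11]]
collapsed = collapse_to_2x2(demo, 0)
print(collapsed)
print(round(kappa_2x2(collapsed), 3))
```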

In addition, a weighted κ value was calculated.14 The weighted κ value takes into account the degree to which disagreements concern neighboring categories. We used the Fleiss-Cohen weights. When raters felt that they were unable to assign the biopsy specimen to 1 of the 4 categories, it was categorized as impossible to evaluate. This category was assigned a weight of zero.
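A minimal sketch of a weighted κ with Fleiss-Cohen (quadratic) agreement weights, w_ij = 1 - (i - j)² / (k - 1)², is given below for ordered categories only. How the zero-weighted "impossible to evaluate" category enters the calculation is not described in enough detail to reproduce, so it is left out here; the example table is invented.

```python
def fleiss_cohen_weights(k):
    """Quadratic agreement weights for k ordered categories."""
    return [[1 - ((i - j) ** 2) / ((k - 1) ** 2) for j in range(k)] for i in range(k)]

def identity_weights(k):
    """Weights that reduce weighted kappa to the unweighted statistic."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

def weighted_kappa(table, weights):
    """Weighted kappa for a square table of counts and a matrix of agreement weights."""
    k = len(table)
    n = sum(map(sum, table))
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_obs = sum(weights[i][j] * table[i][j] / n for i in range(k) for j in range(k))
    p_exp = sum(weights[i][j] * row[i] * col[j] for i in range(k) for j in range(k))
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical table in which all disagreements fall in neighboring categories.
demo = [[10, 3, 0, 0], [3, 10, 3, 0], [0, 3, 10, 3], [0, 0, 3, 10]]
unweighted = weighted_kappa(demo, identity_weights(4))
weighted = weighted_kappa(demo, fleiss_cohen_weights(4))
print(round(unweighted, 3), round(weighted, 3))
```

Because every disagreement in the demo table is between adjacent categories, the weighted value comes out higher than the unweighted one, which is the same direction of change reported for the multirater analysis.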

In the intraobserver analysis, we evaluated both the crude agreement and the specific agreement on particular diagnoses (ie, the conditional probability of a specimen being reassigned to the same category, given that it had been assigned to that category once).
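The two intraobserver measures can be sketched as below. The formula used for specific agreement, 2a / (first-reading total + second-reading total), is the common proportion-of-specific-agreement form and is an assumption about the exact estimator used; the table is invented.

```python
def crude_agreement(table):
    """Proportion of slides given the same category on both readings."""
    n = sum(map(sum, table))
    return sum(table[i][i] for i in range(len(table))) / n

def specific_agreement(table, category):
    """Estimated probability that a slide assigned to `category` on one reading
    is assigned to it again on the other (assumed estimator, see lead-in)."""
    both = table[category][category]
    first_total = sum(table[category])
    second_total = sum(row[category] for row in table)
    return 2 * both / (first_total + second_total)

# Hypothetical first-reading (rows) by second-reading (columns) table for one rater.
demo = [[30, 4, 1], [6, 10, 4], [2, 3, 26]]
print(round(crude_agreement(demo), 3))
print(round(specific_agreement(demo, 0), 3))
```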

To assess accuracy, we tabulated specimens by true status and rater response separately for each rater. Sensitivity and false-negative rates were computed for both the whole series and early and later CTCL lesions. Specificity was analogously calculated for the control series. Ninety-five percent confidence intervals were computed from the binomial variance. Overall values were obtained as precision-weighted averages, and χ² homogeneity test results were calculated.
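The accuracy calculations can be sketched as follows, assuming a normal-approximation interval built from the binomial variance and inverse-variance weights for the overall value. The per-rater counts are illustrative only and are not the tabulated study data.

```python
from math import sqrt

def proportion_with_ci(hits, total, z=1.96):
    """Proportion, 95% CI limits (normal approximation), and binomial variance."""
    p = hits / total
    var = p * (1 - p) / total
    half_width = z * sqrt(var)
    return p, max(0.0, p - half_width), min(1.0, p + half_width), var

def precision_weighted_mean(estimates):
    """estimates: list of (proportion, variance); weights are 1 / variance."""
    weights = [1 / var for _, var in estimates]
    return sum(w * p for w, (p, _) in zip(weights, estimates)) / sum(weights)

# Illustrative counts of definite-lymphoma diagnoses among 73 lymphoma slides.
hypothetical_hits = {"rater A": 36, "rater B": 50, "rater C": 57}

per_rater = {}
for rater, hits in hypothetical_hits.items():
    p, low, high, var = proportion_with_ci(hits, 73)
    per_rater[rater] = (p, var)
    print(f"{rater}: sensitivity {p:.3f} (95% CI {low:.3f}-{high:.3f}), "
          f"false-negative rate {1 - p:.3f}")

overall = precision_weighted_mean(list(per_rater.values()))
print(f"overall sensitivity {overall:.3f}")
```

Specificity for the control series follows from the same functions, with the count of nonlymphoma diagnoses among the 13 control slides in place of `hits`.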


RESULTS

INTEROBSERVER VARIABILITY (CONCORDANCE)

For each specimen, only the first reading was used. Three raters reading 86 specimens produced a total of 258 results (Table 1). In 7 instances, the raters were unable to assign the biopsy specimens to 1 of the 4 categories; these 7 readings were categorized as impossible to evaluate. The distribution of the possible 516 (86 × 3 × 2) paired ratings is presented in Table 1. The last column of Table 1 gives the κ values and the 95% confidence intervals for each category and for the whole series. The multirater κ value was 0.284, suggesting that there was a fair level of agreement among the 3 raters. Assigning different penalty weights to different degrees of disagreement resulted in a higher value for agreement among raters (weighted κ, 0.412). The degree of agreement varied widely among categories; it was moderate (κ, 0.500) for the definite lymphoma category, fair (κ, 0.273) for the nonlymphoma category, and slight to poor for the other 3 categories.


Table 1. Agreement Among the 3 Observers in Categorizing the 86 Slides*


Both unweighted and weighted κ values are presented for each single pair of raters in Table 2. The category-specific κ values for the diagnosis of definite lymphoma and nonlymphoma for each single pair of raters are presented in Table 3. The results of χ² analyses revealed the absence of significant heterogeneity among raters.


Table 2. Agreement Between Pairs of Observers*



Table 3. Category-Specific Agreement Between Pairs of Observers for Nonlymphoma and Definite Lymphoma*


INTRAOBSERVER VARIABILITY (REPRODUCIBILITY)

Table 4 shows the crude and specific agreement and κ values for the individual raters between the first and second readings. Overall κ and weighted κ values ranged from 0.391 and 0.473 to 0.797 and 0.896, respectively. The results of χ² analyses revealed highly significant heterogeneity among the 3 raters.


Table 4. Intraobserver Variability for All Cases and for Definite Lymphoma and Nonlymphoma*


ACCURACY

Lesion-specific sensitivity and false-negative rates estimated for each observer are reported in Table 5. Specificity equated with sensitivity for the control series. For each specimen, only the first reading was used. The results of χ² analyses demonstrated highly significant heterogeneity among the 3 raters in identifying lymphoma cases; conversely, the differences observed in identifying controls were not statistically significant.


Table 5. Lesion-Specific Sensitivity and False-Negative Rates Estimated for Each Observer and Overall*



COMMENT

All anatomical pathology diagnoses are formed by value judgments that result from conscious interpretation of histologic imagery. Because these value judgments are ultimately subjective, it is no surprise that interpretative variability exists, even among experienced pathologists.

Previous studies in other fields (eg, lung cancer,15 proliferative breast lesions,16-17 cervical intraepithelial neoplasia,11, 18 cutaneous pigmented lesions19) have suggested that the ability of histopathologists to identify or subclassify certain lesions consistently and reproducibly has become a matter of legitimate concern. This was especially true when participants were asked to use the diagnostic criteria they employed in their daily practices and no attempt was made to standardize the diagnostic criteria among the participants before evaluations of the study cases.15-16,18-19 Conversely, there was generally less interobserver variation in the categorization when raters agreed to use the same diagnostic criteria and all study participants were provided with educational materials to maximize the likelihood that each had a similar level of understanding of these criteria.17

However, the fact that cutaneous biopsy specimens are so widely used to make diagnoses of CTCL was one of the reasons we deliberately avoided the definitions, agreements, or discussions of histologic criteria given in the literature.2-8,20-26 This enabled us to have some index of observer reproducibility, to have an understanding of each rater's concept of CTCL, and to determine the reliability of histologic criteria that, at the beginning of the study, were thought by all participants to be well understood and not in need of strict definition.

After the completion of the study, however, the 3 raters collectively met and discussed the criteria used by each of them for this investigation. Surprisingly, despite the differences observed in both sensitivity and specificity, the criteria used by the panelists were found to be almost identical; ie, they used those criteria already reported in detail in a previous study by the EORTC Cutaneous Lymphoma Project Group.8 Therefore, the differences observed were possibly a result of the different tuning or weighting of these criteria more than the use of personal criteria. In particular, the 3 raters unanimously agreed that the crucial features to establish a diagnosis of definite lymphoma were the presence of cells that were considered neoplastic and a disproportionate epidermotropism, while a diagnosis of nonlymphoma was confidently made only when the typical constellation of cytologic and architectural criteria2-8,20-26 considered indicative or suspicious for lymphoma was absent.

The difficulties in diagnosing CTCL are well known.2-8 Accurate diagnosis has as much to do with years of experience as it does with strict histologic criteria that can be applied by less-seasoned pathologists. Diagnostic accuracy and the reliability of conventional histopathologic features of CTCL, especially in the initial stages of the disease, have been reported to be extremely limited, even in the hands of experienced and well-trained pathologists.8, 27-28 In addition, investigations have documented that major interrater variability and intrarater variability among pathologists and dermatopathologists were common when evaluating skin biopsy specimens for the diagnosis of CTCL.8, 28 However, the real extent of the problem is not presently known, since studies dealing with the histologic diagnosis of CTCL have almost always been biased by several major problems. First, the real nature of the diseases featured in the slides was not determined—in fact, study cases did not generally have long-term follow-up data documenting the progression of the diseases or death of the patient caused by the diseases, leaving doubt as to the neoplastic nature of the lymphoproliferative disorder. Second, there was an absence of proper controls. Third, there were no clear statements of the protocol of the studies (ie, whether diagnoses were made with or without knowledge of clinical data). Fourth, there were inadequate statistical evaluations of the results obtained. Our investigation was designed to take these problems into account and minimize their impact on results.

We found significant concordance and reproducibility among examiners, exceeding the agreement expected by chance. The levels of concordance and reproducibility found in this investigation are similar to those obtained with comparable studies in the most varied fields of pathology,11, 19, 28-35 indicating that a certain degree of observer-related variability is a common phenomenon inherent in human judgment and that CTCL does not pose the particular diagnostic problems commonly postulated.

In the present study, histologic diagnoses were made without the raters being given any clinical information. This may have negatively affected the diagnostic accuracy (Table 5). In fact, in other fields of pathology, when clinical information is provided, diagnostic accuracy increases,36 and errors in the reporting of biopsy findings are reduced to an acceptable minimum.37 This assumption is strengthened by the relatively high numbers of diagnoses of probable lymphoma (data not shown) that, if added to the diagnoses of definite lymphoma, would significantly raise sensitivity (79.6% for rater A, 73.5% for rater B, 84.9% for rater C). However, we decided to do a blinded study because if we had provided clinical information to the raters, its interpretation would have been yet another source of variation among them, which might have prejudiced the measurement of the reliability and reproducibility of the histologic diagnoses, which was the primary objective of our investigation. This objective is of particular importance for examining relationships between the histopathologic and clinical findings for CTCL. In fact, these data are important to the clinician treating the individual patient.

The level at which the histopathologic categories can be defined also affects reliability. We postulate that as the inconsistency of lesions increases, the accuracy of diagnoses will decrease. In fact, many authorities have stated that a specific diagnosis of CTCL cannot be made in the initial lymphomatous stages and more reliance has to be placed on the clinical picture than on the histologic features.

Our results did not confirm this. In fact, although for each examiner there was a trend toward a lower sensitivity in the detection of early lesions compared with later ones, the difference in sensitivity between the 2 groups was not statistically significant. This may be owing to the relatively small sample size of the early group, and further studies may help to determine if this difference is a significant one.

Our results stress the absolute need for the clearer definition and standardization of the histologic features of CTCL to improve the reliability of histologic diagnosis. This is crucial for the accurate identification of CTCL and its distinction from diseases with similar histologic features.


AUTHOR INFORMATION

Accepted for publication August 18, 1999.

The material for this study was collected by the European Organization for Research and Treatment of Cancer (EORTC) Cutaneous Lymphoma Project Group (Chairman: Günter Burg, MD) for the Symposia on the Histopathology of Early Mycosis Fungoides Lesions held in Ghent, Belgium, May 5-7, 1989, and in Würzburg, Germany, September 14-16, 1990.

The authors are indebted to the following colleagues who secured biopsy material with the pertinent clinical information and follow-up data used for the present investigation: M. Aelbrecht, MD, Ghent, Belgium; M. F. Avril, MD, Villejuif, France; E. Berti, MD, Milan, Italy; N. Bourgeois, MD, Antwerp, Belgium; G. Burg, MD, Würzburg, Germany; M. M. Delaunay, MD, Bordeaux, France; C. De Wolf-Peeters, MD, Leuven, Belgium; T. Estrach, MD, Barcelona, Spain; M. L. Geerts, MD, Ghent, Belgium; H. Kerl, MD, Graz, Austria; I. Koller, MD, Salzburg, Austria; K. Meissner, MD, Hamburg, Germany; C. Neumann, MD, Hannover, Germany; M. Nilles, MD, Giessen, Germany; E. Ralfkiaer, MD, Copenhagen, Denmark; N. Sepp, MD, Innsbruck, Austria; J. Wechsler, MD, Créteil, France. The authors pay particular tribute to Susanne Ziffer, MD, for selecting and randomizing the control slides for this study.

This work was done in part in the Departments of Dermatology and Pathology of the University of Würzburg School of Medicine, Würzburg, Germany; in the Department of Dermatology of the University of Zürich School of Medicine, Zürich, Switzerland; and the Institute of Anatomic Pathology of the University of Florence Medical School, Florence, Italy.

Corresponding author: Marco Santucci, MD, Dipartimento di Patologia Umana ed Oncologia, Università degli Studi di Firenze, Viale G. B. Morgagni 85, I-50134 Firenze, Italia (e-mail: Marco.Santucci@UNIFI.IT).

From the Dipartimento di Patologia Umana ed Oncologia (Dr Santucci) and Dipartimento Statistico (Dr Biggeri), Università degli Studi di Firenze, Florence, Italy; Institut für Pathologie, Medizinische Universität zu Lübeck, Lübeck, Germany (Dr Feller); and Dermatologischen Klinik, Universitätsspital Zürich, Zurich, Switzerland (Dr Burg). Additional information about the EORTC Cutaneous Lymphoma Project Group is available at http://www.eortc.be/menu.htm.


REFERENCES

1. Dunn G. Editorial. Stat Methods Med Res. 1992;1:121-122.
2. Burke JS. Malignant lymphomas of the skin: their differentiation from lymphoid and nonlymphoid cutaneous infiltrates that simulate lymphoma. Semin Diagn Pathol. 1985;2:169-182.
3. Jones RE Jr. Questions to the editorial board and other authorities. Am J Dermatopathol. 1986;8:534-545.
4. LeBoit PE, Epstein BA. A vase-like shape characterizes the epidermal-mononuclear cell collections seen in spongiotic dermatitis. Am J Dermatopathol. 1990;12:612-616.
5. Shapiro PE, Pinto FJ. The histologic spectrum of mycosis fungoides/Sézary syndrome (cutaneous T-cell lymphoma): a review of 222 biopsies, including newly described patterns and the earliest pathologic changes. Am J Surg Pathol. 1994;18:645-667.
6. Smoller BR, Bishop K, Glusac E, Kim YH, Hendrickson M. Reassessment of histologic parameters in the diagnosis of mycosis fungoides. Am J Surg Pathol. 1995;19:1423-1430.
7. Santucci M. Cutaneous T-cell lymphoma: clues to diagnosis in early lesions. In: Lambert WC, Giannotti B, van Vloten WA, eds. Basic Mechanisms of Physiologic and Aberrant Lymphoproliferation in the Skin. New York, NY: Plenum Publishing Corp; 1994:243-254. NATO Science Series A, Life Sciences; vol 265.
8. Burg G, Zwingers T, Staegemeir E, Santucci M for the EORTC-Cutaneous Lymphoma Project Group. Interrater and intrarater variabilities in the evaluation of cutaneous lymphoproliferative T-cell infiltrates. Dermatol Clin. 1994;12:311-314.
9. Fleiss JL. Statistical Methods for Rates and Proportions. New York, NY: John Wiley & Sons; 1981:212-235.
10. Hull CH. SPSS-X Users Guide. New York, NY: McGraw-Hill Co; 1983.
11. Ismail SM, Colclough AB, Dinnen JS, et al. Observer variation in histopathological diagnosis and grading of cervical intraepithelial neoplasia. BMJ. 1989;298:707-710.
12. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37-46.
13. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
14. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70:213-220.
15. Feinstein AR, Gelfman NA, Yesner R, et al. Observer variability in the histopathologic diagnosis of lung cancer. Am Rev Respir Dis. 1970;101:671-684.
16. Rosai J. Borderline epithelial lesions of the breast. Am J Surg Pathol. 1991;15:209-221.
17. Schnitt SJ, Connolly JL, Tavassoli FA, et al. Interobserver reproducibility in the diagnosis of ductal proliferative breast lesions using standardized criteria. Am J Surg Pathol. 1992;16:1133-1143.
18. de Vet HCV, Knipschild PG, Schouten HJA, et al. Interobserver variation in histopathological grading of cervical dysplasia. J Clin Epidemiol. 1990;43:1395-1398.
19. Duray PH, DerSimonian R, Barnhill R, et al. An analysis of interobserver recognition of the histopathologic features of dysplastic nevi from a mixed group of nevomelanocytic lesions. J Am Acad Dermatol. 1992;27:741-749.
20. Burg G, Braun-Falco O. Cutaneous Lymphomas, Pseudolymphomas and Related Disorders. Berlin, Germany: Springer-Verlag; 1983.
21. Burg G, Kaudewitz P. Where are we today in the diagnosis of cutaneous lymphoma? Curr Probl Dermatol. 1990;19:90-104.
22. Kerl H, Cerroni L, Burg G. The morphologic spectrum of T-cell lymphomas of the skin: a proposal for a new classification. Semin Diagn Pathol. 1991;8:55-61.
23. LeBoit PE. Variants of mycosis fungoides and related cutaneous T-cell lymphomas. Semin Diagn Pathol. 1991;8:73-81.
24. Nickoloff BJ. Light-microscopic assessment of 100 patients with patch/plaque-stage mycosis fungoides. Am J Dermatopathol. 1988;10:469-477.
25. Rijlaarsdam U, Willemze R. Cutaneous pseudo-T-cell lymphomas. Semin Diagn Pathol. 1991;8:102-108.
26. Sanchez JL, Ackerman AB. The patch stage of mycosis fungoides: criteria for histologic diagnosis. Am J Dermatopathol. 1979;1:5-26.
27. Lefeber WP, Robinson JK, Clendenning WE, Dunn JL, Colton T. Attempts to enhance light microscopic diagnosis of cutaneous T-cell lymphoma (mycosis fungoides). Arch Dermatol. 1981;117:408-411.
28. Olerud JE, Kulin PA, Chew DE, et al. Cutaneous T-cell lymphoma: evaluation of pretreatment skin biopsy specimens by a panel of pathologists. Arch Dermatol. 1992;128:501-507.
29. Theodossi A, Skene AM, Portmann B, et al. Observer variation in assessment of liver biopsies including analysis by kappa statistics. Gastroenterology. 1980;79:232-241.
30. Holman CDJ, Matz LR, Finlay-Jones LR, et al. Inter-observer variation in the histopathological reporting of Hodgkin's disease: an analysis of diagnostic subcomponents using kappa statistics. Histopathology. 1983;7:399-407.
31. Heenan PJ, Matz LR, Blackwell JB, et al. Inter-observer variation between pathologists in the classification of cutaneous malignant melanoma in western Australia. Histopathology. 1984;8:717-729.
32. National Cancer Institute Non-Hodgkin's Classification Project Writing Committee. Classification of non-Hodgkin's lymphomas: reproducibility of major classification systems. Cancer. 1985;55:91-95.
33. Coindre JM, Trojani M, Contesso G, et al. Reproducibility of a histopathologic grading system for adult tissue sarcoma. Cancer. 1986;58:306-309.
34. Cramer SF, Roth LM, Ulbright TM, et al. Evaluation of the reproducibility of the World Health Organization classification of common ovarian cancers with emphasis on methodology. Arch Pathol Lab Med. 1987;111:819-829.
35. van Lijnschoten G, Arends JW, De La Fuente AA, Schouten HJA, Geraedts JPM. Intra- and inter-observer variation in the interpretation of histological features suggesting chromosomal abnormality in early abortion specimens. Histopathology. 1993;22:25-29.
36. Reuben A, Johnson AL, Cotton BP. Is pancreatogram interpretation reliable? a study of observer variation and error. Br J Radiol. 1978;51:956-962.
37. Baggenstoss AH. Morphologic and aetiologic diagnoses from hepatic biopsies without clinical data. Medicine. 1966;45:435-443.
