Validation of a Melanoma Prognostic Model
David J. Margolis, MD;
Allan C. Halpern, MD;
Timothy Rebbeck, PhD;
Lynn Schuchter, MD;
Raymond L. Barnhill, MD;
Judith Fine, BA;
Marianne Berwick, PhD, MPH
Arch Dermatol. 1998;134:1597-1601.
ABSTRACT
Background A "clinically accessible," 4-variable (patient age, patient sex, tumor location, and tumor thickness) prognostic model has been published previously. This model evaluated variables that were commonly available to the clinician. Because models are heuristic, validity of a prognostic model should be evaluated in a population different from the original population.
Objective To evaluate the external validity of this 4-variable melanoma prognostic model.
Design To estimate the external validity of this model, we used a population-based cohort of individuals with melanoma. We also evaluated a 1-variable model (tumor thickness). Estimates of the external validity of these logistic regression models were made using the c statistic and the Brier score.
Settings and Patients A total of 1261 patients with melanoma evaluated in a multispecialty, university-based practice and 650 patients with melanoma from throughout Connecticut.
Main Outcome Measure Death from melanoma within 5 years of diagnosis.
Results The c statistics for the 4-variable model were 0.86 (95% confidence interval [CI], 0.83-0.89) for the university-based practice data set and 0.81 (95% CI, 0.75-0.86) for the Connecticut data set. For thickness alone, the c statistics were 0.83 (95% CI, 0.80-0.86) and 0.79 (95% CI, 0.74-0.85), respectively. Brier scores for the 4-variable model were 0.09 (95% CI, 0.08-0.10) and 0.08 (95% CI, 0.06-0.09) and for the 1-variable model were 0.09 (95% CI, 0.08-0.10) and 0.08 (95% CI, 0.07-0.10), respectively. No significant differences exist between the data sets for the 4- and 1-variable models.
Conclusions The 4- and 1-variable models are generalizable. The simpler 1-variable model (tumor thickness) can be used with a relatively small loss in accuracy.
INTRODUCTION
Unlike most skin cancers, melanoma contributes significantly to total cancer mortality.1-3 It is a common cause of death by cancer.4 The deaths due to melanoma have been increasing 2% per year for the past 3 decades.2 The incidence rate is estimated to be 11 to 16 per 100,000.3
Patient survival is associated with several prognostic factors.5-11 Of these, probably the most routinely used is tumor thickness, as described by Breslow.10 However, other prognostic factors, such as patient age and sex, site of primary tumor, and histological characteristics of the tumor (eg, Clark level), may also be associated with survival.6-9,12-14 In addition, multivariable prediction models may more accurately predict survival of patients with melanoma than individual prognostic factors.8 However, these models have not gained general acceptance, perhaps because of the difficulties that clinicians and pathologists have in measuring and interpreting the prognostic factors used in these models and the complexity of interpreting a multivariable model.
To create a more user-friendly prediction model for prognosis in melanoma, a model using prognostic factors commonly reported as part of a routine pathology report was proposed by Schuchter et al.6 The covariates in the model were patient age and sex, depth of tumor invasion (thickness), and the anatomical location of the lesion. The original model was established using a cohort of patients with a primary melanoma who were followed up for 10 years. It was reported that this 4-variable model predicted survival better than a model based on tumor thickness alone.
Because all models are heuristic, the validity of a prognostic model should be evaluated in populations different from the original population used in model building. Therefore, we evaluated the external validity of the prognostic model of Schuchter et al6 using a recently reported population-based cohort of individuals with melanoma who had been followed up for 5 years.15 We also compared the predictive ability of this 4-variable model to a 1-variable (tumor thickness) model.
PATIENTS, MATERIALS, AND METHODS
STUDY POPULATION
In 1996, Schuchter et al6 evaluated 488 individuals with melanoma from the Pigmented Lesion Group (PLG) at the University of Pennsylvania, Philadelphia, who had been followed up for at least 10 years from the time of melanoma diagnosis. This study evaluated only variables that were commonly available to the clinician. Four variables were found to be predictive of patient death from melanoma within 10 years of diagnosis using a multivariable logistic regression model: sex, age, tumor thickness, and tumor location. For the present study, we used a larger cohort (PLG) that includes 1261 individuals who have been followed up for up to 5 years.
The measure of accuracy used by Schuchter et al6 to assess the 4-variable model was the area under the receiver operating characteristic (ROC) curve. The area was 0.87, which indicates good discrimination. The ROC curve is a graphical representation of the true-positive rate (eg, the probability that a positive test result is associated with a death from melanoma, a correct response) vs the false-positive rate (eg, the probability that a positive test result is associated with a 5-year survivor, an error). Unlike a single measurement, such as the percentage of correct responses of a test (eg, a prognostic model), the area under the ROC curve represents all possible responses and is generally believed to be the ideal measure of a test's ability to discriminate between outcomes.16-18
In 1996, Berwick et al15 studied a cohort (CT) with melanoma identified through the Rapid Case Ascertainment system of the Cancer Prevention Research Unit for Connecticut at Yale University Medical Center, New Haven. This Rapid Case Ascertainment system is an agent of the Connecticut Tumor Registry, which has functioned as 1 of the sites of the Surveillance, Epidemiology, and End Results program since 1973. The process used to ascertain cases has been described previously.15 These individuals had been followed up for up to 5 years from the time of melanoma diagnosis. This sample (n = 650) included all those with a validated histopathologic diagnosis of melanoma who lived in Connecticut between January 15, 1987, and May 15, 1989, and whose physician agreed to participate (physician participation rate was 75%). Although this is a population-based cohort, the histopathologic diagnosis and features of melanoma were confirmed and standardized by a single expert pathologist (R.L.B.). A subgroup of this cohort, individuals with localized invasive melanoma, has been used previously to evaluate prognostic factors for death from melanoma within 5 years of diagnosis.7
END POINT AND PROGNOSTIC FACTORS
In contrast to the study by Schuchter et al,6 who evaluated patients for death from melanoma within 10 years of diagnosis, the outcome for this study was death from melanoma within 5 years of diagnosis. In general, determination of death (see the "Comment" section) and the prognostic factors were ascertained in a similar manner in both studies.6, 15
The prognostic factors chosen for this validation study were based on the study by Schuchter et al6: tumor thickness (<0.76, 0.76-1.69, 1.70-3.60, and >3.60 mm), primary lesion location (extremities vs the rest of the body [labeled "trunk"]), age at diagnosis of melanoma (≤60 or >60 years), and patient sex.
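The categorization described above can be sketched in a few lines of Python. This is only an illustration: the record layout (`Patient`), the function names, and the 0/1 coding are assumptions made for the example and are not taken from the original analysis. Thickness values that fall between the published 2-decimal category limits (eg, 1.695 mm) are assigned to the lower category here, which is also an assumption.

```python
from dataclasses import dataclass


def thickness_category(thickness_mm: float) -> int:
    """Return 0-3 for the groups <0.76, 0.76-1.69, 1.70-3.60, and >3.60 mm."""
    if thickness_mm < 0.76:
        return 0
    if thickness_mm < 1.70:
        return 1
    if thickness_mm <= 3.60:
        return 2
    return 3


@dataclass
class Patient:
    """Hypothetical record layout for one individual."""
    thickness_mm: float
    on_extremity: bool
    age_years: float
    is_male: bool


def encode(patient: Patient) -> dict:
    """Map a patient to the 4 categorical prognostic factors."""
    return {
        "thickness_cat": thickness_category(patient.thickness_mm),
        "trunk": 0 if patient.on_extremity else 1,  # "trunk" = not extremities
        "age_over_60": 1 if patient.age_years > 60 else 0,
        "male": 1 if patient.is_male else 0,
    }


example = encode(Patient(thickness_mm=1.2, on_extremity=True,
                         age_years=45, is_male=False))
print(example)
```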
STATISTICAL ANALYSIS
Descriptive statistics were computed for all measured prognostic factors and were reported as the total number of individuals who died of melanoma in the PLG and CT cohorts. The percentage of individuals with a given prognostic factor who died of melanoma within 5 years of diagnosis was also computed. Quantitative differences between those who survived or died within a cohort for a prognostic factor were estimated using the Pearson χ² test or, if appropriate, the χ² test for trend.
Logistic regression was used to estimate odds ratios and 95% confidence intervals (CIs). All odds ratio estimates for a prognostic factor were reported as crude odds ratios and as odds ratios adjusted for the other measured prognostic factors.
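For a single binary prognostic factor, the crude odds ratio from logistic regression equals the cross-product ratio of the 2 x 2 table, and the usual Wald interval equals the log-based (Woolf) interval. The Python sketch below illustrates that calculation. The function name and the counts are hypothetical and are not data from either cohort.

```python
import math


def crude_odds_ratio(a, b, c, d, z_crit=1.96):
    """Odds ratio and approximate 95% CI from a 2 x 2 table.

    a: factor present, died      b: factor present, survived
    c: factor absent, died       d: factor absent, survived
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all 4 cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only
odds_ratio, lower, upper = crude_odds_ratio(20, 80, 10, 90)
print(odds_ratio, lower, upper)
```

Adjusted odds ratios require fitting the full multivariable model and cannot be obtained from a single 2 x 2 table.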
Multivariable logistic regression was used by Schuchter et al6 to formulate the clinically friendly 4-variable prediction model for 10-year survival. This technique was also used in the present study to create a prediction for 5-year survival. In addition, a model was created using only Breslow depth of tumor invasion (ie, tumor thickness, <0.76, 0.76-1.69, 1.70-3.60, and >3.60 mm).10
The performance of each model was evaluated by the goodness-of-fit test (calibration) and discrimination. Brier scores were estimated as a measure of goodness-of-fit.19-27 Model discrimination was estimated by the c statistic, which is equivalent here to the area under the ROC curve.16 This is a widely used estimate of discrimination and is presented as the probability that 1 individual as opposed to another will survive 5 years.16-18,28-29 For the Brier score and c statistic, 95% CIs were calculated by the bootstrap technique using 1000 samples.30 Quantitative differences between Brier scores or c statistics for the 4- and 1-variable models were evaluated using a z statistic test. With the exception of melanoma prognostic modeling, these test statistics have rarely been used to evaluate dermatologic ailments.
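The bootstrap step can be illustrated with a short Python sketch. The text does not state which type of bootstrap interval was used, so the percentile method shown here is an assumption, as are the function names, the fixed random seed, and the small hypothetical data set. The Brier score is used as the example statistic because it remains defined in every resample.

```python
import random


def brier(predicted, observed):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def bootstrap_ci(predicted, observed, stat, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample individuals with replacement."""
    rng = random.Random(seed)
    n = len(predicted)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(stat([predicted[i] for i in idx],
                              [observed[i] for i in idx]))
    estimates.sort()
    k = int(n_boot * alpha / 2)  # number of estimates cut from each tail
    return estimates[k], estimates[n_boot - 1 - k]


# Hypothetical predicted probabilities and outcomes (1 = died of melanoma)
predicted = [0.05, 0.10, 0.20, 0.70, 0.15, 0.60, 0.30, 0.05]
observed = [0, 0, 0, 1, 0, 1, 1, 0]
lower, upper = bootstrap_ci(predicted, observed, brier)
print(lower, upper)
```

The same function could be given a c statistic instead of the Brier score, provided that resamples containing only deaths or only survivors are handled, because the c statistic is undefined in that case.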
In summary, a 4-variable prognostic model and a 1-variable prognostic model were created using the PLG data set and logistic regression. The accuracy of these models was estimated using the PLG and CT data sets. All statistical computations were performed using a software program (Stata version 5, Stata Corp, College Station, Tex), except for calculation of the z statistic, which was done manually.
RESULTS
One hundred sixty-seven (13.2%) of 1261 individuals in the PLG cohort died of melanoma within 5 years of diagnosis. Death from melanoma was significantly associated with individuals who had lesions on their trunk, who were male, who were older than 60 years, and who had thicker lesions (Table 1 and Table 2). There were no differences in the inferences from the analysis of either crude or adjusted odds ratios (Table 2).
Table 1. Individuals With a Prognostic Factor Who Died of Melanoma*
Table 2. Odds Ratios for Death From Melanoma Within 5 Years of Diagnosis for the Crude and Adjusted Prognostic Factors*
Eighty (12.3%) of 650 individuals in the CT cohort died of melanoma within 5 years of diagnosis. Death from melanoma was significantly associated with individuals who had thicker lesions (Tables 1 and 2). In contrast to the PLG data set, death from melanoma was not associated with lesion location, patient age, or patient sex (Tables 1 and 2). Except for patient sex, crude and adjusted odds ratio estimates for the association between prognostic factors and death from melanoma for tumor thickness were similar.
Using the 4-variable model, the c statistic was 0.86 (95% CI, 0.83-0.89) for the PLG data set and 0.81 (95% CI, 0.75-0.86) for the CT data set (Table 3). Using the 1-variable model (tumor thickness alone), the c statistic was 0.83 (95% CI, 0.80-0.86) for the PLG data set and 0.79 (95% CI, 0.74-0.85) for the CT data set. Brier scores for the 4-variable model were 0.09 (95% CI, 0.08-0.10) for the PLG data set and 0.08 (95% CI, 0.06-0.09) for the CT data set. Brier scores for the 1-variable model (thickness alone) were 0.09 (95% CI, 0.08-0.10) for the PLG data set and 0.08 (95% CI, 0.07-0.10) for the CT data set (Table 3).
Table 3. Brier Score and c Statistic Estimates for the 4-Variable and 1-Variable Models Predicting Death From Melanoma Within 5 Years of Diagnosis*
Therefore, at 5-year follow-up, the accuracy of the 4- or 1-variable model in predicting prognosis, as estimated by the Brier score and c statistic, was not significantly different. This was true when 1- and 4-variable models were compared using the PLG or CT data sets. With 1 exception, no statistically significant differences (P>.10) between Brier scores or c statistics were noted for any combination of models or populations. In 1 case, the P values comparing the accuracy of the 4- and 1-variable models within the PLG data set were .06 for the c statistic but greater than .10 for the Brier score.
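The exact formula used for the manually calculated z statistic is not given in the text. One plausible form, shown in the Python sketch below, divides the difference between 2 estimates by the square root of the sum of their squared standard errors, with each standard error back-calculated from the width of the reported 95% CI. This assumes a normal approximation and independent estimates, which is reasonable when comparing the PLG and CT data sets but not when comparing 2 models within the same data set. The function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error from a symmetric 95% CI."""
    return (upper - lower) / (2 * z_crit)


def z_test(est_a, ci_a, est_b, ci_b):
    """z statistic and 2-sided P value for 2 independent estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2-sided normal tail area
    return z, p_value


# c statistics for the 4-variable model as reported: PLG vs CT
z, p_value = z_test(0.86, (0.83, 0.89), 0.81, (0.75, 0.86))
print(z, p_value)
```

Under these assumptions, the comparison of the 4-variable c statistics between the PLG and CT data sets gives a 2-sided P value greater than .10, in line with the reported absence of a significant difference.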
COMMENT
Prognostic models are intended to predict the probability of an outcome (eg, disease or death) in a specific data set. The goal of this process is to develop an accurate model. Accuracy is the degree to which the predicted probability of an event agrees with the actual observed outcome in the data set. Ultimately, however, this heuristic approach is too simplistic because models are seldom created for use only in the original modeling data set (ie, for this study, the PLG data set).
A problem with the use of prognostic models is that the external validity or generalizability of a model cannot be described in conclusive terms.31 However, if a prognostic model is to be appropriately clinically applied in a different population of inference, its performance must be tested in an external (validation) data set. External validity refers to the extent to which the results of the model represent events seen in a referent population of interest.32 Lack of external validity cannot be corrected for statistically, so it is essential for a clinician to evaluate and understand the generalizability of a prognostic model to his or her own clinical setting.32-33 Estimates of generalizability can be determined by evaluating the accuracy of a predictive model in different clinical settings.
Although no consensus exists as to how a model should be validated, these assessments are often classified in terms of calibration (goodness of fit) and discrimination.20, 29, 34-37 Calibration is the degree to which the predicted probability agrees with the actual event. Discrimination is the degree to which a prediction from the model can separate those who will have the outcome from those who will not. For example, a well-calibrated model can correctly predict that an individual has a 60% chance of dying of melanoma within 5 years of diagnosis, whereas a discriminating model correctly distinguishes between who will die of melanoma within 5 years of diagnosis and who will not. When models lack validity, it is because the predictions do not differentiate among those who will have the outcome and those who will not (poor discrimination) or because predictions from a model do not estimate the average rate of the outcome experienced by an individual within a particular subgroup (lack of calibration). Therefore, the generalizability of a model is best estimated by determining both calibration and discrimination using data from many sites.28, 38-39
Many statistical techniques exist to measure a model's goodness of fit. A commonly used technique in epidemiological studies is the Hosmer-Lemeshow statistic, a modification of a Pearson χ² test. It has not, however, been used to make comparisons between studies, and the estimates of calibration from this test are sensitive to sample size. For this reason, in the present study, we chose to use the Brier score. Brier scores are commonly used by atmospheric scientists to summarize the forecast performance of their models and have been used by epidemiologists.20, 23, 25-27,40 A Brier score is the mean squared difference between the predicted probability and the observed event for any data set.19-23,25-27,40 Scores can vary between 0 and 1. A more accurate model has a Brier score closer to 0. An uninformative model that assigns every individual a predicted probability of 0.5 would have a Brier score of 0.25.40 The greatest advantage of the Brier score is that it is a measure of both discrimination and calibration. It can be decomposed to express estimates of discrimination and calibration, and it has been used in the past to make comparisons between study samples.40-41
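As a minimal illustration of the Brier score definition, the Python sketch below averages the squared difference between each predicted probability and the observed outcome (1 = event, 0 = no event). The function name and the example values are hypothetical.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and outcomes."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sum(squared_errors) / len(squared_errors)


# A constant prediction of 0.5 scores 0.25 regardless of the outcomes
uninformative = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
# Predictions that match the outcomes exactly score 0
perfect = brier_score([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
print(uninformative, perfect)  # 0.25 0.0
```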
To evaluate discrimination, we used the c statistic, which for the type of models presented in this study is equivalent to the area under the ROC curve.16 This approach is widely used to evaluate discrimination. The c index is calculated by constructing the set of all possible pairs of patients who are discordant for the outcome. Pairs for which the prognostic score is greater for the patient with the positive outcome are scored as 1, pairs for which the prognostic score is tied are scored as 0.5, and a score of 0 is given to pairs for which the prognostic score is greater for the patient with the negative outcome.16, 28 The c index is the total score divided by the total number of pairs discordant for the outcome. This ratio has a value from 0 to 1, with 1 indicating perfect discrimination, 0.5 indicating no discriminative ability, and 0 indicating perfectly inverted discrimination. A c statistic higher than 0.7 can be thought of as acceptable, higher than 0.8 as good, and higher than 0.9 as excellent.42
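The pairwise calculation described above can be written directly as a short Python sketch. The function name and the example scores are hypothetical, and this brute-force loop over all discordant pairs is meant for illustration rather than for large data sets.

```python
def c_index(scores, died):
    """c statistic from all pairs of patients discordant for the outcome."""
    with_event = [s for s, d in zip(scores, died) if d]
    without_event = [s for s, d in zip(scores, died) if not d]
    if not with_event or not without_event:
        raise ValueError("need at least 1 patient with and 1 without the event")
    total = 0.0
    for s_event in with_event:
        for s_no_event in without_event:
            if s_event > s_no_event:
                total += 1.0   # higher score for the patient who had the event
            elif s_event == s_no_event:
                total += 0.5   # tied prognostic scores
            # otherwise the pair contributes 0
    return total / (len(with_event) * len(without_event))


# Hypothetical scores: 2 deaths and 2 survivors give 4 discordant pairs
example_c = c_index([0.9, 0.4, 0.4, 0.1], [1, 1, 0, 0])
print(example_c)  # 0.875
```

In this example, 3 of the 4 pairs are ordered correctly and 1 is tied, so the c index is (3 + 0.5)/4 = 0.875.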
When using Brier scores and c statistics as measures of prognostic accuracy, the 4-variable model predicted well in the CT population. However, tumor thickness alone (1-variable model) accurately predicted death from melanoma within 5 years of diagnosis as well as the 4-variable model and was externally valid in the CT population. It is remarkable that these models seem to be generalizable between these 2 different populations. The PLG cohort is made up of patients attending a multispecialty practice at the University of Pennsylvania Medical Center devoted to patients with melanocytic lesions. The CT cohort is a population-based group of individuals with melanoma identified by the Rapid Case Ascertainment system of the Cancer Prevention Research Unit for Connecticut at Yale University Medical Center. One would expect that diagnosis and treatment of patients might be substantially different between a specialty clinic at an academic institution and all of the diverse patient care locations used statewide in Connecticut.
Important differences do, however, exist between the data sets (Tables 1 and 2). One potential explanation of these differences may be selection bias. For example, 25% of individuals older than 60 years died of melanoma in the PLG data set but only 14% of individuals older than 60 years died of melanoma in the CT data set. In the CT data set, only 75% of eligible patients were interviewed and, therefore, initially entered into the CT data set. A comparison of those who were not interviewed (because of death, physician refusal, or patient refusal often because of illness) shows that they were slightly older and had slightly thicker tumors than the interviewed patients. The magnitude and direction of this bias on the prognostic factors is unknown. As a result, the restriction of cases to "death from melanoma" might have been too strict, and additional evaluations using all-cause mortality should be conducted. Finally, there may be important differences among the individuals who choose to be cared for in a specialty center like the PLG and those who live in Connecticut. For example, individuals with a family history of melanoma might seek care out of state in a PLG-like practice.
A limitation of this study is the 5 years of follow-up. Although most people who die of melanoma may do so within 5 years of diagnosis, many still die of melanoma later than 5 years after diagnosis. The original model of Schuchter et al6 using the PLG data set followed up patients for 10 years. When available, data sets with 10 years of follow-up after diagnosis should be used to fully evaluate the 4- and 1-variable models. It is possible that, of the variables studied, tumor thickness is the only variable required to predict the chance of death from melanoma within 5 years of diagnosis and that the additional variables are needed to accurately predict death from melanoma within 10 years of diagnosis.
In summary, our results show that the models created from the PLG data set are generalizable to the CT population. The simpler 1-variable model (tumor thickness) can be used with a relatively small loss in accuracy. However, deaths caused by melanoma do occur more than 5 years after diagnosis. Therefore, when data sets with 10 years of follow-up become available, both the 1- and 4-variable models should be reevaluated. In the future, more accurate and generalizable multivariable models will likely include among their prognostic factors molecular and biologic attributes of the primary tumor (eg, mutations in oncogenes and measures of angiogenesis) and markers of metastatic capacity (eg, nodal staging and assessment of blood for messenger RNA for tyrosinase).
AUTHOR INFORMATION
Accepted for publication July 23, 1998.
This work was funded partially by grants AG 00715, CA 42101, and CA 75434 from the National Institutes of Health, Bethesda, Md, and a Biomedical Research Support grant from the Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Conn.
Presented in part as an oral abstract at the International Dermatoepidemiology Association meeting, Orlando, Fla, February 26, 1998.
We thank the following institutions, without whose assistance this research would not have been possible: Connecticut Dermatopathology Laboratory Inc, Laboratory of Hope-Ross & Portenoy, University of Connecticut Dermatopathology Laboratory, Yale Dermatopathology Laboratory, Hartford Hospital, Yale-New Haven Hospital, St Francis Hospital & Medical Center, Bridgeport Hospital, Waterbury Hospital, Hospital of St Raphael, Danbury Hospital, New Britain General Hospital, Norwalk Hospital, St Vincent's Medical Center, The Stamford Hospital, Middlesex Hospital, Mt Sinai Hospital, St Mary's Hospital, Lawrence & Memorial Hospital, Manchester Memorial Hospital, Greenwich Hospital Association, Veterans Memorial Medical Center, Griffin Hospital, Bristol Hospital, St Joseph Medical Center, UCONN Health Center/John Dempsey Hospital, William W Backus Hospital, Park City Hospital, Charlotte Hungerford Hospital, Windham Community Memorial Hospital, Milford Hospital, Day Kimball Hospital, Rockville General Hospital, Bradley Memorial Hospital, The Sharon Hospital, New Milford Hospital, Johnson Memorial Hospital, Winstead Memorial Hospital, Westerly (Rhode Island) Hospital and the Pigmented Lesion Group, and the University of Pennsylvania Medical Center. In addition, we thank Dupont Guerry, MD, for his editorial assistance and S. Masiak for her secretarial assistance.
Reprints: David J. Margolis, MD, Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Room 815, Blockley Hall, 423 Guardian Dr, Philadelphia, PA 19104-6021 (e-mail: Margolis@cceb.med.upenn.edu).
From the Departments of Dermatology (Dr Margolis) and Biostatistics and Epidemiology (Drs Margolis and Rebbeck), University of Pennsylvania School of Medicine, and University of Pennsylvania Cancer Center (Dr Schuchter), Philadelphia; the Department of Dermatology (Dr Halpern) and the Epidemiology Service (Dr Berwick), Memorial Sloan-Kettering Cancer Center, New York, NY; the Department of Dermatology, Johns Hopkins University School of Medicine, Baltimore, Md (Dr Barnhill); and Rapid Case Ascertainment Shared Resources, Yale University School of Medicine, Yale Cancer Center, New Haven, Conn (Ms Fine).
REFERENCES
1. Swerlick RA, Chen S. The melanoma epidemic: more apparent than real? Mayo Clin Proc. 1997;72:559-564.
2. Rigel DS, Friedman RJ, Kopf AW, Robinson JK, Amonette RA. Melanoma incidence: if it quacks like a duck... . Arch Dermatol. 1997;133:656-659.
3. Merrill RM, Feuer EJ. Risk-adjusted cancer-incidence rates (United States). Cancer Causes Control. 1996;7:544-552.
4. Boring CC, Squires TS, Tong T. Cancer Statistics, 1993. CA Cancer J Clin. 1993;43:7-26.
5. Halpern AC, Guerry D IV, Elder DE, Trock B, Synnestvedt M. A cohort study of melanoma in patients with dysplastic nevi. J Invest Dermatol. 1993;100(suppl):346S-349S.
6. Schuchter L, Schultz DJ, Synnestvedt M, et al. A prognostic model for predicting 10-year survival in patients with primary melanoma: the Pigmented Lesion Group. Ann Intern Med. 1996;125:369-375.
7. Barnhill RL, Fine JA, Roush GC, Berwick M. Predicting five-year outcome for patients with cutaneous melanoma in a population-based study. Cancer. 1996;78:427-432.
8. Clark WH Jr, Elder DE, Guerry D IV, et al. Model predicting survival in stage I melanoma based on tumor progression. J Natl Cancer Inst. 1989;81:1893-1904.
9. Clark WH Jr, Ainsworth AM, Bernardino EA, Yang EA, Mihm CM. The developmental biology of primary human malignant melanoma. Semin Oncol. 1975;2:83-103.
10. Breslow A. Thickness, cross-sectional area and depth of invasion in the prognosis of cutaneous melanoma. Ann Surg. 1970;172:902-908.
11. Balch CM, Murad TM, Soong SJ, Ingalls AL, Halpern NB, Maddox WA. A multifactorial analysis of melanoma. Ann Surg. 1978;188:732-742.
12. Duncan LM, Berwick M, Bruijn JA, Byers HR, Mihm MC, Barnhill RL. Histopathologic recognition and grading of dysplastic melanocytic nevi: an interobserver agreement study. J Invest Dermatol. 1993;100(suppl):318S-321S.
13. Hartge P, Holly EA, Halpern A, et al. Recognition and classification of clinically dysplastic nevi from photographs: a study of interobserver variation. Cancer Epidemiol Biomarkers Prev. 1995;4:37-40.
14. Piepkorn MW, Barnhill RL, Cannon-Albright LA, et al. A multiobserver, population-based analysis of histologic dysplasia in melanocytic nevi. J Am Acad Dermatol. 1994;30(pt 1):707-714.
15. Berwick M, Begg CB, Fine JA, Roush GC, Barnhill RL. Screening for cutaneous melanoma by skin self-examination. J Natl Cancer Inst. 1996;88:17-23.
16. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29-36.
17. Swets JA. Measuring the accuracy of diagnostic systems. Science. 1988;240:1285-1293.
18. Swets JA. Signal Detection Theory and ROC Analysis in Psychology and Diagnostics. Mahwah, NJ: Lawrence Erlbaum Associates; 1996:chap 4.
19. Wilks DS. Forecast verification. In: Wilks DS, ed. Statistical Methods in the Atmospheric Sciences. San Diego, Calif: Academic Press Inc; 1995:233-283.
20. Arkes HR, Dawson NV, Speroff T, et al. The covariance decomposition of the probability score and its use in evaluating prognostic estimates: SUPPORT Investigators. Med Decis Making. 1995;15:120-131.
21. Knorr KL, Hilsenbeck SG, Wenger CR, et al. Making the most of your prognostic factors. Breast Cancer Res Treat. 1992;22:251-262.
22. Camma C, Garofalo G, Almasio P, et al. A performance evaluation of the expert system "Jaundice" in comparison with that of three hepatologists. J Hepatol. 1991;13:279-285.
23. Dolan JG, Bordley DR, Mushlin AI. An evaluation of clinicians' subjective prior probability estimates. Med Decis Making. 1986;6:216-223.
24. Chen YT, Dubrow R, Holford TR, et al. Malignant melanoma risk factors by anatomic site. Int J Cancer. 1996;67:636-643.
25. Murphy AH. What is a good forecast? an essay on the nature of goodness in weather forecasting. Weather Forecast. 1993;8:281-293.
26. Epstein ES. Long-range weather prediction: limits of predictability and beyond. Weather Forecast. 1988;3:69-75.
27. Brier GW. Verification of forecasts expressed in terms of probability. Monthly Weather Rev. 1950;78:1-3.
28. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models. Stat Med. 1996;15:361-387.
29. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York, NY: John Wiley & Sons Inc; 1989:chap 5.
30. Efron B, Gong G. A leisurely look at the bootstrap, the jackknife, and cross validation. Am Stat. 1983;37:36-48.
31. Campbell DT, Stanley JC. Experimental Design and Quasi-Experimental Designs for Research. Boston, Mass: Houghton Mifflin Co; 1963:1-84.
32. Rothman KJ. Modern Epidemiology. Boston, Mass: Little Brown & Co Inc; 1986:chap 7.
33. Silberman G, Droitcour JA, Scullin EW. Cross Design Synthesis: A New Strategy for Medical Effectiveness Research. Gaithersburg, Md: US General Accounting Office; 1992.
34. Hosmer DW, Lemeshow S. Goodness of fit tests for multiple logistic regression model. Commun Stat. 1980;A10:1063-1069.
35. Justice AC. The Development, Validation, and Evaluation of Prognostic Systems: An Application to the Acquired Immunodeficiency Syndrome (AIDS). Philadelphia: University of Pennsylvania; 1996:1-272.
36. Hlatky MA, Califf RM, Harrell FE Jr, et al. Clinical judgment and therapeutic decision making. J Am Coll Cardiol. 1990;15:1-14.
37. Hlatky MA, Mark DB, Harrell FE Jr, Lee KL, Califf RM, Pryor DB. Rethinking sensitivity and specificity. Am J Cardiol. 1987;59:1195-1198.
38. Harrell FE Jr, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984;3:143-152.
39. Justice AC, Aiken LH, Smith HL, Turner BJ. The role of functional status in predicting inpatient mortality with AIDS. J Clin Epidemiol. 1996;49:193-201.
40. Yates FJ. External correspondence: decompositions of the mean probability score. Organ Behav Hum Perform. 1982;30:132-156.
41. Redelmeier DA, Bloch DA, Hickam DH. Assessing predictive accuracy: how to compare Brier scores. J Clin Epidemiol. 1991;44:1141-1146.
42. Murphy-Filkins R, Teres D, Lemeshow S, Hosmer DW. Effect of changing patient mix on the performance of an intensive care unit severity-of-illness model. Crit Care Med. 1996;24:1968-1973.