Evidence-Based Medicine in a Nutshell
A Guide to Finding and Using the Best Evidence in Caring for Patients
Michael Bigby, MD
Arch Dermatol. 1998;134:1609-1618.
ABSTRACT
Evidence-based medicine (EBM) is the use of the best current evidence in making decisions about the care of individual patients. Practicing EBM requires recognition that in most encounters with patients, questions arise that should be answered to provide the patient with the best available medical care. Asking well-built clinical questions that contain 4 elements (a patient or problem, an intervention, a comparison intervention [if necessary], and an outcome) is an important step in practicing EBM. Once appropriate questions have been formulated, the best way to find most types of best evidence is to search the MEDLINE database by computer. MEDLINE searches have inherent software and operator limitations that make their reliability quite variable. One should be aware of these limitations and improve one's skills in searching. The Cochrane Collaboration Controlled Clinical Trials Registry contains more than 190,000 controlled clinical trials and is the best source of evidence about treatment. The quality (strength) of evidence is based on a hierarchy of evidence: results of systematic reviews of well-designed clinical studies, results of 1 or more well-designed clinical studies, results of large case series, expert opinion, and personal experience. Once the best evidence has been found, the EBM approach involves critically appraising the quality of the evidence, determining its magnitude and precision, and applying it to the specific patient. Guidelines to critically appraise and apply evidence are available. The clinical question, best evidence, and its critical appraisal should be saved in a format that can be easily retrieved for future use.
INTRODUCTION
Evidence-based medicine (EBM) is the conscientious, explicit, and judicious use of the best current evidence in making decisions about the care of individual patients.1 Practicing EBM requires 4 steps: formulating well-built clinical questions, finding the best evidence to answer the questions, critically appraising the evidence, and applying the evidence to specific patients. The guiding principles and skills required to practice EBM will be briefly reviewed.
To help understand and use the principles and techniques of EBM, consider the following commonly encountered patient. An otherwise healthy, middle-aged man who has severely dystrophic toenails specifically wants to know (1) whether he has toenail fungus and (2) what is the best way to get rid of it.
On physical examination, 6 of his toenails are severely dystrophic and have subungual debris under the distal one half to one third. After being examined and discussing his diagnosis, potential treatments, and their side effects, he also wants to know (3) how frequently the treatment causes liver disease.
Practicing EBM requires recognition that in most encounters with patients, questions arise that should be answered to provide the patient with the best available medical care. In many cases, we know the answers to these questions and are able to deliver the best available care. In many instances, the answers to the questions exist and we have not yet accessed them. Finally, in many cases, the answers to the questions are not known because they have not been adequately addressed or addressed at all.
Asking well-built clinical questions may be the most important step in practicing EBM. A well-built clinical question has 4 elements: a patient or problem, an intervention, a comparison intervention (if necessary), and an outcome.2-3 Examples of well-built clinical questions about our patient with dystrophic toenails might include the following:
In a patient with dystrophic toenails, should a potassium hydroxide (KOH) test or culture be done to establish a diagnosis of onychomycosis?
In a patient with onychomycosis, would treatment with terbinafine or itraconazole be more likely to lead to cured nails (ie, normal nails with negative cultures and KOH test results)?
In an otherwise healthy, middle-aged, male patient, how frequently does terbinafine or itraconazole cause clinically significant liver disease?
Note that each question has a specific patient, intervention, comparison intervention, and explicit clinical outcome.
Well-built clinical questions about individual patients can be grouped into several categories: diagnosis, therapy, prognosis, harm, and prevention. A major benefit of careful and thoughtful question forming is that the search for evidence is easier.2-3 The well-formed question makes it relatively straightforward to elicit and combine the appropriate terms needed to represent your need for information in the query language of whichever searching service of the MEDLINE database is available to you.2-3 Having to formulate well-built clinical questions will also train you to clearly define your patient, be specific about the interventions used, and choose carefully described and precise desired outcomes.
Once appropriate questions have been formulated, what are the best sources for the best evidence to answer these questions? Potential sources include personal experience, colleagues or experts, textbooks, articles published in journals, and systematic reviews. An important principle of EBM is that the quality (strength) of evidence is based on a hierarchy of evidence. In descending order, this hierarchy consists of results of systematic reviews of well-designed studies, results of 1 or more well designed studies, results of large case series, expert opinion, and personal experience.4-5 The ordering of this hierarchy has been widely discussed, actively debated, and sometimes hotly contested.3, 6-12
A systematic review is an overview that contains a thorough, unbiased search of the relevant literature, explicit criteria for assessing studies, and structured presentation of the results. A systematic review that uses quantitative methods to summarize results is a meta-analysis.13-15 Meta-analysis provides an objective and quantitative summary of evidence that is amenable to statistical analysis.16 Meta-analysis is credited with allowing recognition of important treatment effects by combining the results of small trials that individually lacked the power to demonstrate differences among treatments. For example, the benefit of intravenous streptokinase in acute myocardial infarction was recognized by the results of a cumulative meta-analysis of smaller trials at least a decade before it was recommended by experts and before it was demonstrated to be efficacious in large clinical trials.16-17 Meta-analysis has been criticized for the discrepancies between the results of meta-analysis and those of large clinical trials.6-8,16 For example, results of a meta-analysis of 14 small studies of calcium to treat preeclampsia showed benefit of treatment, whereas a large trial failed to show a treatment effect.16 The frequency of discrepancies ranges from 10% to 23%.16 Discrepancies can often be explained by differences in treatment protocols, heterogeneity of study populations, or changes that occur over time.16 Not all systematic reviews and meta-analyses are equal. Methods for assessing the quality of each type of analysis are available.6-8,13, 18
The type of clinical study that constitutes best evidence is determined by the category of the question being asked. Questions about diagnosis are best addressed by cohort studies.19-20 Questions about therapy and prevention are best addressed by randomized controlled trials.21-22 Questions about prognosis and harm are best addressed by cohort studies or case-control studies.23 Methods for assessing the quality of each type of evidence are available.13, 18 Examples of studies to address questions of diagnosis, treatment, and harm will be illustrated.
Expert opinion can be valuable particularly for rare conditions in which the expert has the most experience or when other forms of evidence are not available. Experts should be aware of the quality of evidence that exists. Whereas personal experience is an invaluable part of becoming a competent physician, the pitfalls of relying too heavily on personal experience have been widely documented.9, 11, 24
All of these sources are useful under certain circumstances. However, how does one go about finding the best evidence? The single best way to find most types of best evidence in dermatology is to search the MEDLINE database by computer.3, 9 MEDLINE searches have inherent software and operator limitations that make their reliability quite variable.25-29 One should be aware of these limitations. One's skills in searching will improve with experience.
For example, Spuls et al30 conducted a systematic review of systemic treatments of psoriasis. Treatments analyzed included UV-B, psoralen-UV-A, methotrexate, cyclosporine, and retinoids. The authors used an exhaustive strategy to find relevant references, including MEDLINE searches, contacting pharmaceutical companies, polling leading authorities, reviewing abstract books of symposia and congresses, and reviewing textbooks, reviews, editorials, guideline articles, and the reference lists of all articles identified. Of 665 studies found, 356 (53.5%) were identified by MEDLINE search (range, 30% to 70% for different treatment modalities). No references beyond those identified by MEDLINE searching were provided by the 17 of 23 authorities who responded.
MEDLINE is the National Library of Medicine's bibliographic database covering the fields of medicine, nursing, dentistry, veterinary medicine, the health care system, and the preclinical sciences.31 The MEDLINE file contains bibliographic citations and author abstracts from approximately 3900 current biomedical journals published in the United States and 70 foreign countries.31 The file contains approximately 9 million records dating back to 1966.31 Coverage is worldwide, but most records are from English-language sources or have English abstracts.31 Citations for MEDLINE are created by the National Library of Medicine, International MEDLARS partners, and cooperating professional organizations.31 MEDLINE is not the only electronically accessible bibliographic database. EMBASE is Excerpta Medica's database covering drugs, pharmacology, and biomedical specialties.13 EMBASE has better coverage of European and non-English-language sources and may be more up-to-date.13 The overlap in journals covered by MEDLINE and EMBASE is about 34% (range, 10% to 75%, depending on the subject).15, 32 Unfortunately, most currently available platforms for physician MEDLINE searches in the United States are not capable of searching the EMBASE database. EMBASE searches are available in many libraries.
More than 20 vendors of MEDLINE on-line and on CD are available. Haynes et al25 compared several vendors of MEDLINE on-line and on CD to determine which was best in terms of finding relevant articles and excluding irrelevant articles. Assessed on combined rankings for the highest number of relevant and the lowest number of irrelevant citations retrieved, SilverPlatter CD-ROM MEDLINE clinical journal subset performed best for librarians' searches, whereas PaperChase on-line system worked best for clinician searches. For cost per relevant citation, Dialog's Knowledge Index performed best for both librarian and clinician searches.25
MEDLINE SEARCHING
Regardless of the platform used, the key to MEDLINE searching is to find relevant articles and to exclude irrelevant citations. Several useful techniques can greatly aid your ability to accomplish this goal. Searches generally are done on the basis of Boolean combinations of search terms. For example, our search for best evidence about drug treatment of onychomycosis might read [onychomycosis and (terbinafine or itraconazole or fluconazole) and not case reports]. (The convention used throughout for search entries is that Boolean operations within parentheses are done first.) This search would identify articles on onychomycosis with the use of any of the listed drugs and excluding case reports.
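As a rough illustration of how such a Boolean combination behaves, the following Python sketch applies the example search to a handful of invented citation records. The records, their fields, and the `matches` helper are hypothetical and exist only for this illustration; real MEDLINE platforms use their own query syntax and indexing.

```python
# Hypothetical citation records, invented purely for illustration.
records = [
    {"id": 1, "terms": {"onychomycosis", "terbinafine"}, "case_report": False},
    {"id": 2, "terms": {"onychomycosis", "itraconazole"}, "case_report": True},
    {"id": 3, "terms": {"onychomycosis", "griseofulvin"}, "case_report": False},
    {"id": 4, "terms": {"tinea pedis", "terbinafine"}, "case_report": False},
]

DRUGS = {"terbinafine", "itraconazole", "fluconazole"}


def matches(record):
    """onychomycosis and (terbinafine or itraconazole or fluconazole) and not case reports."""
    has_subject = "onychomycosis" in record["terms"]
    has_drug = bool(DRUGS & record["terms"])  # the parenthesized OR group is evaluated first
    return has_subject and has_drug and not record["case_report"]


hits = [record["id"] for record in records if matches(record)]
print(hits)
```

Only the first record satisfies all three conditions: the second is excluded as a case report, the third names none of the listed drugs, and the fourth is not about onychomycosis.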
It is important to understand the difference between text word and Medical Subject Headings (MeSH) searching and to be able to do both.33 Many of the programs used to search the MEDLINE database automatically do text word and MeSH searches. The MeSH terms include all of the terms in the Medical Subject Headings, a controlled vocabulary of key words used to index MEDLINE.34 Each MEDLINE citation is given a group of MeSH terms that relate to the subject of the article from which it is drawn.34 Frequently, MeSH terms will have an additional subheading, which further defines how the MeSH term relates to the article with which it is associated.26, 34 This subheading is appended to the MeSH term, eg, "onychomycosis diagnosis."
Indexing articles is not an exact science. The MeSH headings assigned by the National Library of Medicine may not coincide with the intent of the author or the majority of searchers for several reasons. Authors may not clearly express their intent, indexers are usually not experts in the field of the article they are indexing, and the mistakes associated with doing repetitive tasks occur.15 Relevant articles may be missed when they are not assigned the appropriate MeSH heading. Irrelevant articles may be included in a MeSH search if they are assigned to the wrong MeSH heading. For example, the Cochrane Collaboration identified major problems in the MEDLINE indexing of randomized controlled trials.15, 35-36
Text word searches allow one to search articles for words within the title and abstract that are important to and coincide with the intent of the author. However, text word searches are subject to several problems. Authors may not describe their methods or objectives well or may make errors in spelling, omission, or commission.15 The problem of misspelling can be illustrated by doing a text word search for pruritis (pruritus spelled incorrectly). This search will yield more than 40 references in which the word has been misspelled. Many of these references may not be detected in a search for pruritus (spelled correctly).
Boolean topic searches will often contain too many references or too few. They may contain many irrelevant citations and miss many relevant citations. Several techniques will help make searches more sensitive (ie, pick up relevant citations) and more specific (ie, exclude irrelevant citations).26 To increase the sensitivity of searches, searching both text word and MeSH headings, "exploding" MeSH headings, and using truncation may be helpful. The MeSH term searches can be exploded to include all terms that are logical subsets of the term entered.37 For example, exploding the MeSH term onychomycosis will retrieve all of the articles that use that MeSH term, whether they have subheadings or not. Many of the programs used to search the MEDLINE database automatically explode searches of MeSH terms or MeSH major topics.
Truncation refers to searching by means of the root of a word to allow variants of the word to be detected. For example, a search of onychomycosis and controlled clinical trial will detect fewer studies than a search for onychomycosis and control$ (where control$ contains a wild card that will allow detection of all words that begin with the root control). Truncation can be performed on text word and MeSH heading searches.
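The effect of truncation can be sketched with a regular expression that matches any word beginning with a given root. This is only an analogy for what a search platform does with control$; the titles below are invented, and `truncation_pattern` is a hypothetical helper, not a feature of any MEDLINE product.

```python
import re


def truncation_pattern(root):
    """Build a pattern matching any word that starts with `root` (the role of `$` in control$)."""
    return re.compile(r"\b" + re.escape(root) + r"\w*", re.IGNORECASE)


# Invented titles, for illustration only.
titles = [
    "A controlled clinical trial of terbinafine in onychomycosis",
    "Placebo controls in antifungal studies",
    "Onychomycosis: a case series",
]

pattern = truncation_pattern("control")
found = [title for title in titles if pattern.search(title)]
print(len(found), "of", len(titles), "titles contain a word beginning with 'control'")
```

A search for the exact phrase "controlled clinical trial" would find only the first title, whereas the truncated root also picks up "controls" in the second.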
To increase the specificity of searches, selecting specific subheadings of MeSH terms and limiting the search may be helpful. The MeSH heading terms can be limited to specific subheadings to help to narrow search results to relevant articles. For example, onychomycosis has subheadings that restrict retrieved articles to ones dealing with diagnosis or drug treatment.37 Searches can be limited in many ways, including publication type, language, human subjects, and date of publication. Restricting the publication type to randomized controlled trial or case control study is useful to limit retrieved articles to those of highest quality.
Performing a sensitive or specific search from scratch is often a time-consuming task. Arriving at an efficient search strategy to suit one's particular needs is sometimes a work of art. Once accomplished, it is important to be able to edit, save, and retrieve the search strategy. The saved strategy can then be used in future searches of different subjects without having to rethink or retype the whole search procedure.
The methods for performing these techniques (text word and MeSH searching, exploding, truncation, using subheadings, limiting, and saving) vary by platform used. Mastering them will greatly improve searching efficiency.
Specific search strategies, "filters," have been developed to help find relevant references and exclude irrelevant references for best evidence about diagnosis, therapy, prognosis, harm, and prevention.26 These filters have been incorporated into the PubMed search engine of the National Library of Medicine and are available at http://www.ncbi.nlm.nih.gov/PubMed/clinical.html.38 Some of the filters will be discussed below.
OTHER SOURCES OF EVIDENCE
Several specialized sources for finding best evidence are available. The Cochrane Collaboration maintains a Controlled Clinical Trials Registry that contains more than 190,000 controlled clinical trials. The database is compiled by searching the MEDLINE and EMBASE databases and hand searching many journals. It is the most complete database of randomized controlled trials and is the best source for evidence about treatment. The Controlled Clinical Trials Registry can be searched easily with the use of simple Boolean combinations of search terms and by more sophisticated search strategies as already discussed. The Controlled Clinical Trials Registry is published as part of the Cochrane Library. The Cochrane Library is available on a subscription basis on CD, and on the World Wide Web from Update Software (http://www.cochrane.co.uk). Subscriptions to the Cochrane Library are updated quarterly but are unfortunately relatively expensive. The Cochrane Library is not available in many libraries but should be.
Bullet points of EBM (reviews and EBM-related articles) are published in Bandolier, which is available on the Internet at http://www.jr2.ox.ac.uk/Bandolier/.39 Reviews of treatment of warts and on the new oral antifungals have appeared in it. Structured abstracts of articles are published in the journals ACP Journal Club and Evidence-Based Medicine. The articles are strictly selected on the basis of methodological quality and are accompanied by commentary putting the information in clinical perspective. All of the articles from ACP Journal Club and Evidence-Based Medicine from January 1991 to December 1996 are available on CD. Systematic reviews produced by members of the Cochrane Collaboration are published in the Cochrane Library. However, these sources contain few articles on dermatological topics. This lack of systematic reviews of dermatological topics may be corrected by the Cochrane Skin Group, which was formed in 1996. The members of the Cochrane Skin Group prepare systematic reviews of dermatological topics. These reviews will be available in the Cochrane Library.
Once the best evidence has been found, the EBM approach involves critically appraising the quality of the evidence, determining its magnitude and precision, and applying it to the specific patient. Examples of finding and using evidence about diagnosis, treatment, and harm will be illustrated.
EBM APPROACH TO DIAGNOSIS
Our first clinical question was "In a patient with dystrophic toenails, should a KOH test or culture be done to establish a diagnosis of onychomycosis?" A suggested strategy to search the MEDLINE database for evidence about diagnosis is to combine the subject or subjects with a combination of terms as follows.26 (The terms used in the examples were specifically designed for use with OVID to search the MEDLINE database. They may have to be modified for use with other searching platforms. The use of OVID is not an endorsement. It is an available platform with which I have some facility.)
Subject(s):
1. onychomycosis
Terms to use for maximum sensitivity of the search:
2. sensitivity-and-specificity (MeSH) or
3. sensitivity (text word) or
4. diagnosis (subheading) or
5. diagnostic use (subheading) or
6. specificity (text word)
Terms to use for maximum specificity of the search:
7. explode sensitivity-and-specificity (MeSH) or
8. (predictive and value$) (text word)
If in a hurry, the best 1-term strategy:
9. sensitivity (text word)
For example, with the use of OVID to search the MEDLINE database, a specific search for tests to establish a diagnosis of onychomycosis would be [1 and (7 or 8)], a sensitive search [1 and (2 or 3 or 4 or 5 or 6)], and a quick search (1 and 9). The specific search yielded 10 references. Only 1 was relevant to the laboratory diagnosis of onychomycosis. However, it used a technique not readily available in most settings.40 The sensitive search yielded 133 references. Scanning the titles of the sensitive search yielded 4 potentially useful articles,41-44 1 of which was a report by Davies44 on the analysis of the results of examining 3995 samples of nails suspected of having Trichophyton rubrum infection during a therapeutic trial of griseofulvin. It appears to be an appropriate article to determine the utilities of the KOH test and culture in the diagnosis of onychomycosis. In my search for evidence it was the best source.
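The way the numbered search statements combine can be made explicit with a short Python sketch. The term labels are copied from the list above as plain text; the helper functions `any_of`, `all_of`, and `expand` are hypothetical conveniences for this illustration and do not reproduce OVID's actual command syntax.

```python
import re

# Numbered search statements as listed above, kept as plain-text labels.
terms = {
    1: "onychomycosis",
    2: "sensitivity-and-specificity (MeSH)",
    3: "sensitivity (text word)",
    4: "diagnosis (subheading)",
    5: "diagnostic use (subheading)",
    6: "specificity (text word)",
    7: "explode sensitivity-and-specificity (MeSH)",
    8: "(predictive and value$) (text word)",
    9: "sensitivity (text word)",
}


def any_of(*numbers):
    """Join statement numbers with OR inside parentheses."""
    return "(" + " or ".join(str(n) for n in numbers) + ")"


def all_of(*parts):
    """Join statement numbers or groups with AND."""
    return " and ".join(str(p) for p in parts)


def expand(strategy):
    """Replace each statement number with its term label."""
    return re.sub(r"\d+", lambda m: terms[int(m.group())], strategy)


specific = all_of(1, any_of(7, 8))
sensitive = all_of(1, any_of(2, 3, 4, 5, 6))
quick = all_of(1, 9)

for name, strategy in [("specific", specific), ("sensitive", sensitive), ("quick", quick)]:
    print(f"{name}: {strategy}")
print(expand(quick))
```

Keeping the strategy as numbered statements mirrors how a saved search can be reused: only statement 1, the subject, needs to change for a different diagnostic question.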
The criteria to critically appraise an article about a diagnostic test are shown in Table 1.19-20 Detailed explanation of each criterion is available.19-20 According to these criteria, the Davies article is not an ideal study.44 No criterion standard for the diagnosis of onychomycosis is defined, the spectrum of patients is not well described, and the KOH method is not the same as it is in most dermatologists' offices. Finally, only cultures that yielded T rubrum were considered positive. These major reservations notwithstanding, the data from examining 3995 nail samples are useful. If one defines the criterion standard for the diagnosis of onychomycosis as having a dystrophic nail and either a positive KOH test result or culture (and assumes that a normal nail indicates absence of disease), then the results of KOH testing of nail samples are shown in Table 2. The results allow calculation of the likelihood ratio for a positive KOH test result in evaluating patients with dystrophic nails. The likelihood ratio is the percentage of people with the disease who have a positive test result divided by the percentage of people who do not have the disease who have a positive test result. The likelihood ratio is traditionally taught as the sensitivity divided by 1 minus the specificity. With these assumptions, the likelihood ratio of a positive KOH test result is 15. (If one defines the criterion standard for the diagnosis of onychomycosis as a dystrophic nail and a positive culture [and assumes that a normal nail indicates absence of disease], then the likelihood ratio of a positive KOH test result is 4.)
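The definition of the likelihood ratio translates directly into a few lines of Python. In the sketch below, the sensitivity of 0.88 is the figure this article attributes to Davies' data; the specificity of 0.94 is an assumed value chosen only to illustrate the arithmetic and is not taken from Table 2.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """Likelihood ratio of a positive test: sensitivity / (1 - specificity)."""
    false_positive_rate = 1 - specificity
    return sensitivity / false_positive_rate


sensitivity = 0.88          # reported in this article for the KOH test (Davies' data)
assumed_specificity = 0.94  # ASSUMPTION for illustration only; not from Table 2

lr_positive = positive_likelihood_ratio(sensitivity, assumed_specificity)
print(f"LR+ = {lr_positive:.1f}")
```

With these inputs the ratio comes out close to the value of 15 quoted above, which shows how strongly the likelihood ratio depends on a small false-positive rate in the denominator.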
Table 1. Critical Appraisal of an Article About a Diagnostic Test*
Table 2. Results of Potassium Hydroxide Testing of Nail Samples*
For the likelihood ratio to be useful, one has to have an idea of how likely the disease is to be present before the test is done (ie, the pretest probability) and a sense of how certain one needs to be to conclude that the patient has the disease and to act on it. Whether formally or informally, physicians develop thresholds of certainty at or above which they are comfortable with establishing a diagnosis and acting on the diagnoses. Action may take the form of communicating the diagnosis or prognosis to the patient, prescribing treatment, or referring the patient. When historical and physical evidence leads a clinician to suspect a diagnosis but the degree of certainty does not exceed the threshold for establishing a diagnosis, a test is done to increase the probability that the disease is present above the clinician's threshold for action.45
Returning to our patient with dystrophic nails, how likely is he to have onychomycosis? Several pieces of evidence from the literature are helpful in this regard. From Davies' data,44 913 (46%) of 1198 hyperkeratotic or discolored nails showed fungus by microscopy or culture. Data from an outpatient-based, cross-sectional survey indicated that 133 (53%) of 252 patients with dystrophic nails had septate hyphae identified on direct microscopy with KOH and calcofluor.46 Thus, the pretest probability of onychomycosis in our patient is likely to be around this range.
Once the pretest probability is known or estimated and the likelihood ratio is determined, a nomogram can be used to estimate the posttest probability (Figure 1). In our case, the posttest probability of onychomycosis is 90% to 95%, a figure most would agree is sufficient to establish a diagnosis and act on it.
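The nomogram is a graphical shortcut for a simple calculation: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back to a probability. A minimal Python sketch, using the pretest range of 46% to 53% and the likelihood ratio of 15 from the text (the function name is my own), is:

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, and convert back."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


for pretest in (0.46, 0.53):
    posttest = posttest_probability(pretest, 15)
    print(f"pretest {pretest:.0%} -> posttest {posttest:.0%}")
```

Both ends of the pretest range give posttest probabilities of roughly 93% to 94%, within the 90% to 95% read from the nomogram.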
Figure 1. Nomogram for determining the posttest probability. To determine the posttest probability, draw a straight line through the pretest probability and the likelihood ratio and read the posttest probability on the right. From Sackett et al.18
Once you have made the effort to ask a well-built clinical question, find the best evidence, and critically appraise it, you should save your analysis of it in a place and format in which it can be easily retrieved for future use. Worksheets for recording evidence about articles dealing with diagnosis, therapy, prognosis, harm, and prevention are available (ftp://cebm.jr2.ox.ac.uk/pub/centre/docs/workshee.doc).18 These worksheets can be saved electronically or filed physically. Shelley Roaten, Jr, MD, wrote a program to assist in evaluating medical journal articles (ftp://141.2.61.3/MedArchiv/liteval3.zip). It includes the ability to recheck some of the author's statistical results. This program assists with the critical evaluation of original articles in medical journals, with a menu of 5 common article types. The program serves as a memory aid and produces a record of your observations. It is adapted primarily from Sackett et al.24
Since the Davies study44 is not ideal, it would be nice to have independent evidence to support our estimate of the likelihood ratio. Some corroboration is provided by results of a placebo-controlled trial of terbinafine in the treatment of onychomycosis.47 In this study, all patients had positive results of cultures and positive KOH test results on entry into treatment with terbinafine or placebo. At the conclusion of the study, 9% of placebo-treated patients had both negative results of cultures and negative KOH test results, 20% had negative KOH test results, and 30% had negative results of cultures. Assuming that patients with both negative results of cultures and negative KOH test results were spontaneously cured, one can estimate that 11% of the negative KOH test results (20%-9%) and 21% of the negative results of cultures (30%-9%) were false negatives. Therefore, the sensitivity of the KOH test in this study was 89% (sensitivity=100%-% false negative), a figure similar to that obtained from the data of Davies (88%). In addition, 1707 (83%) of 2065 specimens from patients screened for a multicenter clinical study of terbinafine were positive by direct microscopic examination with KOH and calcofluor.48
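The reasoning behind the 89% estimate can be written out as a short calculation. The sketch below uses only the percentages quoted from the placebo group; the function name is hypothetical, and the culture figure is simply the same arithmetic applied to the 30% value rather than a result reported by the trial.

```python
def estimated_sensitivity(negative_test_rate, spontaneous_cure_rate):
    """Treat negatives beyond the spontaneous cure rate as false negatives."""
    false_negative_rate = negative_test_rate - spontaneous_cure_rate
    return 1.0 - false_negative_rate


spontaneous_cure = 0.09  # both culture and KOH negative at the end of the study

koh_sensitivity = estimated_sensitivity(0.20, spontaneous_cure)
culture_sensitivity = estimated_sensitivity(0.30, spontaneous_cure)

print(f"KOH: {koh_sensitivity:.0%}, culture: {culture_sensitivity:.0%}")
```

The estimate rests entirely on the assumption that every patient with two negative tests was truly cured; if some of them still had disease, both sensitivities would be lower.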
This example illustrates one of the major challenges of practicing evidence-based dermatology. Many questions cannot be adequately addressed by available data. However, it also illustrates some important points. First, high-quality evidence about diagnostic tests in dermatology is not readily available in the literature. Sensitive search strategies and scanning long lists of articles will most likely be needed to find the best evidence. Second, since we must take care of patients on the basis of what we have, the best available, albeit imperfect, evidence must be used. Many valid objections can be raised to the choice of evidence and the calculations used in our example. However, they are corroborated by independent observations and serve as a reasonable or best approximation to the truth. Finally, not finding high-quality evidence serves to identify areas for badly needed clinical studies. (I have not exactly answered my clinical question in this illustration. To compare KOH and culture would involve determining the sensitivity, specificity, and likelihood ratio for each test, and determining the difference in their ability to diagnose onychomycosis. Readers are directed to books and Web sites listed at the end of this article for a discussion of comparing diagnostic tests.)
EBM APPROACH TO TREATMENT
The second clinical question was, "In a patient with onychomycosis, would treatment with terbinafine or itraconazole be more likely to lead to cured nails (ie, normal nails with negative results of culture and negative KOH test results)?" A suggested strategy to search the MEDLINE database for evidence about therapy is to combine the subject or subjects with a combination of terms as follows26:
Subjects
1. onychomycosis
2. terbinafine
3. itraconazole
For maximum sensitivity:
4. randomized-controlled-trial (publication type, limit) or
5. drug trial (subheading) or
6. therapeutic use (subheading) or
7. random$ (text word)
For maximum specificity:
8. double and blind$ (text word) or
9. placebo$ (text word)
Best 1-term strategy:
10. clinical-trial (publication type, limit)
For example, with the use of OVID, a specific search for therapy for onychomycosis with terbinafine or itraconazole would be [1 and (2 or 3) and (8 or 9)], a sensitive search would be [1 and (2 or 3) and (4 or 5 or 6 or 7)], and a quick search would be [1 and (2 or 3) and 10]. Therapeutic trials are the easiest evidence to find. Specific search strategies are likely to yield the best evidence most efficiently. Using the specific search strategy with OVID yielded 27 references, 18 of which were randomized controlled trials of terbinafine or itraconazole in the treatment of toenail onychomycosis. One of them, an article by De Backer et al49 comparing continuous terbinafine and itraconazole, will be used for illustration. It appears to be an appropriate article for comparing terbinafine and itraconazole in the treatment of onychomycosis. (The intent is to illustrate the EBM approach to an article about therapy. Systematic reviews and economic analyses of the newer oral treatments of onychomycosis are available to actually answer the question about the best treatment for toenail onychomycosis.50-53)
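The claim that therapeutic trials are the easiest evidence to find can be checked against the yields reported in this article. The sketch below computes the fraction of retrieved references that proved useful for each search (information scientists often call this precision, a term the article does not use). The counts come from the text; treating the "potentially useful" diagnostic articles as useful is a simplifying assumption.

```python
# (search, references retrieved, references judged useful), as reported in the text.
searches = [
    ("diagnosis, specific", 10, 1),
    ("diagnosis, sensitive", 133, 4),
    ("therapy, specific", 27, 18),
]


def useful_fraction(retrieved, useful):
    """Fraction of retrieved references that were judged useful."""
    return useful / retrieved


yields = {name: useful_fraction(retrieved, useful) for name, retrieved, useful in searches}

for name, fraction in yields.items():
    print(f"{name}: {fraction:.0%} of retrieved references were useful")
```

Two thirds of the references from the specific therapy search were randomized controlled trials, compared with one tenth or less for either diagnostic search.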
The criteria to critically appraise an article about therapy are shown in Table 3.21-22 Detailed explanation of each criterion is available.21-22 With these criteria, the De Backer et al article49 has some problems in its reporting of methods but does appear to have adequately concealed random allocation (a blocked design may have been used), complete follow-up, blinding, equal treatment, and similar treatment groups. In this trial, randomly allocated patients were treated with 200 mg of itraconazole or 250 mg of terbinafine daily for 12 weeks, and patients were followed up for 48 weeks.
Table 3. Critical Appraisal of an Article About Therapy*
In interpreting the results of clinical trials, what matters most is whether investigators have been able to detect a medically significant difference in treatments, how large the difference is likely to be, and how precise the investigators' estimate of the difference is.9, 54-56 The percentages of patients achieving clinically normal nails were 41% (76 of 186 patients) and 33% (61 of 186) for terbinafine and itraconazole, respectively. The difference in response rates was 8%, and the 95% confidence interval of the difference in response rates was -2% to 18% (Table 4). Interestingly, authors of studies of the treatment of onychomycosis rarely report the percentage of patients who achieve cured nails (normal nails and negative results of mycological tests).
Table 4. Rates of Response to Treatment*
Reporting trial results with confidence intervals is an alternative or complement to presenting the results with statistical significance testing.9 The calculation and interpretation of confidence intervals have been extensively described.9, 18, 22, 24, 54, 57-62 In simple terms, the reported result provides the best estimate of the treatment effect and the confidence interval provides a measure of the precision of the estimate.21-22,54, 63 The confidence interval provides a range of values in which the "population" or true response to treatment lies.21-22,54, 63 For example, the population or true response to treatment has only a 1 in 20 chance of being outside of the 95% confidence interval. Alternatively, if the trial is repeated many times, 95% of the confidence intervals produced will contain the true or population mean response to treatment. The true response rate will most likely lie near the middle of the confidence interval and will rarely be found at or near the ends of the interval. If the 95% confidence interval of the difference in response rates excludes the zero difference, one can reject the null hypothesis that the 2 treatments are the same.22
In our example, the 95% confidence interval of the difference in response rates does not exclude the zero difference; therefore, we cannot reject the null hypothesis that the 2 treatments are the same. The difference in response rates is most likely to be 8% but may be as low as -2% (favoring itraconazole) or as high as 18%. A true difference of 18%, the upper boundary of the interval, would clearly be a medically significant advantage for terbinafine; therefore, a significant treatment advantage of terbinafine over itraconazole may have been missed in this study.
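The interval quoted in this example can be checked with a short calculation. The following is a minimal sketch in Python, assuming the simple normal-approximation (Wald) interval for a difference between 2 proportions; the article does not state which method was used, and the function name is hypothetical. With the reported counts (76 of 186 and 61 of 186), the result rounds to the published figures of 8% and -2% to 18%.

```python
import math

def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in response rates (a minus b) with an approximate 95% CI.

    Assumption: normal-approximation (Wald) interval; z=1.96 gives 95% coverage.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between 2 independent proportions
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Clinically normal nails: terbinafine 76 of 186, itraconazole 61 of 186
diff, lower, upper = risk_difference_ci(76, 186, 61, 186)
print(f"difference = {diff:.0%}, 95% CI = {lower:.0%} to {upper:.0%}")
```

Because the lower limit is below zero, the interval includes the zero difference, which is why the null hypothesis cannot be rejected.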
The number needed to treat (NNT), which is commonly used in EBM, is a useful way to express a medically significant difference in treatments.18, 21-22,24, 64 The NNT is the reciprocal of the difference in response rates and represents the number of patients one would need to treat to achieve 1 additional cure. The NNT of terbinafine vs itraconazole in this study was 13 (1/0.08 = 12.5; NNTs are always rounded up). That is, for every 13 patients treated with terbinafine instead of itraconazole, 1 additional patient would have clinically normal nails. The 95% confidence interval of the NNT is 6 to -59 (Table 4). The negative limit arises because the confidence interval of the difference in response rates includes zero, at which point the NNT is infinite. This interval therefore means that the NNT may be as low as 6 but may go to infinity, and that the NNT of itraconazole vs terbinafine (ie, favoring itraconazole) may be as low as 59 but may go to infinity.
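The NNT arithmetic can be sketched in a few lines of Python. This is an illustration only: the helper name nnt is hypothetical, and the confidence limits of the difference in response rates are recomputed here with an assumed normal-approximation interval, since the article does not state its method. Under that assumption the unrounded limits reproduce the reported values of 13, 6, and 59.

```python
import math

def nnt(risk_difference):
    """Number needed to treat: reciprocal of the absolute difference, rounded up."""
    if risk_difference == 0:
        return math.inf  # no difference: no number of patients yields an extra cure
    return math.ceil(1 / abs(risk_difference))

# Clinically normal nails: terbinafine 76 of 186, itraconazole 61 of 186
n = 186
p_terb, p_itra = 76 / n, 61 / n
diff = p_terb - p_itra

# Assumed normal-approximation 95% limits of the difference in response rates
se = math.sqrt(p_terb * (1 - p_terb) / n + p_itra * (1 - p_itra) / n)
lower, upper = diff - 1.96 * se, diff + 1.96 * se

print("NNT, terbinafine vs itraconazole:", nnt(diff))
print("NNT if the true difference is the upper limit:", nnt(upper))
print("NNT favoring itraconazole if the true difference is the lower limit:", nnt(lower))
```

Because the interval for the difference crosses zero, the NNT interval is not a continuous range from 6 to 59; it runs from 6 out to infinity on the terbinafine side and from infinity back to 59 on the itraconazole side.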
The De Backer et al study was republished with more complete reporting of methods and inclusion of data on patients who achieved total cure (normal nails and negative mycological findings).65 A block randomization design was used and randomization was concealed. An intent-to-treat analysis was performed. The groups were similar at the start of the trial and were treated equally. The percentage of patients who achieved total cure was 38% and 23% for terbinafine and itraconazole, respectively. The difference in cure rates was 14% (95% confidence interval, 5% to 24%). The NNT was 7 (95% confidence interval, 5 to 20).
A stand-alone program (CATmaker) is available to help appraise clinical trials of therapy. It allows entry of well-built clinical questions, the evidence found, the quality features of the trials, and the results of the treatment arms and performs confidence interval and NNT calculations. The resultant clinically appraised topic can be saved as a text file. (To obtain a copy of the CATmaker program to take part in the testing, please write to Douglas Badenoch, NHS R&D Centre for Evidence-Based Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Level 5, The John Radcliffe, Headley Way, Headington, Oxford OX3 9DU, England.)
EBM APPROACH TO HARM
The final clinical question was, "In an otherwise healthy, middle-aged, male patient, how frequently does terbinafine or itraconazole cause clinically significant liver disease?" A suggested strategy for searching the MEDLINE database for evidence about harm is to combine the subject or subjects with a combination of terms as follows26:
Subjects
1. terbinafine
2. itraconazole
For maximum sensitivity:
3. explode cohort-studies (MeSH) or
4. explode risk (MeSH) or
5. odds and ratio$ (text word) or
6. relative and risk (text word) or
7. case and control$ (text word)
For maximum specificity:
8. case-control-studies (MeSH) or
9. cohort-studies (MeSH)
Best 1-term strategy:
10. risk (text word)
For example, with the use of OVID, a specific search for adverse effects of terbinafine or itraconazole would be [(1 or 2) and (8 or 9)], a sensitive search would be [(1 or 2) and (3 or 4 or 5 or 6 or 7)], and a quick search would be [(1 or 2) and 10]. High-quality case-control or cohort studies are not commonly found in the dermatological literature. Sensitive search strategies and scanning the long lists of articles retrieved are likely to yield the best evidence most efficiently.
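The way the numbered search sets are combined can be made explicit with a small Python sketch. It only builds the Boolean expressions quoted above from the set numbers in the list; it does not query OVID or MEDLINE, and the names any_of and combine are hypothetical.

```python
# Set numbers taken from the numbered list of search terms above
SUBJECTS = [1, 2]              # terbinafine, itraconazole
SENSITIVE = [3, 4, 5, 6, 7]    # terms for maximum sensitivity
SPECIFIC = [8, 9]              # terms for maximum specificity
QUICK = [10]                   # best 1-term strategy

def any_of(set_numbers):
    """Join set numbers with 'or'; parenthesize only when there is more than one."""
    joined = " or ".join(str(number) for number in set_numbers)
    return f"({joined})" if len(set_numbers) > 1 else joined

def combine(subjects, filters):
    """Require at least one subject set and at least one filter set."""
    return f"{any_of(subjects)} and {any_of(filters)}"

print("specific: ", combine(SUBJECTS, SPECIFIC))
print("sensitive:", combine(SUBJECTS, SENSITIVE))
print("quick:    ", combine(SUBJECTS, QUICK))
```

The subjects are joined with "or" because an article on either drug is relevant; the methodological filter is then applied with "and" to narrow the result.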
The criteria to critically appraise an article about harm are shown in Table 5.23 Detailed explanation of each criterion is available.23 No case-control or cohort studies of the drugs were detected among 9 articles yielded by means of the maximum-specificity search. To illustrate the idiosyncrasies of MEDLINE searching, the specific search detected 1 postmarketing surveillance study of terbinafine of 10,361 patients66 but not another of 25,884 patients.67
Table 5. Critical Appraisal of an Article About Harm*
Scanning the 195 articles yielded by the sensitive search showed that the best available evidence concerning the toxic effects of terbinafine was the postmarketing surveillance study of 25,884 patients.67 The study group consisted of patients in the United Kingdom, Austria, Germany, and the Netherlands who were treated, enrolled, and followed up by their own physicians. There was no control group. Eighty-nine percent of patients had no adverse events, 4.6% had gastrointestinal tract events (nausea, diarrhea, pain, and dyspepsia), and 2.3% had dermatological events (rash, pruritus, urticaria, and "eczema"). Serious adverse events judged to be possibly or probably related to terbinafine were erythroderma in a patient with a history of psoriasis and erythema multiforme in another patient. Serious cholestatic jaundice developed in 1 patient, and 1 patient developed neutropenia. This postmarketing surveillance study suggests that serious adverse events occur infrequently among patients taking terbinafine. The study can be criticized for a poor study design that may have underestimated the frequency of adverse events.
The evidence from the search regarding adverse effects of itraconazole was limited to review articles, economic analyses, clinical trials, and systematic reviews.50-53,68-73 In clinical trials, the average dropout rate was approximately 4%. The most common symptoms associated with itraconazole therapy were gastrointestinal tract symptoms, dizziness, pruritus, and headache. Transient elevation of aminotransferase levels (greater than twice normal values) occurred in 4% of patients receiving continuous therapy for onychomycosis.50-53
Controlled clinical trials, which typically enroll small numbers of patients, are often not adequate for determining the long-term toxic effects of newly introduced treatments.23, 74 If no untoward reactions occur in a study, it is still possible to determine the 95% confidence interval of the rate of adverse reactions. Accurate upper limits of the reaction rate can be obtained by referring to available tables.60, 75-76 A reasonable approximation of the upper limit of the rate of adverse reactions in this situation can be obtained by dividing 3 by the number of patients who were exposed to the studied treatment.60, 76-77
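The approximation of dividing 3 by the number of exposed patients can be compared with the exact upper limit in a brief Python sketch. The function names and the sample size of 100 patients are hypothetical and chosen only for illustration. The exact limit used here is the largest event rate p for which observing zero events among n patients still has a probability of 5%, ie, the solution of (1 - p)^n = 0.05.

```python
def rule_of_three(n_exposed):
    """Approximate upper 95% limit of the event rate when 0 events were observed."""
    return 3 / n_exposed

def exact_upper_limit(n_exposed, alpha=0.05):
    """Exact upper limit for 0 observed events: solve (1 - p) ** n = alpha for p."""
    return 1 - alpha ** (1 / n_exposed)

n = 100  # hypothetical number of exposed patients with no adverse reactions
print(f"rule of 3:   {rule_of_three(n):.2%}")
print(f"exact limit: {exact_upper_limit(n):.2%}")
```

The approximation is slightly conservative (it gives a marginally higher limit than the exact calculation), and the 2 values converge as the number of exposed patients grows.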
Returning to our hypothetical patient, his pretest probability of having onychomycosis is about 50%.44, 46 Obtaining a positive result on the KOH test, which is quick, easy, and inexpensive, would increase the probability to 90% to 95%.19-20,46, 48 If the KOH test result is negative, options might include repeating the test, doing a culture, or histologically examining nail clippings.41-44,48 The evidence for each of these approaches can be analyzed by evidence-based methods.19-20 One randomized, double-blind, controlled trial indicates that terbinafine is the best treatment of toenail onychomycosis for the goal of producing a normal nail with negative mycological findings.65 This goal was achieved by 38% of terbinafine-treated patients. If one is willing to accept "improvement" in the appearance of the nail and negative mycological findings, continuous terbinafine is the most effective treatment.50-52,78-81 Based on postmarketing surveillance studies and clinical trials, the incidence of clinically significant liver disease is low with terbinafine or itraconazole.50-51,53, 67, 73 However, high-quality case-control and cohort studies have not been performed.
CONCLUSIONS