Interpretation of confidence intervals
The question “How statistically significant are the results of a study?” can be answered in several ways. For example, the difference in mean blood-pressure lowering effect between a new drug and placebo from a specific study can be expressed as:
Difference=8.5 mm Hg; P=.03
However, the true effect of the drug is unlikely to be exactly 8.5 mm Hg—if the study were repeated, the results would be somewhat different. The variation is not explained by the P value. The P value is prone to misinterpretation, as will be explained in a future Language of Evidence. It tells us only that a difference is significant; it says nothing about its magnitude or precision.
Confidence intervals are a better alternative. Consider the difference above expressed as a confidence interval:
Difference=8.5 mm Hg; 95% CI, 6.3–10.8
The probability of obtaining a result of 8.5 mm Hg, assuming the true value is not within the interval of 6.3–10.8 mm Hg, is 5% or less. In other words, one wouldn’t get the result above very often if the true value is outside the interval. This interpretation seems convoluted. It is easier to think of it this way: “The true value for the difference is most likely between 6.3 and 10.8. It is unlikely to be outside of this interval.”
3 Important principles
- The wider a confidence interval, the less precise the estimate. Precision depends upon sample size. Therefore, the larger the sample size of a study, the narrower the confidence interval and the better the estimate. In the example above, a similar study with a much larger sample size may yield a narrower interval: difference, 8.5 mm Hg; 95% CI, 7.4–8.9.
- The 90% (or lower) confidence interval for an estimate is narrower than the 95% confidence interval; a 99% confidence interval is wider. This makes sense, since we are surer that a true value lies between 2 widely separated numbers than 2 more narrowly separated numbers.
- If the 95% confidence interval includes “no difference” (in this case, zero change in blood pressure), the corresponding P value must be >.05. “No difference” for a result expressed as a subtraction is zero; for a result expressed as a ratio (such as an odds ratio or a relative risk), it would be 1. It is important to keep this distinction in mind when interpreting confidence intervals.
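The first two principles can be illustrated with a minimal sketch. It assumes a normal-approximation interval (estimate plus or minus a critical value times the standard error). The standard error of 1.15 mm Hg is hypothetical; it is not reported in the example and was chosen only so that the 95% interval comes out close to 6.3–10.8. The function name `confidence_interval` is likewise illustrative.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Return (lower, upper) for a normal-approximation confidence interval."""
    # Two-sided critical value, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * standard_error
    return estimate - margin, estimate + margin


DIFFERENCE = 8.5    # mm Hg, from the example above
ASSUMED_SE = 1.15   # hypothetical standard error, not taken from any study

# A lower confidence level gives a narrower interval; a higher level, a wider one.
for level in (0.90, 0.95, 0.99):
    lower, upper = confidence_interval(DIFFERENCE, ASSUMED_SE, level)
    print(f"{level:.0%} CI: {lower:.1f} to {upper:.1f} mm Hg")

# Assumption: the standard error shrinks with the square root of the sample
# size, so a study four times as large has half the standard error.
narrow = confidence_interval(DIFFERENCE, ASSUMED_SE / 2)
print(f"95% CI with 4x the sample: {narrow[0]:.1f} to {narrow[1]:.1f} mm Hg")
```

Under these assumptions, the interval for the larger study is exactly half as wide as the original 95% interval, and none of the intervals includes zero.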
Correspondence
Goutham Rao, MD, 3518 Fifth Avenue, Pittsburgh, PA 15261. E-mail: raog@msx.upmc.edu.
Number needed to treat
Hundreds of relevant studies of new or existing therapies are published each year. Interpreting the results in a way that is useful to both you and your patients is an important skill.
Consider the recently published Heart Protection Study,1 which assessed the effect of simvastatin on specific cardiovascular outcomes and mortality by comparing it with placebo among 20,536 adults with pre-existing cardiovascular disease. Table 1 summarizes the effect of simvastatin, 40 mg once daily, on all-cause mortality after 5 years.
The proportion of patients in the simvastatin group who died was 1328/10,269 or 0.129 (12.9%); the proportion of the placebo group who died was 1507/10,267 or 0.147 (14.7%). The evidence suggests simvastatin is superior in reducing mortality. But how significant is the difference?
One way to translate these results into a more useful form is to determine the number needed to treat (NNT). The NNT in this case refers to the number of people one would need to treat with simvastatin to prevent 1 death. The first step is to determine the absolute risk reduction (ARR), which is simply the difference in the proportion of outcomes in the two treatment groups. In this case: ARR = 0.147 – 0.129 = 0.018.
The NNT is simply the inverse of the ARR. In this case NNT = 1/0.018 = 56. Therefore, 56 people with cardiovascular disease need to be treated with simvastatin to prevent 1 death in 5 years.
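The arithmetic above can be sketched in a few lines. The function names are illustrative, and rounding the NNT up to the next whole patient is a common convention assumed here rather than something stated in the text.

```python
import math


def absolute_risk_reduction(control_events, control_total,
                            treated_events, treated_total):
    """Difference in the proportion of outcomes between control and treatment."""
    return control_events / control_total - treated_events / treated_total


def number_needed_to_treat(arr):
    """Inverse of the absolute risk reduction, rounded up to a whole patient."""
    if arr <= 0:
        raise ValueError("NNT is only defined here for a positive risk reduction")
    return math.ceil(1 / arr)


# Using the rounded proportions quoted in the text (0.147 and 0.129).
arr_rounded = 0.147 - 0.129
nnt_rounded = number_needed_to_treat(arr_rounded)
print(f"ARR = {arr_rounded:.3f}, NNT = {nnt_rounded}")

# Using the raw counts from Table 1 without intermediate rounding.
arr_exact = absolute_risk_reduction(1507, 10267, 1328, 10269)
print(f"ARR = {arr_exact:.4f}, 1/ARR = {1 / arr_exact:.1f}")
```

With the rounded proportions the sketch gives the NNT of 56 quoted above. Carrying the unrounded proportions through gives a reciprocal of a little over 57, so intermediate rounding shifts the result by a patient or two without changing the interpretation.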
Is this reasonable? There is no absolutely correct answer. An appropriate NNT depends on the risks and benefits of treatment. A higher NNT is tolerable even with significant adverse effects if the treatment prevents a serious outcome such as heart disease or death. Migraine, by contrast, is not life-threatening. Treating 56 migraine sufferers to cure a single headache is unreasonable. The NNT for treatment of migraine with subcutaneous sumatriptan vs placebo is about 2 (reference 2).
TABLE 1
Effects of simvastatin on all-cause mortality
Treatment | Patients | Deaths in 5 years |
---|---|---|
Simvastatin 40 mg | 10,269 | 1328 |
Placebo | 10,267 | 1507 |
1. Collins R, Armitage J, Parish S, Sleigh P, Peto R. MRC/BHF Heart Protection Study of cholesterol-lowering with simvastatin in 5963 people with diabetes: a randomised placebo-controlled trial. Lancet 2003;361:2005-2006.
2. Sumatriptan in acute migraine. Available at: www.jr2.ox.ac.uk/bandolier/booth/Migraine/SumaTH.html. Accessed on July 27, 2003.
Correspondence: Goutham Rao, MD, 3518 Fifth Avenue, Pittsburgh, PA 15261. E-mail: raog@msx.upmc.edu.
What is an ROC curve?
Receiver-operating characteristic (ROC) curves were developed to assess the quality of radar. In medicine, ROC curves are a way to analyze the accuracy of diagnostic tests and to determine the best threshold or “cutoff” value for distinguishing between positive and negative test results.
Diagnostic testing is almost always a tradeoff between sensitivity and specificity. ROC curves provide a graphic representation of this tradeoff. Setting a cutoff value too low may yield a very high sensitivity (ie, no disease would be missed) but at the expense of specificity (ie, a lot of false-positive results). Setting a cutoff too high would yield high specificity at the expense of sensitivity.
Consider a study by Smith et al,1 who measured the accuracy of B-type natriuretic peptide (BNP) as a test for impaired left ventricular function. They measured BNP levels in 155 elderly patients, who also underwent echocardiography (the diagnostic gold standard). An ROC curve was created by plotting the sensitivity against 1–specificity for different cutoff values of BNP (Figure). For example, the sensitivity and specificity of the BNP test were calculated and plotted, assuming a level of 19.8 pmol/L as a cutoff for a positive test.
The best cutoff has the highest sensitivity and lowest 1–specificity, and is therefore located as high up on the vertical axis and as far left on the horizontal axis as possible (upper left corner). The area under an ROC curve is a measure of the usefulness or “discriminative” value of a test in general. The greater the area, the more useful the test. The maximum possible area under the curve is simply a perfect square and has an area of 1.0. Smith et al’s1 curve has an area of 0.85. The diagonal 45° line represents a test that has no discriminative value (ie, it is completely useless).
FIGURE 1
Sample ROC curve
An ROC curve for the accuracy of B-type natriuretic peptide (BNP) as a test for impaired left ventricular function, which plots the sensitivity against 1–specificity for different cutoff values of BNP.
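The construction described above can be sketched as follows. The measurements are hypothetical and are not data from Smith et al. The function names, the rule that a result at or above the cutoff counts as positive, and the choice of "closest to the upper left corner" as the criterion for the best cutoff are all assumptions made for illustration.

```python
import math


def roc_points(values, diseased, cutoffs):
    """Return (cutoff, 1 - specificity, sensitivity) for each cutoff.

    A result is called positive when the value is at or above the cutoff.
    """
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = []
    for cutoff in cutoffs:
        true_pos = sum(1 for v, d in zip(values, diseased) if d and v >= cutoff)
        false_pos = sum(1 for v, d in zip(values, diseased) if not d and v >= cutoff)
        points.append((cutoff, false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the curve, anchored at (0, 0) and (1, 1)."""
    xy = sorted({(x, y) for _, x, y in points} | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(xy, xy[1:]))


# Hypothetical test values; 1 marks disease confirmed by the gold standard.
values = [45, 30, 22, 12, 25, 15, 10, 8, 6, 4]
diseased = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

points = roc_points(values, diseased, sorted(set(values)))
auc = area_under_curve(points)

# Best cutoff: the point closest to the upper left corner (0, 1).
best_cutoff, best_fpr, best_sens = min(
    points, key=lambda p: math.hypot(p[1], 1 - p[2])
)
print(f"AUC = {auc:.3f}")
print(f"Best cutoff = {best_cutoff} "
      f"(sensitivity {best_sens:.2f}, specificity {1 - best_fpr:.2f})")
```

Lowering the cutoff in this sketch moves along the curve toward the upper right (higher sensitivity, lower specificity); raising it moves toward the lower left.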
Correspondence
Goutham Rao, MD, 3518 Fifth Avenue, Pittsburgh, PA 15261. E-mail: raog@msx.upmc.edu.
REFERENCE
1. Smith H, Pickering RM, Struthers A, Simpson I, Mant D. Biochemical diagnosis of ventricular dysfunction in elderly patients in general practice: an observational study. BMJ 2000;320:906-908.
Does using nonsteroidal anti-inflammatory drugs (NSAIDs) during pregnancy increase the risk of adverse events?
BACKGROUND: The safety of NSAIDs is not well documented, even though they are often used during pregnancy. The authors of this study estimated the risk of adverse pregnancy outcomes by examining NSAID prescription use in a large population of Danish women.
POPULATION STUDIED: The researchers assembled 2 separate study groups drawn from one county in Denmark. In one group 1462 pregnant women who had filled prescriptions for NSAIDs were compared with 17,259 controls to assess unfavorable birth outcomes. Those who had filled NSAID prescriptions anywhere from 30 days before conception up to the date of delivery were included. In the other group 4268 women who had first miscarriages were compared with 29,750 primiparous women with live births as controls. Patient information for both study groups was identified using a prescription registry, the Danish birth registry, and the county’s hospital discharge registry.
STUDY DESIGN AND VALIDITY: The incidence of adverse birth outcomes was assessed through a retrospective cohort design. The risk of miscarriage was determined by a case-control study. Information was collected for each subject on the number of NSAID prescriptions filled (specifically ibuprofen 400 or 600 mg), maternal age, smoking status, gravidity, parity, gestational age at delivery, and size of the neonate. A subset of prescription data was verified by examining physician and hospital records.
OUTCOMES MEASURED: Primary outcomes included congenital abnormality (type not specified), low birth weight (less than 2500 g), preterm birth (<37 weeks’ gestation), and miscarriage.
RESULTS: Congenital abnormality, low birth weight, and preterm birth incidence were not higher in offspring of women who had taken an NSAID during pregnancy. Miscarriages were significantly more frequent in women who had filled a prescription for an NSAID the week before miscarriage (odds ratio=6.99; 95% confidence interval, 2.75-17.74). Miscarriage was not associated with prescriptions filled 10 to 12 weeks before the date of miscarriage.
This study contributes valuable information for physicians and their pregnant patients contemplating use of NSAIDs. Women who have used NSAIDs before or during pregnancy may be reassured that there is no evidence of increased risk of congenital abnormality, low birth weight, or preterm birth. Also, women contemplating pregnancy should be warned about the association of miscarriage with NSAIDs. It seems prudent for women with a history of recurrent miscarriage to avoid NSAIDs.
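The reported odds ratio and its interval can be read with a short sketch. It assumes, as is common for ratio measures but not stated in the summary, that the interval was computed symmetrically on the log scale with a critical value of 1.96. The function name is illustrative.

```python
import math

ODDS_RATIO = 6.99
CI_LOWER, CI_UPPER = 2.75, 17.74   # 95% confidence interval from the results


def log_scale_summary(lower, upper, z=1.96):
    """Return (geometric midpoint, standard error of the log odds ratio).

    Assumes the interval is exp(log(OR) +/- z * SE), i.e. symmetric in logs.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    midpoint = math.exp((log_lower + log_upper) / 2)
    se_log = (log_upper - log_lower) / (2 * z)
    return midpoint, se_log


midpoint, se_log = log_scale_summary(CI_LOWER, CI_UPPER)
print(f"Geometric midpoint of the interval: {midpoint:.2f}")
print(f"Implied standard error of log(OR): {se_log:.2f}")

# For a ratio, "no difference" is 1, so the comparison is against 1, not 0.
excludes_no_difference = CI_LOWER > 1 or CI_UPPER < 1
print(f"Interval excludes 1: {excludes_no_difference}")
```

The interval looks lopsided around 6.99 on the ordinary scale, but its geometric midpoint is close to the reported odds ratio, which is consistent with the log-scale assumption. Because the interval excludes 1, the association is statistically significant at the 5% level, although the interval's width shows the estimate is imprecise.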
Diagnostic Yield of Screening for Type 2 Diabetes in High-Risk Patients: A Systematic Review
SEARCH STRATEGIES: The MEDLINE and EMBASE electronic databases were searched for original studies of screening for type 2 diabetes on the basis of risk factors. The reference lists of all reviews, letters, editorials, consensus statements, and guidelines for diabetes screening were searched for additional studies. The Cochrane database was also searched for relevant reviews.
SELECTION CRITERIA: All original studies regarding selective serum screening for type 2 diabetes on the basis of risk factors were included.
MAIN RESULTS: Seven studies were selected for review. Three studies were cross-sectional in design; 3 employed survey data to develop computerized statistical models that used risk factors to identify cases of type 2 diabetes; and 1 used a similar method, but the resulting model was field tested in a separate population. No study describes a risk-factor-based method or instrument that helps substantially in the diagnosis of type 2 diabetes.
CONCLUSIONS: Selective screening for type 2 diabetes on the basis of risk factors cannot be recommended. Serum screening can be offered to patients who present with typical symptoms of diabetes.
Diabetes is responsible for half of all nontraumatic amputations, 15% of blindness, and more than a third of all end-stage renal disease. The costs attributed to this disease total more than $100 billion annually.1
Identifying new patients with diabetes has always been a challenge. The third National Health and Nutrition Examination Survey (NHANES III)2 revealed that roughly 35% of people with type 2 diabetes remain undiagnosed. This suggests that universal screening might be prudent. Several studies, however, have shown that the yield from universal serum screening in specific populations is low.3-5 Because of this low yield and because the risk factors for type 2 diabetes are well known (Table 1), selective serum screening on the basis of risk factors is widely recommended.6-10
The goal of this paper was to determine the usefulness of the assessment of risk factors as a tool to decide who should undergo serum screening. Different investigators and organizations use slightly different serum tests and definitions to rule out or confirm the presence of type 2 diabetes. This paper does not address the performance of different blood tests (eg, fasting blood glucose, oral glucose tolerance test) or different definitions of type 2 diabetes.
Methods
The MEDLINE and EMBASE electronic databases were searched for the years 1966 to 1998, using the medical subject headings “diabetes mellitus” and “mass screening.” The search was then limited to English language papers dealing with human subjects. Resulting sets were combined. Titles of all papers in the combined set for each database were surveyed. Original studies, regardless of design, that deal specifically with the use of risk factors to screen and identify patients with type 2 diabetes were included. Reference lists of all publications including original studies, letters, commentaries, guidelines, and reviews were surveyed for relevant original studies. The Cochrane Database of Systematic Reviews was searched for relevant reviews.
Selected studies were assessed for validity using the “Users’ Guide to the Medical Literature.”11 The results of all studies were reviewed, regardless of shortcomings in validity (Table 2).
Results
A total of 346 citations were retrieved from MEDLINE by combining the sets for “diabetes mellitus” and “mass screening” and limiting the search to English language papers dealing with human subjects. Review of the titles revealed 23 papers that specifically addressed the strategy of serum screening on the basis of risk factors. All 23 were retrieved; 7 were original studies. Searches of the EMBASE database, The Cochrane Database of Systematic Reviews, and reference lists of all papers did not yield additional studies.
Three of the 7 studies were cross-sectional in design. Three other studies used data from health surveys to develop computerized models designed to predict the presence of diabetes on the basis of risk factors. The final study used a similar design, but the resulting model was then tested in a population different from that in which it was developed.
Four of the studies used a combination of risk factors and symptoms of diabetes (eg, polyuria). The use of symptoms in combination with risk factors does not precisely fit the question addressed by this paper. So few studies address the use of risk factors alone, however, that these studies were also included. Table 3 provides a summary of the results.
Cross-Sectional Studies
Duncan, Linville, and Clement12 measured the risk factors and blood glucose levels of 575 self-selected participants in a screening program to test the strategy recommended by the American Diabetes Association (ADA)6 of screening only those patients with 1 or more risk factors for diabetes. It is unknown whether blood glucose levels were measured by independent blind investigators without knowledge of each patient’s risk factors. The authors describe blood glucose test results only as “normal” or “abnormal.” It is not specified whether fasting or random blood glucose levels were used and what level was considered abnormal or diagnostic of diabetes. The demographic characteristics of the patients were not described. Blood glucose testing was done on all subjects, regardless of whether risk factors were present. Only patients with abnormal results were followed up to determine which of them were eventually diagnosed with type 2 diabetes. The validity of this study, therefore, is questionable, as the standards used for abnormal and normal blood glucose levels and the characteristics of the patients studied are unclear.
Among 575 screening participants, 383 had 1 or more risk factors. Fifty-one glucose measurements were considered abnormal: 16 in patients without risk factors and 35 in patients with 1 or more risk factors. The likelihood ratio (LR) for a positive or “at risk” questionnaire (LR+) was 1.05; for a negative questionnaire (LR-), it was 0.93. Follow-up of only those patients with abnormal blood glucose results over 1.8 years revealed that 21 (41%) had confirmed diabetes: 15 with one or more risk factors and 6 with none. Performing blood glucose testing in only patients with 1 or more risk factors would have missed at least 6 of 21 or 29% of all cases of confirmed diabetes. As diabetes was not ruled out or confirmed in the patients without abnormal blood glucose concentrations, it is unknown how many cases were missed in that population.
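As an illustration of how likelihood ratios of this kind can be derived, the following Python sketch rebuilds a 2 x 2 table from the counts given above. It assumes that an abnormal glucose measurement is treated as the reference standard, which the study report does not state explicitly; the function and variable names are illustrative only. Under that assumption the results come out close to, though not exactly at, the reported values, and the small gap may reflect rounding or a different reference definition in the original analysis.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) from the four cells of a 2 x 2 diagnostic table."""
    sensitivity = tp / (tp + fn)      # flagged among those with the condition
    specificity = tn / (tn + fp)      # not flagged among those without it
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Counts taken from the text (575 participants, 383 with 1 or more risk factors).
# Assumption: "abnormal glucose" serves as the reference standard.
total = 575
with_risk = 383
abnormal_with_risk = 35
abnormal_without_risk = 16

tp = abnormal_with_risk                                  # risk factor, abnormal glucose
fn = abnormal_without_risk                               # no risk factor, abnormal glucose
fp = with_risk - abnormal_with_risk                      # risk factor, normal glucose
tn = (total - with_risk) - abnormal_without_risk         # no risk factor, normal glucose

lr_pos, lr_neg = likelihood_ratios(tp, fp, fn, tn)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Both ratios sit very near 1, which is the numerical expression of the point made above: knowing whether a participant had a risk factor barely changes the likelihood of an abnormal glucose result.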
The ADA has developed questionnaires13 from which risk of diabetes is calculated as a composite score. McGregor and colleagues14 studied the performance of 1 of these questionnaires in a screening program. This questionnaire combines assessment of risk factors with questions about diabetic symptoms, such as fatigue and thirst, to generate a total score. Questionnaires were mailed to and completed by 349 individuals older than 60 years in Everett, Washington. Only those individuals identified as “high risk” by the questionnaire were offered follow-up fasting blood glucose testing. This study also falls short of the validity criteria of the “Users’ Guide to the Medical Literature.” An independent nonblinded comparison was made between at-risk questionnaires and the widely accepted diagnostic standard of fasting blood glucose. The patients were older community residents who would likely be candidates for diabetes screening. The risk assessment instrument is widely available and easy to use. Unfortunately, blood glucose testing was not performed on all subjects regardless of their assessed risk. This makes it impossible to assess the likelihood of diabetes among those patients with negative questionnaire results.
One hundred eighty-one of the 349 completed questionnaires indicated patients as high risk. One hundred ten of these patients underwent fasting plasma glucose (FPG) testing. Eleven (10%) had FPG values that exceeded 6.38 mM (114.8 mg/dL); 7 (6.3%) had FPG levels that exceeded the higher reference standard of 7.77 mM (140 mg/dL).
Burden and Burden15 evaluated the performance of the same ADA questionnaire among 383 self-selected participants at a health fair in England. It is unknown whether the comparison of questionnaires and blood glucose results was done independently or blindly. A random glucose value of greater than 6.5 mM (117 mg/dL) was considered abnormal. Random blood glucose can be used as a diagnostic standard for diabetes only in patients with typical symptoms, such as polyuria and polydipsia.6 The investigators do not specify how many screening participants had symptoms. Burden and Burden, therefore, used a questionable reference standard for comparison. All patients underwent subsequent random blood glucose testing. One hundred fifty-eight of 383 participants who completed questionnaires were identified as high risk. Fifty elevated random blood glucose concentrations were found. Among these patients, 23 were indicated as high risk by the questionnaires. The LR+ for this study was 1.15; LR-, 0.92.
Computerized Statistical Models
Three studies used statistical analysis to develop questionnaires to identify those subjects at high risk of diabetes. This method uses data from programs in which all participants undergo screening, and their diabetes status, risk factors, and other demographic variables are recorded. In NHANES III, for example, known history of diabetes and a large number of demographic variables were recorded for each patient. All participants underwent blood glucose testing, and the proportion of previously undiagnosed diabetes was determined. Using these data, the risk factors for diabetes could be determined, and their relative contribution to the likelihood of a diagnosis of diabetes could be calculated. The resulting risk-factor-based model was tested on the same data to determine how effectively it detected the previously undiagnosed cases of diabetes. The obvious difficulty with this method is that the performance of the risk-factor-based model was not field tested in a population separate from that in which it was developed. Rather than comparing a diagnostic test with an accepted standard, this technique involves developing a diagnostic test and “fitting” it to results already obtained by the application of an accepted standard test.
Herman and coworkers16 used this technique to develop a risk-factor-based questionnaire using data from NHANES II,17 in which 164 people with previously undiagnosed diabetes and 3220 with neither previously known nor newly diagnosed diabetes were identified. Their questionnaire used older age, excess body weight, lower level of physical activity, family history of diabetes, and history of delivery of a macrosomic infant as risks for type 2 diabetes. It was then tested on the same NHANES II data. The comparison with the reference standard was, therefore, neither independent nor blind. A broad spectrum of patients, typical of those who might undergo diagnostic testing for diabetes in clinical practice, was included in NHANES II. As the reference standard was obtained before completion of the questionnaire, the question of whether the results of the test being evaluated influenced the decision to perform the reference standard is not applicable. Administration of the questionnaire is easy and reproducible.
Herman and colleagues’ risk assessment instrument identified 1269 of 3384 patients in the sample as high risk. The questionnaire identified 129 of the 164 persons with diabetes (LR+ was 2.22; LR-, 0.32). In this model, 10% of those identified as high risk would actually have diabetes. Performing blood glucose testing on only those at high risk would miss 21% of the cases of diabetes.
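The 10% and 21% figures follow directly from the counts reported for this sample. A minimal sketch of the arithmetic in Python is given here; the function name and its parameters are illustrative and do not come from the original study.

```python
def screening_yield(flagged_total, cases_total, cases_flagged):
    """Return (positive predictive value, fraction of cases missed).

    flagged_total  -- number of people the questionnaire marks as high risk
    cases_total    -- number of people who truly have the condition
    cases_flagged  -- number of true cases among those marked high risk
    """
    ppv = cases_flagged / flagged_total
    missed = (cases_total - cases_flagged) / cases_total
    return ppv, missed


# Counts reported in the text for the NHANES II sample.
ppv, missed = screening_yield(flagged_total=1269, cases_total=164, cases_flagged=129)
print(f"High-risk patients who have diabetes: {ppv:.1%}")
print(f"Cases missed by testing only high-risk patients: {missed:.1%}")
```

Put another way, roughly 9 of every 10 blood tests ordered on the basis of this questionnaire would be negative, while about 1 case in 5 would go untested.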
Barriga and coworkers18 used the same technique as Herman and colleagues to develop several risk-factor-based models using data from community-based Hispanic and non-Hispanic white patients in California. Models were designed to help decide which patients should undergo confirmatory blood glucose testing through oral glucose tolerance testing (OGTT) to identify both type 2 diabetes and impaired glucose tolerance. All but 1 of their models used a serum test result as a risk factor, either fasting blood glucose or glycohemoglobin. Using a serum test result as a risk factor defeats the purpose of risk-factor-based screening, in which the goal is to minimize blood glucose tests. Sequential fasting glucose measurements can be used to confirm or rule out diabetes.6
The use of serum test results as a risk factor, the measurement of impaired glucose tolerance and type 2 diabetes as a combined outcome, and the methodologic shortcomings inherent to computerized statistical model design and testing make the validity of this study questionable and the results difficult to compare and interpret. One of the models described, the first step in a sequential assessment of risk factors, used only body mass index (>27.9) and age (>53.6 years) as risk factors. A high-risk patient had 1 or both of these risk factors. The LR+ for this risk factor model was 1.56; LR-, 0.20.
Azzopardi and coworkers19 developed computerized models on the basis of risk factors and symptoms of diabetes such as lethargy and thirst. Models were designed to collectively identify both type 1 and type 2 diabetes. Their study, therefore, does not precisely match the question addressed in this review. In developing their models, the authors studied patients in Malta: 128 newly diagnosed with diabetes and 320 without known diabetes. This second group was used as a control group. The prevalence of risk factors and symptoms was compared between the 2 groups to generate models to identify high-risk patients. The models were then tested on the same group of control patients and those with diabetes. OGTT was used as the reference standard for diagnosis of diabetes, and was performed only on the control subjects identified as high risk.
Not only does this study suffer from the shortcomings in validity inherent to this methodology, but the reference standard of OGTT was also not applied to all control patients. In a true control group, the absence of diabetes would be confirmed. How many cases of diabetes were missed in this group is unknown; therefore, likelihood ratios cannot be calculated. The best performing model identified 84% of the 128 patients with diabetes as high risk. Sixty-four (20%) of the control patients were identified as being at high risk. Diabetes was confirmed in 17. There were, therefore, 47 false positives.
Statistical Model with Prospective Validation
Ruige and colleagues20 developed a questionnaire to identify patients at risk for type 2 diabetes by studying both symptoms and risk factors among 2364 white patients in the Netherlands without known diabetes. OGTT was used to confirm or rule out disease in all patients. The final questionnaire included questions about thirst, shortness of breath, reluctance to use a bicycle, age, obesity, sex, family history of diabetes, and use of antihypertensive drugs as predictors of type 2 diabetes. Age, family history, and obesity were the most significant risks. This questionnaire was then prospectively evaluated in a completely separate but similar second population of 786 patients in whom diabetes was confirmed either through fasting glucose or OGTT. The questionnaire generates a composite score with a cutoff that can be varied.
Ruige and coworkers used the widely accepted tests of fasting glucose and OGTT as diagnostic reference standards. It is unclear whether the comparison of questionnaire results with blood glucose testing was blind or completely independent. The reference standard was applied regardless of the results on the questionnaire. The questionnaire is easy to use and reproduce.
The sample did not include nonwhite patients. Race is a significant risk factor for type 2 diabetes in the United States. Nonwhite patients may receive the greatest benefit from diabetes screening.
Using a cutoff score of 5 on the self-reporting questionnaire, the LR+ was 1.6; LR-, 0.50. In the second population, Ruige and colleagues also tested the questionnaire used by Herman and coworkers,16 which yielded an LR+ of 1.60 and an LR- of 0.51, and the ADA questionnaire,13 which yielded an LR+ of 1.37 and an LR- of 0.72.
Discussion
Several aspects of the diagnosis and care of patients with type 2 diabetes remain controversial and are the subjects of intensive research. It is unclear, for example, whether early identification through screening influences the course of diabetes and its complications. Evidence has only recently emerged that suggests that intensive treatment of type 2 diabetes prevents complications.21 These unresolved issues aside, the accurate identification of cases is a significant goal among clinicians.
Though the strategy of identifying and screening only high-risk individuals is widely recommended and practiced, there is little research to prove its effectiveness.
The results presented in the 7 papers do not make a convincing case for using the assessment of risk factors alone or in combination with typical symptoms as a screening tool. The LR+s of the studies fall between 1.05 and 2.22, meaning that screening only high-risk patients raises the post-test probability of diabetes modestly at best.22 Similarly, the LR-s of 0.93 to 0.20 indicate that assessing and screening high-risk patients is only slightly helpful in ruling out disease. It seems that screening on the basis of risk factors is not useful. Given also the widely held view that universal screening is inefficient, we are left without an answer to the question of who should undergo blood glucose testing.
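To make the effect of these likelihood ratios on post-test probability concrete, the following Python sketch applies the standard odds form of Bayes' theorem. The pretest probability is an assumption chosen for illustration: it is the prevalence of undiagnosed diabetes in the NHANES II sample described above (164 of 3384), applied here to likelihood ratios from other studies only to show the scale of the change. The function name is hypothetical.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumed pretest probability: sample prevalence in the NHANES II data (164/3384).
pretest = 164 / 3384

# Likelihood ratios quoted in the review: the extremes for LR+ and LR-.
for label, lr in [("LR+ 1.05", 1.05), ("LR+ 2.22", 2.22),
                  ("LR- 0.93", 0.93), ("LR- 0.20", 0.20)]:
    post = post_test_probability(pretest, lr)
    print(f"{label}: pretest {pretest:.1%} -> post-test {post:.1%}")
```

With a pretest probability near 5%, even the best LR+ in this review leaves the post-test probability at about 10%, which matches the proportion of high-risk patients with diabetes reported for that questionnaire.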
Implications for further research
It would be interesting to test Ruige and coworkers’ self-reporting questionnaire in a population that includes large numbers of minority groups at high risk. The diagnostic yield of a risk and symptom assessment instrument in such a population may be higher. Alternatively, a new risk and symptom assessment instrument could be developed and prospectively evaluated in the American population. Ideally, a study using such an instrument would include long-term follow-up of patients, not only to verify the presence or absence of diabetes, but also to see whether use of the instrument has any impact on morbidity and mortality resulting from this disease.
Assessing risk factors and performing blood glucose testing in only high-risk patients to identify type 2 diabetes is not recommended. The best way to identify new cases is unclear at this point. Blood glucose testing can be offered to patients who present with typical symptoms of diabetes.
1. American Diabetes Association. Direct and indirect costs of diabetes in the United States in 1993. Alexandria, Va: American Diabetes Association; 1993.
2. Harris MI, Flegal KM, Cowie CC, et al. Prevalence of diabetes, impaired fasting glucose, and impaired glucose tolerance in U.S. adults: the third national health and nutrition examination survey, 1988-1994. Diabetes Care 1998;21:518-24.
3. Worrall G. Screening healthy people for diabetes: is it worthwhile? J Fam Pract 1991;33:155-60.
4. Newman WP, Nelson R, Scheer K. Community screening for diabetes. Low detection rate in a low-risk population. Diabetes Care 1994;17:363-5.
5. Bourn D, Mann J. Screening for noninsulin dependent diabetes mellitus and impaired glucose tolerance in a Dunedin general practice: is it worth it? NZ Med J 1992;105:207-10.
6. American Diabetes Association. Screening for type 2 diabetes. Diabetes Care 1998;21(suppl 1):S20-2.
7. Paterson R. Population screening for diabetes mellitus. Diabet Med 1993;10:777-81.
8. World Health Organization Study Group on Prevention of Diabetes Mellitus. Prevention of diabetes mellitus. Geneva, Switzerland: World Health Organization; 1994 (Technical Report Series, no. 844).
9. US Preventive Services Task Force. Guide to clinical preventive services. 2nd ed. Baltimore, Md: Williams & Wilkins; 1996.
10. Canadian Task Force on the Periodic Health Examination. The periodic health examination. Can Med Assoc J 1979;121:1193-254.
11. Jaeschke R, Guyatt G, Sackett DL. Users’ guide to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA 1994;271:389-91.
12. Duncan WE, Linville N, Clement S. Assessing risk factors when screening for diabetes mellitus. Diabetes Care 1993;16:1403-4.
13. American Diabetes Association. Are you at risk for diabetes? Diabetes Forecast 1993;46:55.-
14. McGregor MS, Pinkham C, Ahroni JH, Herter CD, Doctor JD. The American Diabetes Association risk test for diabetes: is it a useful screening tool? Diabetes Care 1995;18:585-6.
15. Burden ML, Burden AC. The American Diabetes Association screening questionnaire for diabetes: is it worthwhile in the U.K.? Diabetes Care 1994;17:97-8.
16. Herman WH, Smith PJ, Thompson TJ, Engelgau MM, Aubert RE. A new and simple questionnaire to identify people at increased risk for undiagnosed diabetes. Diabetes Care 1995;18:382-7.
17. National Center for Health Statistics. Plan and operation of the second national health and nutrition examination survey. Vital Health Stat 1 Washington, DC: US Government Printing Office; 1981 (Department of Health and Human Services publication no. 81-1317).
18. Barriga KJ, Hamman RF, Hoag S, Marshall JA, Shetterly SM. Population screening for glucose intolerant subjects using decision tree analysis. Diabetes Res Clin Pract 1996;34(suppl):S17-29.
19. Azzopardi J, Fenech FF, Junoussov Z, Mazovetsky A, Olchanski V. A computerized health screening and follow-up system in diabetes mellitus. Diabet Med 1995;12:271-6.
20. Ruige JB, De Neeling JND, Kostense PJ, Bouter LM, Heine RJ. Performance of an NIDDM screening questionnaire based on symptoms and risk factors. Diabetes Care 1997;20:491-6.
21. UK Prospective Diabetes Study Group. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). Lancet 1998;352:837-53.
22. Jaeschke R, Guyatt G, Sackett DL. Users’ guide to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA 1994;271:703-7.
Statistical Model with Prospective Validation
Ruige and colleagues20 developed a questionnaire to identify patients at risk for type 2 diabetes by studying both symptoms and risk factors among 2364 white patients in the Netherlands without known diabetes. OGTT was used to confirm or rule out disease in all patients. The final questionnaire included questions about thirst, shortness of breath, reluctance to use a bicycle, age, obesity, sex, family history of diabetes, and use of antihypertensive drugs as predictors of type 2 diabetes. Age, family history, and obesity were the most significant risks. This questionnaire was then prospectively evaluated in a completely separate but similar second population of 786 patients in whom diabetes was confirmed either through fasting glucose or OGTT. The questionnaire generates a composite score with a cutoff that can be varied.
Ruige and coworkers used the widely accepted tests of fasting glucose and OGTT as diagnostic reference standards. It is unclear whether the comparison of questionnaire results with blood glucose testing was blind or completely independent. The reference standard was applied regardless of the results on the questionnaire. The questionnaire is easy to use and reproduce.
The sample did not include nonwhite patients. Race is a significant risk factor for type 2 diabetes in the United States. Nonwhite patients may receive the greatest benefit from diabetes screening.
Using a cutoff score of 5 on the self-reporting questionnaire, the LR+ was 1.6; LR-, 0.50. In the second population, Ruige and colleagues also tested the questionnaires used by Herman and coworkers16 which yielded an LR+ of 1.60 and an LR- of 0.51, and the ADA questionnaire,13 which yielded an LR+ of 1.37 and an LR- of 0.72.
Discussion
Several aspects of the diagnosis and care of patients with type 2 diabetes remain controversial and are the subjects of intensive research. It is unclear, for example, whether early identification through screening influences the course of diabetes and its complications. Evidence has only recently emerged that suggests that intensive treatment of type 2 diabetes prevents complications.21 These unresolved issues aside, the accurate identification of cases is a significant goal among clinicians.
Though the strategy of identifying and screening only high-risk individuals is widely recommended and practiced, there is little research to prove its effectiveness.
The results presented in the 7 papers do not make a convincing case for using the assessment of risk factors alone or in combination with typical symptoms as a screening tool. The LR+s of the studies fall between 1.05 and 2.22, meaning that screening only high-risk patients may modestly raise the post-test probability of diabetes.22 Similarly, the LR-s of 0.92 to 0.20 indicate that assessing and screening high-risk patients is only slightly helpful in ruling out disease. It seems that screening on the basis of risk factors is not useful. Together with the widely held view that universal screening is inefficient, we are left without an answer to the question of who should undergo blood glucose testing.
Implications for further research
It would be interesting to test Ruige and coworkers’ self-reporting questionnaire in a population that includes large numbers of minority groups at high risk. The diagnostic yield of a risk and symptom assessment instrument in such a population may be higher. Alternatively, a new risk and symptom assessment instrument could be developed and prospectively evaluated in the American population. Ideally, a study using such an instrument would include long-term follow-up of patients, not only to verify the presence or absence of diabetes, but also to see whether use of the instrument has any impact on morbidity and mortality resulting from this disease.
Assessing risk factors and performing blood glucose testing in only high-risk patients to identify type 2 diabetes is not recommended. The best way to identify new cases is unclear at this point. Blood glucose testing can be offered to patients who present with typical symptoms of diabetes.
SEARCH STRATEGIES: The MEDLINE and EMBASE electronic databases were searched for original studies of screening for type 2 diabetes on the basis of risk factors. The reference lists of all reviews, letters, editorials, consensus statements, and guidelines for diabetes screening were searched for additional studies. The Cochrane database was also searched for relevant reviews.
SELECTION CRITERIA: All original studies regarding selective serum screening for type 2 diabetes on the basis of risk factors were included.
MAIN RESULTS: Seven studies were selected for review. Three studies were cross-sectional in design; 3 employed survey data to develop computerized statistical models that used risk factors to identify cases of type 2 diabetes; and 1 used a similar method, but the resulting model was field tested in a separate population. No study describes a risk-factor-based method or instrument that helps substantially in the diagnosis of type 2 diabetes.
CONCLUSIONS: Selective screening for type 2 diabetes on the basis of risk factors cannot be recommended. Serum screening can be offered to patients who present with typical symptoms of diabetes.
Diabetes is responsible for half of all nontraumatic amputations, 15% of blindness, and more than a third of all end-stage renal disease. The costs attributed to this disease total more than $100 billion annually.1
Identifying new patients with diabetes has always been a challenge. The third National Health and Nutrition Examination Survey (NHANES III)2 revealed that roughly 35% of people with type 2 diabetes remain undiagnosed. This suggests that universal screening might be prudent. Several studies, however, have shown that the yield from universal serum screening in specific populations is low.3-5 Because of this low yield and because the risk factors for type 2 diabetes are well known (Table 1), selective serum screening on the basis of risk factors is widely recommended.6-10
The goal of this paper was to determine the usefulness of the assessment of risk factors as a tool to decide who should undergo serum screening. Different investigators and organizations use slightly different serum tests and definitions to rule out or confirm the presence of type 2 diabetes. This paper does not address the performance of different blood tests (eg, fasting blood glucose, oral glucose tolerance test) or different definitions of type 2 diabetes.
Methods
The MEDLINE and EMBASE electronic databases were searched for the years 1966 to 1998, using the medical subject headings “diabetes mellitus” and “mass screening.” The search was then limited to English language papers dealing with human subjects. Resulting sets were combined. Titles of all papers in the combined set for each database were surveyed. Original studies, regardless of design, that deal specifically with the use of risk factors to screen and identify patients with type 2 diabetes were included. Reference lists of all publications including original studies, letters, commentaries, guidelines, and reviews were surveyed for relevant original studies. The Cochrane Database of Systematic Reviews was searched for relevant reviews.
Selected studies were assessed for validity using the “Users’ Guide to the Medical Literature.”11 The results of all studies were reviewed, regardless of shortcomings in validity (Table 2).
Results
A total of 346 citations were retrieved from MEDLINE by combining the sets for “diabetes mellitus” and “mass screening” and limiting the search to English language papers dealing with human subjects. Review of the titles revealed 23 papers that specifically addressed the strategy of serum screening on the basis of risk factors. All 23 were retrieved; 7 were original studies. Searches of the EMBASE database, The Cochrane Database of Systematic Reviews, and reference lists of all papers did not yield additional studies.
Three of the 7 studies were cross-sectional in design. Three other studies used data from health surveys to develop computerized models designed to predict the presence of diabetes on the basis of risk factors. The final study used a similar design, but the resulting model was then tested in a population different from that in which it was developed.
Four of the studies used a combination of risk factors and symptoms of diabetes (eg, polyuria). The use of symptoms in combination with risk factors does not precisely fit the question addressed by this paper. So few studies address the use of risk factors alone, however, that these studies were also included. Table 3 provides a summary of the results.
Cross-Sectional Studies
Duncan, Linville, and Clement12 measured the risk factors and blood glucose levels of 575 self-selected participants in a screening program to test the strategy recommended by the American Diabetes Association (ADA)6 of screening only those patients with 1 or more risk factors for diabetes. It is unknown whether blood glucose levels were measured by independent blind investigators without knowledge of each patient’s risk factors. The authors describe blood glucose test results only as “normal” or “abnormal.” It is not specified whether fasting or random blood glucose levels were used and what level was considered abnormal or diagnostic of diabetes. The demographic characteristics of the patients were not described. Blood glucose testing was done on all subjects, regardless of whether risk factors were present. Only patients with abnormal results were followed up to determine which of them were eventually diagnosed with type 2 diabetes. The validity of this study, therefore, is questionable, as the standards used for abnormal and normal blood glucose levels and the characteristics of the patients studied are unclear.
Among 575 screening participants, 383 had 1 or more risk factors. Fifty-one glucose measurements were considered abnormal: 16 in patients without risk factors and 35 in patients with 1 or more risk factors. The likelihood ratio (LR) for a positive or “at risk” questionnaire (LR+) was 1.05; for a negative questionnaire (LR-), it was 0.93. Follow-up of only those patients with abnormal blood glucose results over 1.8 years revealed that 21 (41%) had confirmed diabetes: 15 with one or more risk factors and 6 with none. Performing blood glucose testing in only patients with 1 or more risk factors would have missed at least 6 of 21 or 29% of all cases of confirmed diabetes. As diabetes was not ruled out or confirmed in the patients without abnormal blood glucose concentrations, it is unknown how many cases were missed in that population.
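For reference, the likelihood ratios reported throughout this review follow the standard definitions in terms of sensitivity and specificity. The symbols below (TP, FP, FN, and TN for the cells of the 2×2 table of questionnaire result against reference standard) are notation introduced here for illustration and are not taken from the original studies.

```latex
\begin{align}
\text{sensitivity} &= \frac{TP}{TP + FN}, &
\text{specificity} &= \frac{TN}{TN + FP}, \\
LR^{+} &= \frac{\text{sensitivity}}{1 - \text{specificity}}, &
LR^{-} &= \frac{1 - \text{sensitivity}}{\text{specificity}}.
\end{align}
```

An LR+ close to 1 means that a positive questionnaire changes the probability of disease very little; likewise, an LR- close to 1 means that a negative questionnaire does little to rule disease out.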
The ADA has developed questionnaires13 from which risk of diabetes is calculated as a composite score. McGregor and colleagues14 studied the performance of 1 of these questionnaires in a screening program. This questionnaire combines assessment of risk factors with questions about diabetic symptoms, such as fatigue and thirst, to generate a total score. Questionnaires were mailed to and completed by 349 individuals older than 60 years in Everett, Washington. Only those individuals identified as “high risk” by the questionnaire were offered follow-up fasting blood glucose testing. This study also falls short of the validity criteria of the “Users’ Guide to the Medical Literature.” An independent nonblinded comparison was made between at-risk questionnaires and the widely accepted diagnostic standard of fasting blood glucose. The patients were older community residents who would likely be candidates for diabetes screening. The risk assessment instrument is widely available and easy to use. Unfortunately, blood glucose testing was not performed on all subjects regardless of their assessed risk. This makes it impossible to assess the likelihood of diabetes among those patients with negative questionnaire results.
One hundred eighty-one of the 349 completed questionnaires indicated patients as high risk. One hundred ten of these patients underwent fasting plasma glucose (FPG) testing. Eleven (10%) had FPG values that exceeded 6.38 mM (114.8 mg/dL); 7 (6.3%) had FPG levels that exceeded the higher reference standard of 7.77 mM (140 mg/dL).
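The paired units quoted above are related by a fixed factor. As a minimal sketch, assuming the common approximation of 18 mg/dL per mmol/L for glucose (molar mass of about 180 g/mol), the conversion can be checked as follows. The function names are illustrative and do not come from the study.

```python
# Approximate conversion factor for glucose: 1 mmol/L is about 18 mg/dL.
# Assumption: molar mass of glucose rounded to 180 g/mol.
MGDL_PER_MMOL = 18.0


def mmol_to_mgdl(mmol_per_l: float) -> float:
    """Convert a glucose concentration from mmol/L (mM) to mg/dL."""
    return mmol_per_l * MGDL_PER_MMOL


def mgdl_to_mmol(mg_per_dl: float) -> float:
    """Convert a glucose concentration from mg/dL to mmol/L (mM)."""
    return mg_per_dl / MGDL_PER_MMOL


if __name__ == "__main__":
    # The two FPG thresholds mentioned in the text.
    for threshold_mm in (6.38, 7.77):
        print(f"{threshold_mm} mM is about {mmol_to_mgdl(threshold_mm):.1f} mg/dL")
```

Under this approximation the two thresholds convert to roughly 115 and 140 mg/dL, in line with the values given in parentheses.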
Burden and Burden15 evaluated the performance of the same ADA questionnaire among 383 self-selected participants at a health fair in England. It is unknown whether the comparison of questionnaires and blood glucose results was done independently or blindly. A random glucose value of greater than 6.5 mM (117 mg/dL) was considered abnormal. Random blood glucose can be used as a diagnostic standard for diabetes only in patients with typical symptoms, such as polyuria and polydipsia.6 The investigators do not specify how many screening participants had symptoms. Burden and Burden, therefore, used a questionable reference standard for comparison. All patients underwent subsequent random blood glucose testing. One hundred fifty-eight of 383 participants who completed questionnaires were identified as high risk. Fifty elevated random blood glucose concentrations were found. Among these patients, 23 were indicated as high risk by the questionnaires. The LR+ for this study was 1.15; LR-, 0.92.
Computerized Statistical Models
Three studies used statistical analysis to develop questionnaires to identify those subjects at high risk of diabetes. This method uses data from programs in which all participants undergo screening, and their diabetes status, risk factors, and other demographic variables are recorded. In the NHANES III, for example, known history of diabetes and a large number of demographic variables were recorded for each patient. All participants underwent blood glucose testing, and the proportion of previously undiagnosed diabetes was determined. Using these data, the risk factors for diabetes could be determined, and their relative contribution to the likelihood of a diagnosis of diabetes could be calculated. The resulting risk-factor-based model was tested on the same data to determine how effectively it detected the previously undiagnosed cases of diabetes. The obvious difficulty with this method is that the performance of the risk-factor-based model was not field tested in a population separate from that in which it was developed. Rather than comparing a diagnostic test with an accepted standard, this technique involves developing a diagnostic test and “fitting” it to results already obtained by the application of an accepted standard test.
Herman and coworkers16 used this technique to develop a risk-factor-based questionnaire using data from NHANES II,17 in which 164 people with previously undiagnosed diabetes and 3220 with neither previously known nor newly diagnosed diabetes were identified. Their questionnaire used older age, excess body weight, lower level of physical activity, family history of diabetes, and history of delivery of a macrosomic infant as risks for type 2 diabetes. It was then tested on the same NHANES II data. The comparison with the reference standard was, therefore, neither independent nor blind. A broad spectrum of patients, typical of those who might undergo diagnostic testing for diabetes in clinical practice, was included in NHANES II. As the reference standard was obtained before completion of the questionnaire, the question of whether the results of the test being evaluated influenced the decision to perform the reference standard is not applicable. Administration of the questionnaire is easy and reproducible.
Herman and colleagues’ risk assessment instrument identified 1269 of 3384 patients in the sample as high risk. The questionnaire identified 129 of the 164 persons with diabetes (LR+ was 2.22; LR-, 0.32). In this model, 10% of those identified as high risk would actually have diabetes. Performing blood glucose testing on only those at high risk would miss 21% of the cases of diabetes.
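The figures in this paragraph can be reproduced from the counts given in the text. The following is a minimal sketch, not the authors' own calculation: the class and attribute names are hypothetical, and the 2×2 cells are derived here by subtraction from the reported totals (164 with diabetes, 3220 without, 1269 flagged as high risk, 129 of the cases flagged).

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for a screening test compared with a reference standard."""

    true_pos: int   # high risk, diabetes present
    false_pos: int  # high risk, diabetes absent
    false_neg: int  # low risk, diabetes present
    true_neg: int   # low risk, diabetes absent

    @property
    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def lr_positive(self) -> float:
        # LR+ = sensitivity / (1 - specificity)
        return self.sensitivity / (1 - self.specificity)

    @property
    def lr_negative(self) -> float:
        # LR- = (1 - sensitivity) / specificity
        return (1 - self.sensitivity) / self.specificity

    @property
    def positive_predictive_value(self) -> float:
        return self.true_pos / (self.true_pos + self.false_pos)


# Cells derived from the totals quoted in the text (an assumption of this sketch).
herman = TwoByTwo(
    true_pos=129,
    false_pos=1269 - 129,
    false_neg=164 - 129,
    true_neg=3220 - (1269 - 129),
)

if __name__ == "__main__":
    print(f"LR+ = {herman.lr_positive:.2f}")
    print(f"LR- = {herman.lr_negative:.2f}")
    print(f"High-risk patients with diabetes = {herman.positive_predictive_value:.0%}")
    print(f"Cases missed = {1 - herman.sensitivity:.0%}")
```

With these counts the LR+ rounds to 2.22 as reported, about 10% of high-risk patients have diabetes, and about 21% of cases are missed. The LR- comes out near 0.33, slightly above the reported 0.32, presumably because of rounding.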
Barriga and coworkers18 used the same technique as Herman and colleagues to develop several risk-factor-based models using data from community-based Hispanic and non-Hispanic white patients in California. Models were designed to help decide which patients should undergo confirmatory blood glucose testing through oral glucose tolerance testing (OGTT) to identify both type 2 diabetes and impaired glucose tolerance. All but 1 of their models used a serum test result as a risk factor, either fasting blood glucose or glycohemoglobin. Using a serum test result as a risk factor defeats the purpose of risk-factor-based screening, in which the goal is to minimize blood glucose tests. Sequential fasting glucose measurements can be used to confirm or rule out diabetes.6
The use of serum test results as a risk factor, measuring impaired glucose tolerance and type 2 diabetes as a combined outcome, and the methodologic shortcomings inherent to computerized statistical model design and testing make the validity of this study questionable and the results difficult to compare and interpret. One of the models described, the first step in a sequential assessment of risk factors, used only body mass index (>27.9) and age (>53.6 years) as risk factors. A high-risk patient had 1 or both of these risk factors. The LR+ for this risk factor model was 1.56; LR-, 0.2.
Azzopardi and coworkers19 developed computerized models on the basis of risk factors and symptoms of diabetes such as lethargy and thirst. Models were designed to collectively identify both type 1 and type 2 diabetes. Their study, therefore, does not precisely match the question addressed in this review. In developing their models, the authors studied patients in Malta: 128 newly diagnosed with diabetes and 320 without known diabetes. This second group was used as a control group. The prevalence of risk factors and symptoms was compared between the 2 groups to generate models to identify high-risk patients. The models were then tested on the same group of control patients and those with diabetes. OGTT was used as the reference standard for diagnosis of diabetes, and was performed only on the control subjects identified as high risk.
This study not only suffers from the shortcomings in validity inherent to this methodology, but the reference standard of OGTT was not applied to all control patients. In a true control group, the absence of diabetes would be confirmed. How many cases of diabetes were missed in this group is unknown; therefore, likelihood ratios cannot be calculated. The best performing model identified 84% of the 128 patients with diabetes as high risk. Sixty-four (20%) of control patients were identified as being at high risk. Diabetes was confirmed in 17. There were, therefore, 47 false positives.
Statistical Model with Prospective Validation
Ruige and colleagues20 developed a questionnaire to identify patients at risk for type 2 diabetes by studying both symptoms and risk factors among 2364 white patients in the Netherlands without known diabetes. OGTT was used to confirm or rule out disease in all patients. The final questionnaire included questions about thirst, shortness of breath, reluctance to use a bicycle, age, obesity, sex, family history of diabetes, and use of antihypertensive drugs as predictors of type 2 diabetes. Age, family history, and obesity were the most significant risks. This questionnaire was then prospectively evaluated in a completely separate but similar second population of 786 patients in whom diabetes was confirmed either through fasting glucose or OGTT. The questionnaire generates a composite score with a cutoff that can be varied.
Ruige and coworkers used the widely accepted tests of fasting glucose and OGTT as diagnostic reference standards. It is unclear whether the comparison of questionnaire results with blood glucose testing was blind or completely independent. The reference standard was applied regardless of the results on the questionnaire. The questionnaire is easy to use and reproduce.
The sample did not include nonwhite patients. Race is a significant risk factor for type 2 diabetes in the United States. Nonwhite patients may receive the greatest benefit from diabetes screening.
Using a cutoff score of 5 on the self-reporting questionnaire, the LR+ was 1.6; LR-, 0.50. In the second population, Ruige and colleagues also tested the questionnaire used by Herman and coworkers,16 which yielded an LR+ of 1.60 and an LR- of 0.51, and the ADA questionnaire,13 which yielded an LR+ of 1.37 and an LR- of 0.72.
Discussion
Several aspects of the diagnosis and care of patients with type 2 diabetes remain controversial and are the subjects of intensive research. It is unclear, for example, whether early identification through screening influences the course of diabetes and its complications. Evidence has only recently emerged that suggests that intensive treatment of type 2 diabetes prevents complications.21 These unresolved issues aside, the accurate identification of cases is a significant goal among clinicians.
Though the strategy of identifying and screening only high-risk individuals is widely recommended and practiced, there is little research to prove its effectiveness.
The results presented in the 7 papers do not make a convincing case for using the assessment of risk factors alone or in combination with typical symptoms as a screening tool. The LR+s of the studies fall between 1.05 and 2.22, meaning that a high-risk result raises the post-test probability of diabetes only modestly.22 Similarly, the LR-s of 0.93 to 0.20 indicate that a low-risk result is only slightly helpful in ruling out disease. It seems that screening on the basis of risk factors is not useful. Given also the widely held view that universal screening is inefficient, we are left without an answer to the question of who should undergo blood glucose testing.
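To make the effect of these likelihood ratios concrete, the sketch below converts a pretest probability to a post-test probability through odds, which is the standard way of applying a likelihood ratio. The function name is hypothetical, and the pretest probability is an assumption chosen for illustration: the proportion of undiagnosed diabetes in the NHANES II sample described earlier (164 of 3384).

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pretest probability by way of odds."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest_probability must be strictly between 0 and 1")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


if __name__ == "__main__":
    # Illustrative pretest probability (assumption): 164 of 3384, about 4.8%.
    pretest = 164 / 3384
    # The extremes of the LR+ and LR- ranges discussed in the text.
    for label, lr in (("LR+ 1.05", 1.05), ("LR+ 2.22", 2.22),
                      ("LR- 0.93", 0.93), ("LR- 0.20", 0.20)):
        post = post_test_probability(pretest, lr)
        print(f"{label}: {pretest:.1%} -> {post:.1%}")
```

At this pretest probability, even the largest LR+ in the review moves the probability of diabetes only to about 10%, and a likelihood ratio near 1 leaves it essentially unchanged.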
Implications for further research
It would be interesting to test Ruige and coworkers’ self-reporting questionnaire in a population that includes large numbers of minority groups at high risk. The diagnostic yield of a risk and symptom assessment instrument in such a population may be higher. Alternatively, a new risk and symptom assessment instrument could be developed and prospectively evaluated in the American population. Ideally, a study using such an instrument would include long-term follow-up of patients, not only to verify the presence or absence of diabetes, but also to see whether use of the instrument has any impact on morbidity and mortality resulting from this disease.
Assessing risk factors and performing blood glucose testing in only high-risk patients to identify type 2 diabetes is not recommended. The best way to identify new cases is unclear at this point. Blood glucose testing can be offered to patients who present with typical symptoms of diabetes.
1. American Diabetes Association. Direct and indirect costs of diabetes in the United States in 1993. Alexandria, Va: American Diabetes Association; 1993.
2. Harris MI, Flegal KM, Cowie CC, et al. Prevalence of diabetes, impaired fasting glucose, and impaired glucose tolerance in U.S. adults: the third national health and nutrition examination survey, 1988-1994. Diabetes Care 1998;21:518-24.
3. Worrall G. Screening healthy people for diabetes: is it worthwhile? J Fam Pract 1991;33:155-60.
4. Newman WP, Nelson R, Scheer K. Community screening for diabetes. Low detection rate in a low-risk population. Diabetes Care 1994;17:363-5.
5. Bourn D, Mann J. Screening for noninsulin dependent diabetes mellitus and impaired glucose tolerance in a Dunedin general practice: is it worth it? NZ Med J 1992;105:207-10.
6. American Diabetes Association. Screening for type 2 diabetes. Diabetes Care 1998;21(suppl 1):S20-2.
7. Paterson R. Population screening for diabetes mellitus. Diabet Med 1993;10:777-81.
8. World Health Organization Study Group on Prevention of Diabetes Mellitus. Prevention of diabetes mellitus. Geneva, Switzerland: World Health Organization; 1994 (Technical Report Series, no. 844).
9. US Preventive Services Task Force. Guide to clinical preventive services. 2nd ed. Baltimore, Md: Williams & Wilkins; 1996.
10. Canadian Task Force on the Periodic Health Examination. The periodic health examination. Can Med Assoc J 1979;121:1193-254.
11. Jaeschke R, Guyatt G, Sackett DL. Users’ guide to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA 1994;271:389-91.
12. Duncan WE, Linville N, Clement S. Assessing risk factors when screening for diabetes mellitus. Diabetes Care 1993;16:1403-4.
13. American Diabetes Association. Are you at risk for diabetes? Diabetes Forecast 1993;46:55.
14. McGregor MS, Pinkham C, Ahroni JH, Herter CD, Doctor JD. The American Diabetes Association risk test for diabetes: is it a useful screening tool? Diabetes Care 1995;18:585-6.
15. Burden ML, Burden AC. The American Diabetes Association screening questionnaire for diabetes: is it worthwhile in the U.K.? Diabetes Care 1994;17:97-8.
16. Herman WH, Smith PJ, Thompson TJ, Engelgau MM, Aubert RE. A new and simple questionnaire to identify people at increased risk for undiagnosed diabetes. Diabetes Care 1995;18:382-7.
17. National Center for Health Statistics. Plan and operation of the second national health and nutrition examination survey. Vital Health Stat 1. Washington, DC: US Government Printing Office; 1981 (Department of Health and Human Services publication no. 81-1317).
18. Barriga KJ, Hamman RF, Hoag S, Marshall JA, Shetterly SM. Population screening for glucose intolerant subjects using decision tree analysis. Diabetes Res Clin Pract 1996;34(suppl):S17-29.
19. Azzopardi J, Fenech FF, Junoussov Z, Mazovetsky A, Olchanski V. A computerized health screening and follow-up system in diabetes mellitus. Diabet Med 1995;12:271-6.
20. Ruige JB, De Neeling JND, Kostense PJ, Bouter LM, Heine RJ. Performance of an NIDDM screening questionnaire based on symptoms and risk factors. Diabetes Care 1997;20:491-6.
21. UK Prospective Diabetes Study Group. Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). Lancet 1998;352:837-53.
22. Jaeschke R, Guyatt G, Sackett DL. Users’ guide to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA 1994;271:703-7.