Main

All providers treat patients with genetic conditions. In a recent survey, genetic medicine consumers rated the genetic knowledge of their nongenetics providers as “poor” 31% of the time.1 They perceived problems in coordination of care and in awareness of the medical and psychosocial impacts of genetic disorders. Despite this, a majority identified a nongeneticist as the most important provider in overall management of the patient's condition.

Expectations to integrate genetics into daily patient care are growing, and often exceed providers' training, knowledge, and confidence.2–7 Primary care providers need to be able to recognize when a genetic issue or diagnosis may exist, answer some initial questions for the patient, and make appropriate specialty referral(s). Frequently, an existing genetic condition must be considered when making recommendations about other medical problems. To manage these responsibilities, clinicians need information that is accurate, accessible, and applicable to a specific clinical problem.7–12

Time is scarce in healthcare.12,13 Primary care physicians spend less than 2 minutes searching for the answer to a medical question.14,15 To be useful in the busy clinical setting, a point-of-care resource must answer questions quickly.

In a 1999 study of primary care physicians, 2% sought answers to clinical questions in electronic resources.14 By 2005, this proportion had grown to 16%.12

Web resources with genetics content include open access and subscription databases. The open access databases are written by and for geneticists; nongeneticists, who may have limited genetics knowledge, can find them difficult to understand and use. The subscription databases are more general and are intended for point-of-care use by a broad range of clinicians.

We examined the accuracy and completeness of selected World Wide Web resources in answering 20 clinical questions about genetic conditions. We also evaluated the time required per answer found.

MATERIALS AND METHODS

For five common genetic conditions, we wrote one question each about signs and symptoms, diagnosis, management, and inheritance. The intent was to highlight issues that should be considered or may become relevant in a nongenetics clinical setting. These questions were pilot-tested for clarity in five online databases. After the pilot, we refined the questions and the scoring categories to their final form.

We searched the primary literature to determine the correct answers to all 20 questions. All information found in at least three independent references (with no contradictory references) was deemed required for a “complete” answer. Information found in fewer than three references, or where references disagreed, was deemed acceptable, meaning it was neither required for a complete answer nor considered “wrong.” Conditions, questions, required answers, and acceptable answers are shown in Table 1.
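
This evidence rule can be sketched as a small classifier. The following is an illustrative reconstruction, not code used in the study, and the function and parameter names are hypothetical:

```python
def classify_fact(n_supporting_refs: int, n_contradicting_refs: int) -> str:
    """Classify one piece of answer content by its literature support.

    Hypothetical sketch of the rule above: content confirmed by at
    least three independent references, with none contradicting, is
    "required" for a complete answer; anything else found in the
    literature is merely "acceptable" (neither required nor wrong).
    """
    if n_supporting_refs >= 3 and n_contradicting_refs == 0:
        return "required"
    return "acceptable"
```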

Table 1 Conditions, questions, and answers for the study

We developed a scoring system to evaluate each resource's answer to each question. To be complete, an answer had to contain all of the required information. There was no penalty for containing or lacking any of the information deemed acceptable. We assigned a score of “partial” when some of the required information was missing, and “not found” when none of the required information was present. Answers that required the user to have any basic genetics knowledge were scored as “vague,” even if they were otherwise complete. Other criteria for “vague,” and the criteria for scores of “inconsistent” and “wrong,” are described in Table 2.
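
A rough sketch of these criteria, again hypothetical and omitting the content-level judgments behind “inconsistent” and “wrong” (Table 2):

```python
from enum import Enum

class Score(Enum):
    COMPLETE = "complete"
    PARTIAL = "partial"
    NOT_FOUND = "not found"
    VAGUE = "vague"
    INCONSISTENT = "inconsistent"
    WRONG = "wrong"

def score_answer(required_found: set, required_all: set,
                 presumes_genetics_knowledge: bool) -> Score:
    """Score one database answer against the required content.

    required_found is the subset of required content present in the
    answer; acceptable content is ignored entirely (no penalty for
    containing or lacking it).
    """
    if presumes_genetics_knowledge:
        return Score.VAGUE  # applies even to otherwise complete answers
    if required_found >= required_all:
        return Score.COMPLETE
    if required_found:
        return Score.PARTIAL
    return Score.NOT_FOUND
```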

Table 2 Criteria for scoring databases

Two genetics databases and seven general databases were selected for review (Table 3). Five-Minute Clinical Consult is available within InfoRetriever, but was excluded from the specific analysis of InfoRetriever. During the last 2 weeks of January 2007, each author independently reviewed six of the nine databases. Two authors examined each database; each pair of authors shared three databases.

Table 3 Databases in this study

We compared the two assigned scores for each question. If both reviewers scored the answer as complete or not found, that was the final score. If matching scores of partial were assigned, we merged the detailed notes taken by each investigator; if the combination included all required content, a final score of complete was assigned, and if not, the final score remained partial. The third investigator reviewed all scores of vague, inconsistent, or wrong, and all disagreements between the two primary reviewers. After examining the primary reviewers' detailed notes and the database itself, the third investigator assigned the most appropriate final score.
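
The reconciliation procedure amounts to a short decision rule. The following illustrative sketch (names hypothetical) defers every unresolved case to the third investigator:

```python
def reconcile(score_a: str, score_b: str,
              notes_a: set, notes_b: set, required: set) -> str:
    """Combine two reviewers' scores for one question.

    Matching complete or not-found scores stand; matching partial
    scores are upgraded to complete if the merged notes cover all
    required content. Everything else (vague, inconsistent, wrong,
    or any disagreement) goes to the third investigator.
    """
    if score_a == score_b == "complete":
        return "complete"
    if score_a == score_b == "not found":
        return "not found"
    if score_a == score_b == "partial":
        return "complete" if (notes_a | notes_b) >= required else "partial"
    return "arbitrate"  # third investigator assigns the final score
```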

Each reviewer recorded total time to answer all 20 questions for each database. We calculated the efficiency of each resource in two different ways. First, the time required by the fastest searcher was divided by the total number of answers found, regardless of accuracy or completeness of the answers. We then performed a more rigorous calculation of efficiency for finding correct answers. This was determined as the time required by the fastest searcher divided by the total number of complete answers for each database.
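
Both efficiency measures reduce to the same ratio; a minimal sketch with made-up numbers (not figures from the study):

```python
def minutes_per_answer(fastest_searcher_minutes: float, n_answers: int) -> float:
    """Total search time of the fastest searcher divided by answers found."""
    return fastest_searcher_minutes / n_answers if n_answers else float("inf")

# Illustrative only: a 60-minute session yielding 20 answers of any
# quality, 12 of which were scored complete.
print(minutes_per_answer(60.0, 20))  # 3.0 min per any answer found
print(minutes_per_answer(60.0, 12))  # 5.0 min per complete answer
```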

Statistical analysis was conducted using Stata 9.2 (StataCorp) and StatXact 7.0.0 (Cytel Software Corporation) software. Inter-reviewer agreement was measured using the kappa statistic (κ). McNemar's test was used to test scores of complete versus all other scores in comparisons between two databases, question types, or disorders. Cochran's Q test was used to test scores of complete versus all other results when comparing more than two databases, question types, or disorders.
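
The study used Stata and StatXact; for readers wanting to reproduce the same tests, a rough Python analogue (run here on synthetic stand-in data, not the study's scores) could use scikit-learn and statsmodels:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.contingency_tables import mcnemar, cochrans_q

rng = np.random.default_rng(0)

# Inter-reviewer agreement (kappa) on 20 dichotomized scores
# (complete = 1 versus all other scores = 0).
reviewer_a = rng.integers(0, 2, size=20)
reviewer_b = rng.integers(0, 2, size=20)
print("kappa:", cohen_kappa_score(reviewer_a, reviewer_b))

# McNemar's test comparing two databases on the same 20 questions:
# build the paired 2x2 table of complete/not-complete outcomes.
db1 = rng.integers(0, 2, size=20)
db2 = rng.integers(0, 2, size=20)
table = np.array([[np.sum((db1 == 1) & (db2 == 1)),
                   np.sum((db1 == 1) & (db2 == 0))],
                  [np.sum((db1 == 0) & (db2 == 1)),
                   np.sum((db1 == 0) & (db2 == 0))]])
print("McNemar p:", mcnemar(table, exact=True).pvalue)

# Cochran's Q across all nine databases: rows are questions,
# columns are databases, entries are complete (1) or not (0).
scores = rng.integers(0, 2, size=(20, 9))
print("Cochran's Q p:", cochrans_q(scores).pvalue)
```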

RESULTS

Internal consistency

For two databases (DynaMed and Online Mendelian Inheritance in Man [OMIM]), there was poor agreement between the initial reviewers (45% and 50%, respectively). The other seven databases had 65–80% agreement between the initial scores. Overall agreement between reviewers was good (κ = 0.57). The most common disagreements were complete versus vague (26% of disagreements) and vague versus not found (19% of disagreements). In the 28 disagreements over whether or not the information found was complete, 18 (64%) were assigned a final score of complete. Overall, the arbitrator chose the higher score 23 times and the lower score 34 times.

Accuracy of information

Of the 180 answers (20 questions in each of nine databases), 60 (33.3%) were considered complete and 61 (33.9%) were not found. Partial and vague final scores were common (22 and 27 times, respectively). Final scores of inconsistent were assigned twice and wrong eight times (Fig. 1).

Fig. 1 Final categorization of database answers.

GeneReviews was the most accurate resource, containing 70% complete answers and 90% either complete or partial. It had significantly more complete answers than the four worst-performing databases (Table 4). There was no significant difference compared with the two next-best performers. Fifty-five percent of the content in UpToDate was scored as complete, but no other resource contained more than 50% complete responses. Importantly, GeneReviews, OMIM, UpToDate, and Physician's Information and Education Resource (PIER) contained no wrong answers (Fig. 2).

Table 4 Comparison of GeneReviews to other databases for number of complete answers
Fig. 2 Answers by database.

There were no significant differences in the number of complete answers among the five conditions (Cochran's Q test; P = 0.62). Across the databases, cystic fibrosis had the most complete answers (15 of 36; 41.7%) (Fig. 3). Hemochromatosis, neurofibromatosis 1, and fragile X syndrome were similar, with 30.6–36.1% of answers scored as complete. Hereditary breast and ovarian cancer questions had the fewest complete answers (25.0%). Of the eight wrong answers found, half were for questions about hereditary breast and ovarian cancer. Only fragile X syndrome had no wrong answers. The differences in the number of wrong answers among the five conditions were not statistically significant (Cochran's Q test; P = 0.23).

Fig. 3 Answers by condition.

Finally, we examined the data by question type (signs and symptoms, diagnosis, management, inheritance) (Fig. 4). Significant differences were seen in the number of complete answers (Cochran's Q test; P = 0.003) and wrong answers (Cochran's Q test; P = 0.05). Fifty percent of the wrong answers were in the diagnosis category and 25% each were in the management and inheritance categories. There were no wrong answers in the signs and symptoms category. The only significant pairwise difference for complete answers was between signs and symptoms (17 of 45; 37.8%) and management (11 of 45; 24.4%) (McNemar's test; P = 0.03). No pairwise comparisons for wrong answers were significant at the 0.05 level.

Fig. 4 Answers by question type.

Efficiency

When we looked at time per any answer found (regardless of the accuracy of the information), GeneReviews and UpToDate were the most efficient at <3 minutes per answer (Table 5).

Table 5 Database efficiency

Using the more rigorous standard of “time per complete answer found,” GeneReviews was again the most efficient, at 3.2 minutes. The next most efficient resource was UpToDate at 4.5 minutes. No other database was below 7 minutes per complete answer found. All of the databases exceeded the critical 2-minute-per-question standard by at least 50%.

DISCUSSION

In an ideal world, the information needed to care for a patient would be accessible, clinically relevant, accurate, and found quickly. When new or complex issues arise, clinicians need an information resource to assist with appropriate evaluation, management, and referral. They often look for point-of-care answers in secondary sources that have demonstrated authority.11,12 It has been proposed that they are using the authority of the source as a proxy for accuracy, trusting that the author or editor has validated the information.11

This study examined nine online resources in January 2007. Overall, there was a lack of genetic information across the databases. Answers were scored not found as frequently as they were scored complete. No database was efficient enough to meet the time demands of the typical busy primary care clinician.

The best resource was GeneReviews. It provided 14 complete answers and four partial answers to the 20 questions, with no wrong answers. As an open access database, it is freely available to anyone. GeneReviews was also the most efficient database judged by either the “any answer” or the “complete answer” criterion. However, it still took 3.2 minutes per complete answer, which is too long for a primary care setting.14,15

OMIM did especially poorly on questions about management and inheritance (data not shown). However, this database describes itself as a catalog of human genes and genetic disorders. It is not intended as a clinical management resource. In addition, because OMIM is written for use primarily by genetics professionals, it assumes that the user has some basic genetics knowledge. This caused OMIM to be rated as vague for three of the four inheritance questions.

Each of the nongenetics databases describes itself as intended to assist with clinical decision support at the point of care. The best of them, UpToDate and First Consult, each answered about half of our questions completely. However, they were less accurate and less efficient than GeneReviews (Fig. 2 and Table 5). All of the nongenetics resources are accessible only by subscription or membership.

Limitations

The primary goal of this study was to evaluate point-of-care databases for completeness. We continued searching even after finding a partial answer, seeking the most complete answer possible. As a result, the efficiency of each database is probably underestimated. Balancing that, however, using the fastest individual searcher's time biases the results toward greater apparent efficiency. The primary reviewers routinely use all of the databases except InfoRetriever and DynaMed, minimizing any bias toward faster searches in individual databases.

We have not formally established the relevance of our 20 questions. However, they were written by active clinicians with experience and training in primary care and genetics. We believe they represent realistic issues that may arise in a clinical encounter with a nongeneticist.

Our scoring criteria assumed that a user has no more than minimal genetic knowledge. Less rigorous criteria probably would have resulted in fewer partial or vague answers and more complete answers. This is especially true for the two genetics resources, GeneReviews and OMIM.

We studied only 20 questions per database. This may have limited our ability to detect differences between resources.

These findings are applicable only for the selected databases as they existed on the World Wide Web in January 2007. These resources have undoubtedly changed since this evaluation.

Comparison with other studies

Ely et al.12 directly observed primary care physicians and recorded their attempts to answer clinical questions. Answers were found easily 41% of the time, with difficulty 31% of the time, and not at all 28% of the time. In 2001, Alper et al.16 concluded that “point-of-care searching is not yet fast enough to address most clinical questions identified in routine practice.” They studied 14 databases and found that 20–70% of 20 general primary care questions were answered adequately, requiring 2.4–6.5 minutes per answer. They defined “adequate” as good enough to guide clinical decision-making, but the accuracy of the answers was not examined. In 1999, Graber et al.17 searched 16 Web resources for predefined accurate answers to 10 common questions posed by primary care clinicians. The success rate was 0–60%. Our results are similar, finding any answer (without regard to accuracy) 15–95% of the time and complete answers 0–70% of the time.

Implications

It is worrisome that more than half of the databases in this study provided wrong answers. One might assume that a subscription database has greater authority and accuracy than a free resource. However, all of the wrong answers in this study were found in sources that require a paid subscription for access.

The most frequently asked questions in primary care are about diagnosis and management.14,15 Our study found that diagnosis questions had the most wrong answers and management questions had the fewest complete answers. This significantly decreases the utility of the databases for nongenetics providers caring for patients with genetic conditions.

Many World Wide Web databases do not answer clinical questions about genetic conditions accurately. None of the resources we tested are efficient enough for point-of-care use. As genetics becomes more prominent in daily patient care, providers will need an efficient, accurate, and accessible source of information.