Increasing the uptake of research findings by health librarians depends on two core skills.1 At the production end of the research process, researchers need to synthesize the results of well-conducted research into integrative, systematic reviews. At the consumer end of that same process, individual librarians need to appraise the methods and results of individual articles critically, assessing their value in terms of their reliability, their validity and, most importantly, their applicability to practice. Preliminary work by the LINC Health Panel Research Working Party indicates that the potential of systematic reviews in health librarianship is constrained by a number of factors. These include different methods used to address similar questions (technically known as ‘heterogeneity’), the poor quality of research designs, and deficiencies in indexing and abstracting that make identification and retrieval of candidate studies problematic.2 A number of initiatives and methodological approaches offer a way forward for systematic reviews in our discipline (and these will be summarized in a future research column). However, the Research Working Party is currently concentrating on a more immediate goal: the transfer and adaptation of established methods of critical appraisal to the health information literature. CRItical Skills Training in Appraisal for Librarians (CRISTAL) is an unfunded pilot project being undertaken by Anne Brice (Health Care Libraries Unit, Oxford) and Andrew Booth (School of Health and Related Research, Sheffield) to develop and trial tools for critical appraisal appropriate for research studies appearing in our professional literature.2 It will be helpful to chart a brief history of critical appraisal of the medical literature to place the specific requirements of health information literature into context.
Starting in 1993 and running through to 2000, JAMA has periodically published Users’ Guides to the Medical Literature to assist clinicians in interpreting the findings from the research literature.3 Typically these have either addressed a specific type of question (for example therapy, diagnosis or causation) or a specific study design or publication type (for example a guideline, a systematic review or a clinical decision analysis).4 The Users’ Guides do, however, have two major limitations. First, their medical pedigree makes them less accessible to other disciplines that are increasingly required to practise critical appraisal. Second, the technical language that they use may act as a barrier to their ready adoption by those whose prior familiarity with research design or statistics is limited. The Critical Appraisal Skills Programme (CASP) in Oxford5 addresses both these limitations while remaining true to the scientific basis of the Users’ Guides. It uses multidisciplinary scenarios to help disciplines other than medicine to relate to the appraisal process. It also strips the criteria used by the Users’ Guides to the bare essentials, focusing on key issues and supporting them with explanatory notes. So, for example, for therapy scenarios it has produced ‘11 questions to help make sense of a trial’. CASP has also pioneered efforts to reach people without a clinical background, such as consumer groups and members of representative bodies such as Community Health Councils and Maternity Services Liaison Committees. In addition, it broadens its definition of evidence to include qualitative research, devising a checklist specifically for such studies.
Checklist development is an ongoing process for the CASP team: in August 2000 they announced the availability of a revised version of their systematic reviews checklist.6 Initiatives such as CASP and its sister programme, the South Thames Research Application Programme (STRAP), have involved librarians in supporting training programmes on critical appraisal. Elsewhere, attempts have been made to teach librarians themselves to use critical appraisal skills (as in courses run by ScHARR such as MICADoH (Department of Health), CAST@NET (North Thames) and CATALIST (Trent)). Yet other librarians have attended generic week-long Teaching Evidence-based Practice workshops across England and Wales. The net result has been a broad skills base on which to build evidence-based librarianship. Are the tools available to finish the job? Certainly there is no reason why a health information topic that is addressed by a randomized controlled trial or by a qualitative study cannot be assessed using a generic appraisal checklist from the Users’ Guides or CASP inventory. Indeed, this column concludes with a brief example of such an appraisal for a mainstream medical article. However, the acknowledged dearth of articles that use rigorous designs in the health information literature makes it preferable to use a pragmatic ‘best available evidence’ approach rather than create a set of purist ‘best evidence’ non-Users’ guides! The CRISTAL team therefore began by trying to identify common question types from the health information literature that might be analogous to the questions asked by clinicians (so, for example, instead of diagnosis librarians need to answer questions about user needs).
A two-pronged approach was used whereby one investigator examined existing checklists to identify criteria of potential application to the user needs literature (a deconstructing approach) while the other started with a user needs research paper and identified questions that might need to be asked of it (a constructing approach). The two lists of criteria were then aggregated, debated, refined and reduced to produce a draft, although slightly unwieldy, instrument for piloting. The user needs instrument has been presented to a group of librarians for initial comment and is about to undergo more formal piloting. Copies of the draft instrument are available from the column editor on request (A.Booth@sheffield.ac.uk) and the team would welcome suggestions for future types of question to be addressed by the appraisal checklist approach.
In the meantime there is a substantial proportion of our evidence base yet to be captured through use of established critical appraisal instruments. Suppose, for example, you were to read an article entitled ‘Randomised controlled trial comparing effectiveness of touch screen system with leaflet for providing women with information on prenatal tests’7 in this year’s BMJ. What might you infer from this article about the relative merits of printed and electronic approaches in providing information? Consider the following critically appraised topic (CAT).
Question. In women booking antenatal care (population) is a touch screen system alongside a leaflet (intervention) more effective in terms of uptake, understanding, satisfaction with information and levels of anxiety (outcomes) than a leaflet alone (comparison)?
Design. Randomised controlled trial; intervention group (touch screen and leaflet), control group (leaflet only).
Setting. Antenatal clinic in university teaching hospital.
Subjects. 875 women booking antenatal care.
Outcome measures. Informed decision making on prenatal testing as measured by uptake and understanding of five tests, satisfaction with information received and anxiety levels as measured by the Spielberger state-trait anxiety inventory (STAI).
Results. The only significant difference in uptake was that more women in the touch screen group underwent detailed anomaly scanning (P = 0.0014). Both groups showed significant improvements in knowledge over baseline (16 weeks gestation) by the time of the second questionnaire (20 weeks gestation). Both groups reported high levels of satisfaction with the leaflet, with over 95% of the touch screen group also reporting that they would recommend the touch screen to other pregnant women. Compared with the baseline questionnaire, anxiety had declined significantly in the touch screen group, mainly amongst ‘first-time’ (nulliparous) pregnancies.
Commentary. A major problem with any experimental information-based intervention is the high level of dropouts during the course of the study, with the likelihood of this increasing with each successive round of questionnaires. This is clearly seen in the flowchart of progress of participants through the trial (a feature of good practice in reporting based on the CONSORT statement). So, of 1477 invited to participate, 280 declined and a further 147 were ineligible. Of the 1050 actually randomized, a further 175 dropped out without filling in the baseline questionnaire. So nearly 41% of potential subjects had dropped out even before the first measurements were taken. This attrition continued, with a further 104 dropping out at the time of the second questionnaire and 37 dropping out at the third and final questionnaire. Clearly there must be concerns about the applicability (or indeed practicability) of such an intervention in practice. Another major limitation is that 47% of participants had received higher education, making the study population unrepresentative of the population at large.
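The attrition figures quoted in the commentary can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable names are our own, and it assumes that the dropouts reported at the second and third questionnaires are sequential and do not overlap, which the commentary implies but does not state outright.

```python
# Participant flow, using only the figures quoted in the commentary.
invited = 1477
declined = 280
ineligible = 147
no_baseline = 175       # randomized but did not complete the baseline questionnaire
lost_at_second = 104    # dropped out at the second questionnaire
lost_at_third = 37      # dropped out at the third and final questionnaire

# Those left after refusals and exclusions were randomized.
randomized = invited - declined - ineligible
completed_baseline = randomized - no_baseline

# Assumption: later losses are sequential and non-overlapping.
completed_second = completed_baseline - lost_at_second
completed_third = completed_second - lost_at_third

# Share of everyone invited who was lost before any measurement was taken.
lost_before_baseline = invited - completed_baseline
pct_lost_before_baseline = 100 * lost_before_baseline / invited

print(f"Randomized: {randomized}")
print(f"Completed baseline questionnaire: {completed_baseline}")
print(f"Lost before baseline: {lost_before_baseline} "
      f"({pct_lost_before_baseline:.1f}% of those invited)")
print(f"Completed final questionnaire: {completed_third}")
```

The count completing the baseline questionnaire matches the 875 subjects given in the CAT, and the proportion lost before baseline agrees with the ‘nearly 41%’ quoted above.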
The authors’ own statement is significant: ‘Like all new technologies, these devices should be subject to rigorous evaluation’,7 whilst in the accompanying commentary Jeremy Wyatt concludes: ‘with limited evidence of benefit for these expensive tools over well designed leaflets they seem to fit best into the National Institute for Clinical Excellence (NICE) category C: for NHS use only in the context of rigorous research studies’.8
How is the evidence base of librarianship informed by such a study? Obviously most readers of Health Libraries Review will not be involved in the provision or use of touch screens for antenatal women. However, is there a counterpoint here to our often wanton embrace of new technologies? Health librarians, myself included, rushed lemming-like to purchase CD-ROM systems in the late 1980s and early 1990s with no evidence that they would actually improve the quality of information retrieval. In fact there is some evidence, albeit limited, that the print form may be, in some respects, markedly more effective. To effectiveness we can add countless other important qualitative questions around preferences, attitudes and willingness to pay. I remember how, even with the advent of the ‘wonderful new CD-ROM’, one senior consultant resolutely refused to change from his monthly routine of scanning printed Index Medicus, ironically for his pet topic of computer assisted decision making! Not wishing to spend an inordinate amount of time stuffing dead ‘red herrings’,9 one could move the debate forward to the current choice between the print-based British National Formulary and its CD-ROM and, now, Web-based counterparts.10 Where are the ‘rigorous research studies’ and the ‘rigorous evaluation’ that are going to prevent a thousand health librarians across the UK from making a wrong or, at the very least, an uninformed decision about such an acquisition issue?
The fact that the WeBNF is currently available free of charge on the Internet can be seen as an irrelevance if one includes the potential cost of not retrieving the right information within a broader evaluation framework. Should we be moving as a profession to the starting point ‘prove to me that it is better’ rather than the acquiescent ‘well it doesn’t seem to be any worse’? Granted, the topic of the above critical appraisal may be of limited direct applicability. This is not always the case, however: a randomized controlled trial of end user training illustrates that much more central areas of our ‘business’ are also addressed by research.11 Hopefully this brief example shows that even just starting to adopt a questioning, critical attitude, and supporting this with the easily acquired skills of appraisal, moves us towards including our own new health information technologies within the prevailing climate of health technology assessment. In fact the above-mentioned Jeremy Wyatt is already looking at incorporating the National Electronic Library for Health within such an evaluative framework. Perhaps this could encourage us all to replace our rose-coloured pince-nez with a more apposite magnifying glass!
Andrew Booth (Fri,) studied this question.