Critical Appraisal and Causal Inference
Professor John Kaldor, RMA Member
Paper prepared from edited transcripts of the RMA Forum
March 2004
Introduction
The primary purpose of this presentation is to convey the main scientific ideas behind critical appraisal and
causal inference, the two fundamental processes involved in the creation of a Statement of Principles (SOP) by the RMA.
The development of the SOP always begins with a literature search. The RMA Secretariat uses computer databases
and a range of other mechanisms to identify all publications that might conceivably qualify as "sound medical-scientific
evidence" of relevance to the SOP. Each publication generally reports the results of a single study that investigated
the role of one or more factors in causing the disease referred to by the SOP.
Once these publications have been gathered together, the process of critical appraisal is applied to each one
in turn. Critical appraisal is a systematic method developed in the field of health research for reviewing the
results of published studies. Although it demands a good understanding of the principles of epidemiology and statistics,
critical appraisal can be undertaken by people who are not specialised in the content area of the research being
reviewed. Indeed, the whole premise of critical appraisal is that a general reader can identify standard features
of studies from their published reports, and that consideration of these features can then allow a judgement to
be made on the quality of the study and the implications of its findings.
At the RMA, the literature search and critical appraisal steps are undertaken between the regular monthly meetings,
and result in what is known as the submission relating to a proposed SOP. At the monthly meetings, the RMA members
jointly examine the submission, and use the methods of causal inference to determine which, if any, of the various
factors that have been considered in relation to a particular disease are causal. Causation can be determined at
the reasonable hypothesis or balance of probabilities level.
What is a cause?
Before discussing critical appraisal and causal inference in more detail, it will be useful to consider what
we actually mean by the cause of a disease or an injury. For some conditions, causes seem obvious. For example,
a vehicle crashes, and a person who was perfectly healthy before the crash is rescued and found to have a fracture.
In this situation, it seems indisputable that the crash caused the fracture. There is no reason to look for other
causes, even though there is a remote probability of another factor having been responsible. Nor do we feel compelled
to do a literature search to find whether there are any published studies that compare the numbers of fractures
in people who have just experienced a road crash with the numbers in those who have not!
Similarly, if a person goes on a long march on a hot day and at the end of the day has blistered feet and a
headache, there is little doubt about the causal pathways. It is entirely reasonable to conclude that ill-fitting
footwear caused the blisters and dehydration from sun exposure caused the headache. Causation is unequivocally
demonstrated by the close proximity in time of the causal factor and its effect, in combination with
the obvious physical pathways that provide the connection between the two.
Defining a factor as a cause of a disease is much more complicated for diseases that occur a long time after
exposure to the factor. If a person is exposed to herbicide in the context of Vietnam service, or some other situation,
and then 20 years later develops leukaemia, was exposure to the herbicide the cause?
We know little about the causes of leukaemia, and the person could have been exposed to many other factors both
before and after the herbicide exposure. In these situations, a factor can only be determined to be a cause of
the disease in question if there is corroborating evidence from epidemiological studies. Specifically, we need
studies that determine whether there was a higher rate of disease occurrence in those exposed to the factor than
in those who were not. Although "clinical judgment" and animal experiments may play some role, we rely
almost entirely on the human epidemiological studies to tell us whether there was a difference in the disease rates
between those exposed and unexposed to the factor.
These studies are essentially measuring the relative probability of the disease occurring in people who are
exposed, compared to those who are unexposed. If the probability of disease is higher in the exposed group, then
we can say there may be an association between the factor and the disease. Once an association has been established,
we can then go to the next step and consider whether it is likely to be causal. It is important to note that a factor
may be associated with a disease, without actually being a cause.
Through the science of epidemiology, a number of different methods have been devised for comparing people who
are exposed to a factor with those who are not, and determining whether exposure is associated with the development
of disease. Each method has its own strengths and weaknesses, and none is perfect or applicable in all
situations. Nevertheless, they generally yield an estimate of a measure of association, usually one known
as the relative risk.
If there is no association at all between the factor and disease, the resulting relative risk is one (Figure
1). Simply stated, the amount of disease in the exposed group is the same as in the unexposed group, and the ratio
resulting from dividing one by the other is one. A relative risk substantially above one provides a strong suggestion
that exposure may be causally related to the disease. For example, a relative risk of three means that people in
the exposed group were found to have the disease three times as often as those in the comparison group.
Figure 1. Relative risk

Sometimes studies show that the amount of disease in the exposed group is less than the amount in the unexposed
group. This finding suggests that exposure to the factor may actually be protective, in the sense that it reduces
the chance that the disease will occur.
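The arithmetic behind these three situations can be sketched in a few lines of Python. This is only an illustration: the function names and the counts of people and cases are hypothetical and are not taken from any study considered by the RMA.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def percent_change_in_risk(rr):
    """Express a relative risk as a percentage increase (or decrease) in risk."""
    return (rr - 1.0) * 100.0


def interpret(rr, tolerance=1e-9):
    """Very rough verbal reading of a relative risk; ignores chance, bias and confounding."""
    if abs(rr - 1.0) <= tolerance:
        return "no association"
    return "possible increased risk" if rr > 1.0 else "possible protective effect"


# Hypothetical studies: (exposed cases, exposed total, unexposed cases, unexposed total)
studies = {
    "same risk in both groups": (30, 1000, 30, 1000),
    "more disease in exposed": (90, 1000, 30, 1000),
    "less disease in exposed": (15, 1000, 30, 1000),
}

for label, counts in studies.items():
    rr = relative_risk(*counts)
    print(f"{label}: RR = {rr:.2f} ({percent_change_in_risk(rr):+.0f}%), {interpret(rr)}")
```

The verbal reading is deliberately crude. It says nothing about whether the association is real or causal, which is the subject of the rest of this paper.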
The RMA does consider protective factors, but more often is faced with studies that report factors that seem
to have a weak or moderate effect, in that the relative risk is increased above 1.0, but not by very much. Most
often, relative risks are reported in the range 1.25 up to about 1.75 (indicating increases in risk from 25% up
to 75%), providing some suggestion of causality, but still leaving room for doubt unless the studies are of very
high quality, or there are a number of studies of reasonable quality all showing a consistent result. Relative
risks that are well above two are hard to dispute as providing strong evidence of causality. There would usually
have to be some major weaknesses or inconsistencies in available studies for the RMA not to draw a conclusion of
causality in this circumstance.
Steps in critical appraisal
Evaluation of study quality is the role of critical appraisal. The published record of each study provides the
raw material, and a systematic approach to reading the paper is taken to examine exactly what methods were used
to conduct the study, and thereby assess the likelihood that the published relative risk is valid.
In carrying out its critical appraisal, the RMA depends crucially on the quality of published reports. Even
if a study was of fundamentally high quality and published in a prestigious journal, it is not always clearly described
in the written report. A considerable amount of experience in critical appraisal is required before a practitioner
can confidently decide what has been left unsaid or partially stated in a published report.
When conducting a critical appraisal of a published report, the first step is to determine what the author of
the paper was investigating. Usually there will be a stated research question, such as "does exposure to herbicides
cause leukaemia?", stated either in the title of the paper, the abstract, or the introduction.
Next, it is essential to decide what particular study design is being used by the researchers to answer the
question. Generally, the design will be a randomised trial, a cohort study, a case-control study, a cross-sectional
study or a correlational study. Many publications will state the design, but a number do not, and specialised knowledge
is required to be able to infer what design was used.
Each type of study has different strengths and weaknesses as an epidemiological information-gathering device.
Some of them have strengths in being able to measure the levels of exposure, while others are more suited to accurate
assessment of disease rates. They all have fundamental limitations, and none is perfect.
Having identified the study design, we proceed to considering what factors are being investigated as possible
causes of the disease, and how they are being measured. Ideally, a study has been able to measure the exposure
to the factors directly but, more often, epidemiological investigation relies on surrogate measures of exposure.
For example, to determine whether a person was exposed to a herbicide, the most direct approach would be
measurement of air or blood levels at the time of exposure, but such information is rarely available. The study
may therefore have relied on indirect measures, such as the self-report of the study participants (often years
after exposure occurred).
Critical appraisal also requires that the reviewer of a paper determine how the study assessed the presence
or absence of the disease of interest. Again, the ideal will be direct, objective determination by the research
team, but there are some diseases that can only be assessed indirectly. For example, a study of headache relies
on self-report to make the diagnosis.
The main results of a study are generally presented as relative risks, accompanied by some measure of their
statistical precision. For this purpose we use confidence intervals, which indicate the range of values around
the estimated relative risk that are reasonably compatible with the study data. Another statistical adjunct is
the P-value, which assists in judging whether any observed association may be a chance finding.
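As a rough sketch of how these quantities relate, the Python below computes a relative risk with an approximate 95% confidence interval and a two-sided P-value. It assumes the common large-sample approach based on the logarithm of the relative risk, which is one of several methods and is not necessarily the one used in any given paper. The function name and all counts are hypothetical.

```python
import math


def relative_risk_summary(exposed_cases, exposed_total,
                          unexposed_cases, unexposed_total, z=1.96):
    """Relative risk, approximate 95% confidence interval and two-sided P-value.

    Uses the large-sample standard error of log(RR); assumes no cell is zero.
    """
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    # Wald test of log(RR) = 0 against a standard normal distribution
    z_stat = math.log(rr) / se_log_rr
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return rr, lower, upper, p_value


# Two hypothetical studies with the same relative risk but very different sizes
small = relative_risk_summary(6, 40, 4, 40)
large = relative_risk_summary(600, 4000, 400, 4000)

for name, (rr, lower, upper, p) in (("small study", small), ("large study", large)):
    print(f"{name}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, P = {p:.3g}")
```

Both hypothetical studies report the same relative risk. In the small study the interval is wide and includes one, so chance remains a plausible explanation. In the large study the interval is narrow and lies wholly above one.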
Once the basic elements of the study have been established as clearly as possible from the published report,
the process of critical appraisal then involves assessing the extent to which chance, bias and confounding may
have operated as alternative explanations of the reported findings.
If we see a relative risk that is clearly increased above one, there are generally three broad alternatives
that must be excluded.
First, it could be a chance finding. Perhaps the people exposed to the factor happened to be at greater risk of
getting the disease than those who were not exposed, for reasons completely unrelated to the factor, such as
their genetic makeup. Such an explanation is generally impossible to prove, but when studies are very small,
chance can never be ruled out as the reason for the observed difference.
A second explanation is that the study was subject to some form of methodological bias. Ideally study measurements
are carried out in an objective way, but in practice, a subjective element is often introduced as a result of the
study design. Consider, for example, studies that collect information on past exposure from people who have a serious
illness and a comparison group of people without the illness. It is entirely plausible that the recollections and
perceptions of the two groups may differ, even if they in fact had identical exposure histories.
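The effect of differing recollections can be illustrated with a small calculation. The sketch below is hypothetical throughout: it assumes that 30% of both the ill group and the comparison group were truly exposed, and that ill people report 90% of their true exposures while the comparison group reports only 60%. The measure used is the odds ratio, which is what studies of this kind usually report as their estimate of the relative risk; that choice, like the numbers, is an assumption made for the illustration.

```python
def odds_ratio(proportion_exposed_cases, proportion_exposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = proportion_exposed_cases / (1 - proportion_exposed_cases)
    odds_controls = proportion_exposed_controls / (1 - proportion_exposed_controls)
    return odds_cases / odds_controls


# Hypothetical values: identical true exposure, different completeness of recall
true_exposure = 0.30     # same in people with and without the illness
recall_cases = 0.90      # share of true exposures reported by ill people
recall_controls = 0.60   # share of true exposures reported by the comparison group

reported_cases = true_exposure * recall_cases
reported_controls = true_exposure * recall_controls

true_or = odds_ratio(true_exposure, true_exposure)
observed_or = odds_ratio(reported_cases, reported_controls)

print(f"odds ratio from true exposure:     {true_or:.2f}")
print(f"odds ratio from reported exposure: {observed_or:.2f}")
```

Under these assumptions the study would report an apparent association even though the two groups had exactly the same exposure history.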
The third circumstance that might give rise to apparent associations is the phenomenon of confounding, whereby
people exposed to the factor of interest differ systematically from those unexposed, with regard to some other
factor that is actually a cause of the disease. Critical appraisal provides a standard framework for reviewing
published reports of studies, and determining the potential role of one or more of these alternative explanations.
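Confounding in particular lends itself to a worked example. In the hypothetical data below, smoking stands in for the "other factor": smokers are both more likely to be exposed and more likely to develop the disease, while the exposure itself has no effect within either group. The sketch compares the crude relative risk with a Mantel-Haenszel summary across the two strata, which is one standard way of adjusting for a confounder. All names and counts are invented for illustration.

```python
# Each stratum: cases and totals for exposed and unexposed people (hypothetical)
strata = {
    "smokers":     {"exp_cases": 160, "exp_total": 800, "unexp_cases": 40, "unexp_total": 200},
    "non-smokers": {"exp_cases": 10,  "exp_total": 200, "unexp_cases": 40, "unexp_total": 800},
}


def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def crude_risk_ratio(strata):
    """Relative risk after pooling all strata, ignoring the confounder."""
    totals = {key: sum(s[key] for s in strata.values())
              for key in ("exp_cases", "exp_total", "unexp_cases", "unexp_total")}
    return risk_ratio(**totals)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary relative risk, adjusted for the stratifying factor."""
    numerator = 0.0
    denominator = 0.0
    for s in strata.values():
        n = s["exp_total"] + s["unexp_total"]
        numerator += s["exp_cases"] * s["unexp_total"] / n
        denominator += s["unexp_cases"] * s["exp_total"] / n
    return numerator / denominator


for name, s in strata.items():
    print(f"{name}: RR = {risk_ratio(**s):.2f}")
print(f"crude RR (strata pooled):    {crude_risk_ratio(strata):.2f}")
print(f"adjusted RR (Mantel-Haenszel): {mantel_haenszel_risk_ratio(strata):.2f}")
```

Within each stratum the relative risk is one, yet pooling the strata produces a relative risk above two. The apparent association is entirely due to the uneven distribution of smoking between the exposed and unexposed groups.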
With so many papers published in so many journals, by so many different authors using so many different approaches,
critical appraisal has become an essential tool for reducing the resulting information to a common, comparable
form that allows it to be synthesised in an objective manner.
Evidence for causation
Finally, the RMA takes the information resulting from the critical appraisals of all available studies, and
forms a judgement as to whether there is a basis for determining causal relationships between specific factors
and the disease under consideration. At this point, there are a number of criteria that can be considered, but
there is no simple formula for making the determination.
Most important seems to be the strength and consistency of an association. If ten studies of the relationship
between a factor and a disease all show a relative risk of four or five, it is very likely to be identified as
a causal factor, at both the reasonable hypothesis and balance of probabilities level.
Big relative risks are hard to dismiss as having alternative explanations, and generally provide strong support
for causality, but they are not necessary for the determination of causality. A factor can increase the probability
of disease by twenty per cent or less, and still be a true cause. In this situation, other criteria are often assessed
to support causality.
If a study has been able to quantify exposure in some way, rather than simply classifying participants as exposed
or unexposed, then it may be possible to examine the relationship between exposure levels and the relative risk.
A steady increase in risk with increasing exposure, sometimes referred to as a dose-response, provides evidence
in favour of a causal relationship (Figure 2).
Figure 2. Dose-response effect
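A dose-response pattern of this kind can be checked with a simple calculation. The sketch below uses hypothetical exposure levels and counts, computes the relative risk at each level against the unexposed group, and checks only whether the risk rises at every step. A formal statistical test for trend would go further than this.

```python
def relative_risks_by_level(levels, reference):
    """Relative risk for each exposure level compared with the reference level.

    `levels` maps a level name to a (cases, total) pair.
    """
    ref_cases, ref_total = levels[reference]
    ref_risk = ref_cases / ref_total
    return {name: (cases / total) / ref_risk
            for name, (cases, total) in levels.items()}


def shows_dose_response(relative_risks_in_order):
    """True if the relative risk increases at every step up in exposure."""
    pairs = zip(relative_risks_in_order, relative_risks_in_order[1:])
    return all(later > earlier for earlier, later in pairs)


# Hypothetical study, levels listed from lowest to highest exposure
levels = {
    "none":   (20, 1000),
    "low":    (30, 1000),
    "medium": (50, 1000),
    "high":   (80, 1000),
}

rrs = relative_risks_by_level(levels, reference="none")
ordered = [rrs[name] for name in ("none", "low", "medium", "high")]

for name in ("none", "low", "medium", "high"):
    print(f"{name:>6}: RR = {rrs[name]:.2f}")
print("steady increase with exposure:", shows_dose_response(ordered))
```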
It is also crucial to assess whether the studies provide clear evidence that the exposure was present before the
disease. Although this criterion seems obvious, some study designs (case-control and cross-sectional) involve
assessing exposure at the same time as the disease is diagnosed. It is not always possible to determine from the
information presented that exposure indeed occurred before the onset of disease.
Another important aspect of judging causality is the plausibility of the linkage in terms of the known biology
of the disease, and its overall coherence with the reported epidemiological findings. Although we clearly do not
understand the biological basis of all relationships that we observe, we would be very cautious about attributing
causation in circumstances that directly contradicted our understanding of biology.
Conclusion
The RMA draws on its ever-accumulating experience to try to establish some consistency in the process of drawing
together critical appraisals of published literature and making causal inferences about the relationships between
specific factors and diseases. As you will be aware, we work under legislation that suggests that we should look
for reasons to attribute causality if we can, and that is indeed what we do.
As a result of this underlying philosophy, we tend to make judgements of causality, particularly at the reasonable
hypothesis level, on the basis of somewhat weaker evidence than might be accepted in other contexts. Nevertheless,
we aim to do so in a consistent manner, using comprehensive and systematic reviews of all the information available.
There is no compromise to the overall scientific integrity of the process.
This page last updated 22 March 2005.