Thorax 2010 Ralph 863 9
Tuberculosis
ABSTRACT
Background The grading of radiological severity in
clinical trials in tuberculosis (TB) remains unstandardised.
The aim of this study was to generate and validate
a numerical score for grading chest x-ray (CXR) severity
and predicting response to treatment in adults with
smear-positive pulmonary TB.
Methods At a TB clinic in Papua, Indonesia, serial CXRs
were performed at diagnosis, 2 and 6 months in 115
adults with smear-positive pulmonary TB. Radiographic
findings predictive of 2-month sputum microscopy status
were used to generate a score. The validity of the score
was then assessed in a second data set of 139
comparable adults with TB, recruited 4 years later at the
same site. Relationships between the CXR score and
other measures of TB severity were examined.
Results The estimated proportion of lung affected and
presence of cavitation, but not cavity size or other
radiological findings, significantly predicted outcome and
were combined to derive a score given by percentage of
lung affected plus 40 if cavitation was present. As well
as predicting 2-month outcome, scores were
significantly associated with sputum smear grade at
diagnosis (p<0.001), body mass index, lung function,
haemoglobin, exercise tolerance and quality of life
(p<0.02 for each). In the validation data set, baseline
CXR score predicted 2-month smear status significantly
more accurately than did the proportion of lung affected
alone. In both data sets, CXR scores decreased over time
(p<0.001).
Conclusion This simple, validated method for grading
CXR severity in adults with smear-positive pulmonary TB
correlates with baseline clinical and microbiological
severity and response to treatment, and is suitable for
use in clinical trials.
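The scoring rule stated in the Results above can be written out as a short function. This is a minimal sketch in Python; the function name `cxr_score` and its argument names are illustrative assumptions and do not appear in the paper.

```python
def cxr_score(percent_lung_affected: float, cavitation_present: bool) -> float:
    """CXR severity score: percentage of lung affected, plus 40 if cavitation is present."""
    if not 0 <= percent_lung_affected <= 100:
        raise ValueError("percent_lung_affected must be between 0 and 100")
    return percent_lung_affected + (40 if cavitation_present else 0)


# Hypothetical readings, for illustration only (not study data).
print(cxr_score(30, False))   # 30
print(cxr_score(30, True))    # 70
print(cxr_score(100, True))   # 140, the largest value the rule can produce
```

Under this rule the score ranges from 0 to 140.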
INTRODUCTION
Sputum smear microscopy, and culture where
available, are standardised modalities for diagnosing
and monitoring treatment response in pulmonary
tuberculosis (TB). Chest radiography (CXR)
provides useful information regarding disease extent
and progress, but there is no agreed-upon, validated
system for grading the severity of CXR abnormalities in bacteriologically proven pulmonary TB.
Several methods were devised for this purpose at the
time of early TB treatment trials, such as those
described by the Madras TB Chemotherapy Centre
METHODS
Study setting
The study was conducted at a community-based
TB clinic in Timika, Papua Province, Indonesia.
Timika has a population of ~200 000 and an
estimated TB incidence of 311/100 000.25
Participants
Adults (>15 years) diagnosed with sputum smear-positive
pulmonary TB who gave written informed consent were eligible
for enrolment in the study. Study participants were recruited
during two time periods: 2003-2004 (the derivation data set)
and 2008-2009 (the validation data set). The demographic,
clinical and microbiological findings and outcomes in the first
data set have been reported previously.26 27
Chest radiography
Standard full-size posteroanterior CXRs were performed at the
time of TB diagnosis and 2 and 6 months thereafter, with
reports provided by a clinician at the field site (first data set,
PMK; second data set, APR) and, additionally for the first data
set, by one of two radiologists (MJW or GD). During the first
data collection period, the presence of small (1-2 mm) or large
(>2 mm) nodules, patchy or confluent consolidation, cavitation,
bronchial lesions or fibrosis was reported for each of three zones
(upper, mid or lower zones) in each lung. The presence of effusion or lymphadenopathy was reported, the total percentage of
each lung affected by any pathology was estimated, total cavity
size in millimetres was recorded and the effusion volume
(percentage of lung field) was estimated. To grade the percentage
of affected lung, visual estimation of the extent of opacification,
cavitation or other pathology as a percentage of visible lung was
made; dense opacification of a zone was graded as 100% of that
zone, while patchy opacification within a zone attracted scores
<100% depending on the extent of opacification. Other remarks
including presence of miliary disease were recorded. During the
second data collection period, a simplified CXR report method
was used (percentage lung affected, cavitation (0, <4 cm,
≥4 cm), effusion (0, <25%, ≥25% of hemithorax), presence of
consolidation, fibrosis, nodules, miliary disease). Reporters were
blinded to HIV status, bacteriological and clinical parameters
and treatment outcome.
Outcome measure
The outcome measure used in this study is 2-month sputum
AFB microscopy status. Two-month smear positivity has been
previously shown to predict unfavourable outcomes including
treatment failure and death,32-34 and determines the need for
continued intensive-phase treatment versus switching to
continuation-phase therapy.30 Although an imperfect predictor
Data analysis
Statistical calculations were performed using Intercooled Stata
10.1 (StataCorp, College Station, Texas, USA); graphs were
created in GraphPad Prism 5 (GraphPad, La Jolla, California, USA).
Statistical tests were two sided, with a p value of <0.05 indicating
statistical significance. Intergroup differences in means or medians
were compared using two-sample t tests, Wilcoxon rank sum
tests, analysis of variance or Kruskal-Wallis tests as appropriate.
Agreement between reporters in the derivation data set was
tested using the concordance coefficient, rc, for continuous
variables or the kappa statistic for categorical variables. Prevalence-adjusted, bias-adjusted kappa values were calculated
according to the method described by Byrt et al.36 Kappa values
were interpreted according to guidelines given by Landis and
Koch37 (kappa ≤0.00, poor; 0.00-0.20, slight; 0.21-0.40, fair;
0.41-0.60, moderate; 0.61-0.80, substantial; 0.81-1.00, almost
perfect).
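The agreement statistics described above can be illustrated for a single dichotomous finding read by two reporters. The sketch below is not the authors' Stata code. It assumes the standard definitions: Cohen's kappa as (observed agreement minus chance agreement) / (1 minus chance agreement), and the prevalence-adjusted, bias-adjusted kappa of Byrt et al as 2 x observed agreement minus 1. The function names and the 2x2 counts are hypothetical.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 agreement table.

    a = both reporters say "present", d = both say "absent",
    b and c = the two kinds of disagreement.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal totals of each reporter.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def pabak(a: int, b: int, c: int, d: int) -> float:
    """Prevalence-adjusted, bias-adjusted kappa: 2 * observed agreement - 1."""
    n = a + b + c + d
    return 2 * (a + d) / n - 1


def interpret_kappa(k: float) -> str:
    """Categories as quoted in the text from Landis and Koch."""
    if k <= 0.00:
        return "poor"
    if k <= 0.20:
        return "slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical counts: a very common finding, with 91% raw agreement.
counts = (90, 4, 5, 1)
k = cohen_kappa(*counts)
p = pabak(*counts)
print(round(k, 2), interpret_kappa(k))  # 0.13 slight
print(round(p, 2), interpret_kappa(p))  # 0.82 almost perfect
```

When a finding is present in almost every film, chance agreement is high and the unadjusted kappa is low even though the two reporters rarely disagree; the adjusted statistic removes that dependence on prevalence.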
The relationships between radiographic findings and clinical
outcome were examined by multivariable regression analysis,
using a forward stepwise approach in which any radiological
variable found to be significant (p<0.05) in univariate analysis
was included in the initial model. Goodness of fit of final models
was assessed using the Hosmer-Lemeshow test, and models were compared
using the likelihood ratio test. The weighting for a numerical
radiological score was derived from the regression coefficients.
Its ability to predict outcome in the validation data set was
determined using receiver operating characteristic (ROC) analysis
(area under the curve (AUC)). The relationships between this score
and demographic, biological and clinical variables were determined in data sets 1 and 2 using regression models built on the
same principles.
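As an illustration of the ROC step, the AUC can be computed directly as the proportion of (positive, negative) patient pairs in which the patient with the positive outcome has the higher score, with ties counted as one half. The sketch below is a plain-Python illustration of that definition, not the Stata procedure used in the study; the function name and the data are hypothetical.

```python
def roc_auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    scores   : baseline scores, one per patient
    outcomes : 1 if the 2-month smear was positive, 0 if negative
    """
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and outcomes, for illustration only.
scores = [120, 95, 80, 60, 75, 40, 30, 20]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]
print(roc_auc(scores, outcomes))  # 0.9375
```

An AUC of 0.5 corresponds to a score with no discriminatory value and 1.0 to perfect separation of the two outcome groups.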
Ethics
Approval was granted by the ethics committees of the National
Institute of Health Research and Development (Jakarta,
Indonesia), Menzies School of Health Research (Darwin,
Australia) and the Australian National University (Canberra,
Australia). Written informed consent was obtained from
participants in Indonesian or an appropriate Papuan language.
RESULTS
Characteristics of study participants in the two data collection
phases are shown in table 1. All participants had smear-positive
pulmonary TB (≥2 AFB smear-positive sputum samples); the
result of an additional sample provided for microscopy and
culture on the day of treatment commencement is reported here.
This was negative in 5.7% and 7.2% of participants in the two
data sets, respectively, despite their previous samples being
positive. Initial smear grade predicted the likelihood of smear
conversion by 2 months. In the derivation data set, failure to
convert to smear negative by 2 months was observed in 60.9% of
patients with a baseline smear grade of 3+ and in 38.7% of
patients with a baseline smear grade of <3+ (p=0.051). In the
validation data set, failure to convert to smear negative by
2 months was observed in 48.4% of patients with a baseline
smear grade of 3+ and in 11.8% of patients with a baseline smear
grade of <3+ (p<0.001).
CXR reports were available at baseline, 2 and 6 months for
112, 76 and 76 study participants in the first data set, and 136,
93 and 76 study participants in the second data set (incomplete
in the second data set as 30 of 139 had not yet completed
Thorax 2010;65:863-869. doi:10.1136/thx.2010.136242
Table 1
                                             Derivation data set    Validation data set
Demographic details
  Number                                     115                    139
  Age in years: median (range)               30 (17-69)             27 (15-65)
  Female gender, n (%)                       33 (28.7)              48 (34.5)
  Papuan ethnicity, n (%)                    57 (49.6)              66 (47.5)
  Smokers, n (%)                             38 (33.0)              41 (29.5)
  HIV positive: no./no. tested (%)           5/112 (4.5)            16/121 (13.2)
  MDR-TB, n (%)                              2 (1.7)                2 (1.4)
Baseline clinical findings
  BMI, median (range), kg/m2                 18.6 (14.2-25.2)       19.0 (12.9-32.5)
  Haemoglobin, median (range), g/dl          11.2 (6.8-18.0)        12.2 (7.1-16.0)
  FEV1: median (range), litres               1.76 (0.49-4.12)       1.70 (0.59-3.56)
  SGRQ total score, median (range)           45.3 (2.5-83.5)        37.8 (5.2-91.9)
  6 min walk distance, median (range), m     405 (185-625)          410 (20-612)
Sputum AFB smear grade at diagnosis, n (%)
  0*                                         6 (5.7)                10 (7.2)
  Scanty or 1+                               25 (23.6)              65 (45.7)
  2+                                         28 (26.4)              35 (25.2)
  3+                                         47 (44.3)              29 (20.9)
2-month smear status, n (%)
  Positive                                   25 (21.8)              31 (22.3)
  Negative                                   81 (70.4)              95 (68.4)
  No result available                        9 (7.8)                13 (9.3)
6-month outcome, n (%)
  Cured/completed                            88 (76.5)              92 (66.2)
  Died                                       3 (2.6)                2 (1.4)
  Failed                                     2 (1.7)                1 (0.7)
  Default                                    13 (11.3)              8 (5.8)
  Transferred                                9 (7.8)                6 (4.3)
  6 months not yet completed                 0                      30 (21.6)
*All study participants had at least two prior smear-positive sputum samples; some are
reported as negative since this result pertains to the additional spot specimen provided at
enrolment into the study (see the methods section).
AFB, acid-fast bacilli; BMI, body mass index; FEV1, forced expiratory volume in 1 s;
MDR-TB, multidrug-resistant tuberculosis; SGRQ, St George's Respiratory Questionnaire.
Table 2
Validation
data set
136
93
76*
104 (76.5)
77 (56.6)
24 (17.3)
5 (3.7)
29 (21.3)
10 (7.4)
41.5 (0-100)
69 (0-140)
29 (0-140)
10 (0-115)
38.8
80.2
10
66
76
85
(0-133)
(0-140)
(11-140)
(4-140)
Inter-rater agreement among dichotomous variables
                                                   Prevalence-adjusted,   Interpretation of prevalence-adjusted,
Variable                                  Kappa    bias-adjusted kappa    bias-adjusted kappa
Presence of patchy consolidation          0.20     0.70                   Substantial
Presence of confluent consolidation       0.33     0.31                   Fair
Presence of any consolidation             0.24     0.81                   Almost perfect
Presence of small nodules                 0.12     0.06                   Slight
Presence of large nodules                 0.09     0.09                   Slight
Presence of any nodules                   0.19     0.19                   Slight
Presence of fibrosis                      0.08     0.06                   Slight
Presence of cavitation                    0.33     0.37                   Fair
Presence of effusion                      -0.35    -0.61                  Poor
Presence of lymphadenopathy               0.01     -0.09                  Poor
Concordance among continuous variables
rc
0.85
0.69
28.20 to 22.46 %
56.62 to 50.66 mm
Table 4 Relationship between radiological and biological parameters
and 2-month sputum acid-fast bacilli density in the derivation data set,
showing results of univariate logistic regression analyses*
Radiological parameters
Presence of any consolidation
Per 1% increments
Per 20% increments
Cavitary disease
Effusion
Nodules
Fibrosis
Lymphadenopathy
Odds ratio
95% CI
1.30
1.90
3.26
1.50
1.17
2.23
1.43
0.14
1.30
1.11
0.59
0.41
0.86
0.57
to
to
to
to
to
to
to
p Value
12.18
2.68
9.56
3.82
3.32
5.79
3.56
0.001
0.001
0.032
0.590
0.773
0.097
0.448
*OR was unable to be calculated for the two instances of miliary disease (both had
negative sputum at 2 months).
Figure 2 Chest x-ray (CXR) score according to body mass index (BMI),
percentage of predicted forced expiratory volume in 1 s (FEV1) and
haemoglobin (Hb) at diagnosis (derivation data set).
significantly associated with age in univariate or multivariate
analyses. Mean baseline CXR score in people with unfavourable
(positive) 2-month outcomes was significantly higher (88.2; 95%
CI 76.5 to 99.9) than in those with a favourable outcome (56.8;
95% CI 49.7 to 64.0), but the range of scores in each smear grade
was wide (figure 1). Scores were also significantly associated
with baseline microscopy grade (figure 1). CXR scores were
inversely related to BMI, FEV1, Hb and 6 min walk distance,
were directly related to SGRQ total score (higher SGRQ scores
indicate worse quality of life) and significantly decreased over
time (figures 2-4).
Figure 4 Chest x-ray (CXR) score at
diagnosis, 8 weeks (end of intensive
treatment phase) and 24 weeks (end of
treatment).
were found in the initial data set between CXR score and each of
the clinical/laboratory measures (BMI, Hb, FEV1, SGRQ total
score and 6 min walk distance; p<0.05 in each case).
Comparing ROC areas to predict outcome, the weighted
CXR score (AUC 0.75) was significantly better at predicting
2-month smear status than the percentage lung affected alone
(AUC 0.69; p=0.013, χ² test; figure 5). The optimal cut-off point
for weighted CXR score (value furthest from the diagonal) was
71, at which value the sensitivity for predicting a positive
sputum smear status at 2 months was 80% (95% CI 61.4 to
92.3) and specificity 67.7% (95% CI 57.3 to 77.1). Comparative
sensitivity and specificity values are shown in table 5.
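The cut-off selection described above can be sketched as follows. The point on the ROC curve furthest from the diagonal is the one that maximises sensitivity + specificity - 1 (Youden's index), since the distance of a point from the diagonal is proportional to that quantity. The code below is an illustration under that assumption, with hypothetical function names and data; it does not reproduce the study's analysis.

```python
def sens_spec(scores, outcomes, cutoff):
    """Sensitivity and specificity when score >= cutoff predicts a positive smear."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, outcomes) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, outcomes) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) maximising Youden's index."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sens_spec(scores, outcomes, cutoff)
        youden = sens + spec - 1
        # Strict '>' keeps the lowest cut-off if two cut-offs tie.
        if best is None or youden > best[0]:
            best = (youden, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical scores and 2-month outcomes (1 = smear positive), for illustration only.
scores = [120, 95, 80, 70, 50, 75, 40, 30, 20]
outcomes = [1, 1, 1, 1, 1, 0, 0, 0, 0]
print(best_cutoff(scores, outcomes))  # (50, 1.0, 0.75)
```

A cut-off chosen this way weights sensitivity and specificity equally; a trial that cares more about one type of error could reasonably choose a different threshold.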
DISCUSSION
[Figure 5: ROC curves, sensitivity (y axis) against 1-specificity (x axis). Legend: %Lung affected ROC area 0.6867; reference line; optimal cut-point.]
score for each CXR. The score shows good correlation with
baseline bacteriological and clinical severity markers, and is
sensitive to changes over time. The score performs better than
its individual components: it was significantly better at
predicting outcome than was the percentage of lung affected
alone, and was significantly associated with a broader range of
baseline severity measures (BMI, Hb, exercise tolerance and
quality of life) than presence of cavitation alone. Advantages of
this method are that CXR assessment does not require aids, grids
or rulers, and it is derived by fitting a statistical model to
outcome data rather than by assigning points based on assumed
relative importance of radiographic pathologies. It has been
Specificity
35.9
68.0
89.7
44.9
18.0
71.2
98.7
22.6
62.4
87.1
52.7
22.6
67.7
91.4
61.3
88.2
validated in an independent data set, and offers a single, standardised solution where there are currently multiple unvalidated
methods in use.
The proportion of lung affected and/or cavitation feature as
the most important measures in many TB CXR grading methods.1-5 7 Cavitation is well recognised to correlate with bacillary
load.7 39 We confirmed the association between cavitation and
bacteriological measures (baseline and 2-month sputum smear
status), and additionally showed cavitary disease to be predictive
of worse lung function. The proportion of lung affected was
associated with both bacteriological and a range of clinical
measures.
This score was derived in adult patients with TB with smear-positive pulmonary disease, in a setting with relatively low rates
of HIV-TB co-infection and MDR-TB. The score requires
further evaluation in populations with high HIV prevalence, in
whom CXR findings characteristic of HIV-TB co-infection
(subtle or absent pathology, non-cavitary disease, lower lobe
infiltrates, hilar lymphadenopathy and pleural effusion)20 40 may
mean that a differently weighted score is needed. Nevertheless,
the score remained valid and applicable in the newer data set in
which HIV-TB co-infection rates were higher (13%); the rise in
HIV prevalence may account for some of the differences
observed between the two data sets. The presence of MDR-TB
would not be expected to alter radiographic patterns, other than
being associated potentially with higher scores and smaller
incremental improvements over time.
Potential limitations of the study include the use of 2-month
smear status as an outcome measure (rather than a longer term
measure such as 6-month outcome or recurrence).
The absence of suitable biomarkers or other surrogate end
points in TB research is readily acknowledged, and recent
estimates derived from meta-analysis found a sensitivity of only
57% and specificity of 81% for 2-month smear status in
predicting treatment failure.35 Nevertheless, until more suitable
measures become available, 2-month smear status remains
a suitable outcome measure.30 32-34
Another limitation was the inherent problem of limited
inter-rater agreement in CXR assessment. The low rates of
clinician-radiologist agreement on CXR
findings identified in the derivation data set are not unusual, with
only fair or poor agreement between radiologists and clinicians
also being reported elsewhere.22 23 This emphasises the importance of using simple rather than complex scores and ensuring
individuals allocating CXR scores participate in continuing
education to maximise agreement. The score derived from radiologist CXR evaluation in the first data set is simple. Moreover, it
was shown to be valid in the second data set when used by an
independent TB clinician, rather than a radiologist, confirming its
practical utility in a clinical and trial setting. Some systematic
differences in CXR results were noted between the two data sets;
while this may represent systematic difference in reporting
styles, the findings are in keeping with the possibility of less
severe disease in the validation data set, as indicated by their
lower bacillary burden (with slides read by the same senior
laboratory technician during both data collection periods).
data did not indicate this. This method can be used where
a numerical score is required for the purpose of comparing
radiographic severity between adults with smear-positive
pulmonary TB, and to monitor an individual's improvement
over time, such as in clinical trials of drug efficacy in TB.
Acknowledgements We thank the following for their support and assistance:
Dr M Okoseray, Pak Penias and Pak E Meokbun and the Timika District Health
Authority; Dr Dina Bisara Lolong and Ibu Meryani Girsang and the National Institute
of Health Research and Development, Jakarta; Dr P Penttinen, Dr M Bangs and
Dr M Stone, Public Health & Malaria Control (PHMC) and International SOS; Pak
Istanto and PHMC laboratory staff; Pak J Lempoy and Timika TB clinic staff;
Dr P. Sugiarto and Mimika Community Hospital (RSMM); Natalia Dwi Haryanti,
Sri Hasmunik, Sri Rahayu, G Bellatrix and clinical and laboratory staff, NIHRD-MSHR
Timika research programme; Mr R Lumb and Dr I Bastian at the Institute of Medical
and Veterinary Science; and Associate Professor R Price, MSHR.
Funding Australian Respiratory Council, the Royal Australasian College of Physicians
(Covance award), Australian National Health and Medical Research Council.
Competing interests None.
Ethics approval This study was conducted with the approval of the Human Research
Ethics Committees of the NT Department of Health & Families and Menzies School of
Health Research, Australia, the Australian National University, and the National
Institute for Health Research and Development, Indonesia.
Provenance and peer review Not commissioned; externally peer reviewed.
CONCLUSION
In summary, we have derived and validated a simple method for
grading CXR severity in adults with smear-positive pulmonary
TB that predicts baseline clinical and microbiological severity
and response to treatment in two separate patient populations.
Although finer discriminatory accuracy might be achieved by
collecting more detailed CXR findings (such as cavity size), our