A Turbo Code Tutorial
William E. Ryan
New Mexico State University
Box 30001 Dept. 3-O, Las Cruces, NM 88003
[email protected]
Introduction
GR(D) = [ 1   g2(D)/g1(D) ].
Observe that the code sequence corresponding to the encoder input u(D) for the former code is u(D)GNR(D) = [u(D)g1(D)   u(D)g2(D)], and that the identical code sequence is produced in the recursive code by the input sequence u′(D) = u(D)g1(D), since in this case the code sequence is u′(D)GR(D) = u(D)GNR(D). Here, we loosely call the pair of polynomials u(D)GNR(D) a code sequence, although the actual code sequence is derived from this polynomial pair in the usual way.
Observe that, for the recursive encoder, the code sequence will be of finite weight if and only if the input sequence is divisible by g1(D). We have the following immediate corollaries of this fact, which we shall use later.
GR(D) = [ 1   (1 + D^2 + D^3 + D^4) / (1 + D + D^4) ].

Thus, g1(D) = 1 + D + D^4 and g2(D) = 1 + D^2 + D^3 + D^4 or, in octal form, (g1, g2) = (31, 27). Observe that g1(D) is primitive so that, for example, u(D) = 1 + D^15 produces the finite-length code sequence (1 + D^15, 1 + D + D^3 + D^4 + D^7 + D^11 + D^12 + D^13 + D^14 + D^15). Of course, any delayed version of this input, say, D^L(1 + D^15), will simply produce a delayed
version of this code sequence. Fig. 2 gives one encoder
realization for this code. We remark that, in addition to
elaborating on Corollary 2, this example serves to demonstrate the conventions generally used in the literature for
specifying such encoders.
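The divisibility condition in this example is easy to check numerically. Below is a minimal sketch (not from the paper) that performs GF(2) polynomial division, representing a polynomial as the set of exponents whose coefficient is 1; `gf2_divmod` is an illustrative helper name.

```python
# Sketch: GF(2) polynomial division, used to check that u(D) = 1 + D^15
# is divisible by the (primitive) feedback polynomial g1(D) = 1 + D + D^4.
# A polynomial is a set of exponents, so 1 + D + D^4 is {0, 1, 4}.

def gf2_divmod(dividend, divisor):
    """Divide two GF(2) polynomials; return (quotient, remainder)."""
    rem = set(dividend)
    quo = set()
    dmax = max(divisor)
    while rem and max(rem) >= dmax:
        shift = max(rem) - dmax
        quo.add(shift)
        rem ^= {e + shift for e in divisor}  # XOR = addition in GF(2)
    return quo, rem

g1 = {0, 1, 4}        # g1(D) = 1 + D + D^4 (octal 31)
u = {0, 15}           # u(D) = 1 + D^15
q, r = gf2_divmod(u, g1)
print(sorted(q))      # [0, 1, 2, 3, 5, 7, 8, 11]
print(r == set())     # True: u(D) is divisible by g1(D), finite-weight output
```

The quotient q(D) = 1 + D + D^2 + D^3 + D^5 + D^7 + D^8 + D^11; multiplying it by g2(D) yields the parity polynomial of the code sequence quoted above.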
B. The Permuter
As the name implies, the function of the permuter is
to take each incoming block of N data bits and rearrange
them in a pseudo-random fashion prior to encoding by the
second encoder. Unlike the classical interleaver (e.g., block
or convolutional interleaver), which rearranges the bits in
some systematic fashion, it is important that the permuter
sort the bits in a manner that lacks any apparent order, although it might be tailored in a certain way for weight-two
and weight-three inputs as explained in Example 2 below.
Also important is that N be selected quite large, and we shall assume N ≥ 1000 hereafter. The importance of these
two requirements will be illuminated below. We point out
also that one pseudo-random permuter will perform about
as well as any other provided N is large.
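A minimal sketch of such a pseudo-random permuter and its matching de-permuter follows; the function name and the fixed seed (shared so that encoder and decoder agree) are illustrative assumptions, not from the paper.

```python
# Sketch: a pseudo-random permuter P[] and its inverse Pinv[].
import random

def make_permuter(N, seed=0):
    rng = random.Random(seed)   # fixed seed shared by encoder and decoder
    P = list(range(N))
    rng.shuffle(P)              # pseudo-random rearrangement, no apparent order
    Pinv = [0] * N
    for k in range(N):
        Pinv[P[k]] = k          # de-permuter satisfies Pinv[P[k]] = k
    return P, Pinv

N = 1000
P, Pinv = make_permuter(N)
u = [random.randint(0, 1) for _ in range(N)]    # data block
u_perm = [u[P[k]] for k in range(N)]            # permuted block fed to E2
u_back = [u_perm[Pinv[k]] for k in range(N)]    # de-permutation recovers u
print(u_back == u)                              # True
```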
C. The Puncturer
While low-rate codes are appropriate for deep-space applications, in other situations, such as satellite communications, a rate of 1/2 or higher is preferred. The role of the turbo code puncturer is identical to that of its convolutional code counterpart: to periodically delete selected bits to reduce coding overhead. For the case of iterative decoding to be discussed below, it is preferable to delete only parity bits as indicated in Fig. 1, but there is no guarantee that this will maximize the minimum codeword distance.
For example, to achieve a rate of 1/2, one might delete all
even parity bits from the top encoder and all odd parity
bits from the bottom one.
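A sketch of this rate-1/2 puncturing pattern is below; the 0-based indexing convention and the helper name `puncture` are illustrative assumptions.

```python
# Sketch: rate-1/3 -> rate-1/2 puncturing. All systematic bits are kept;
# the two parity streams are alternately deleted (even-indexed parity from
# the top encoder is dropped, odd-indexed parity from the bottom encoder).
def puncture(u, x1p, x2p):
    out = []
    for k in range(len(u)):
        out.append(u[k])                         # systematic bit always kept
        out.append(x1p[k] if k % 2 else x2p[k])  # alternate the parity streams
    return out

u   = [1, 0, 1, 1]
x1p = [0, 1, 1, 0]
x2p = [1, 1, 0, 0]
print(puncture(u, x1p, x2p))   # [1, 1, 0, 1, 1, 0, 1, 0]: 8 bits out for 4 in
```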
D. The Turbo Encoder and Its Performance
As will be elaborated upon in the next section, a maximum-likelihood (ML) sequence decoder would be far too complex for a turbo code due to the presence of the permuter. However, the suboptimum iterative decoding algorithm to be described there offers near-ML performance. Hence, we shall now estimate the performance of an ML decoder (analysis of the iterative decoder is much more difficult).
Armed with the above descriptions of the components of
the turbo encoder of Fig. 1, it is easy to conclude that it
is linear since its components are linear. The constituent
codes are certainly linear, and the permuter is linear since
it may be modeled by a permutation matrix. Further,
the puncturer does not affect linearity since all codewords
share the same puncture locations. As usual, the importance of linearity is that, in considering the performance
of a code, one may choose the all-zeros sequence as a reference. Thus, hereafter we shall assume that the all-zeros
codeword was transmitted.
Now consider the all-zeros codeword (the 0th codeword) and the kth codeword, for some k ∈ {1, 2, ..., 2^N − 1}. The ML decoder will choose the kth codeword over the 0th codeword with probability Q(sqrt(2 d_k r E_b/N_0)), where r is the code rate and d_k is the weight of the kth codeword. With w_k denoting the weight of the kth data word, the bit error rate for this two-codeword situation would then be

P_b(k | 0) = w_k (bit errors/cw error)
           × (1/N) (cw/data bits)
           × Q(sqrt(2 r d_k E_b/N_0)) (cw errors/cw)
           = (w_k/N) Q(sqrt(2 r d_k E_b/N_0)) (bit errors/data bit).

Summing over all competing codewords gives the union bound

P_b ≤ Σ_{k=1}^{2^N − 1} (w_k/N) Q(sqrt(2 r d_k E_b/N_0))
    = Σ_{w=1}^{N} Σ_{v=1}^{(N choose w)} (w/N) Q(sqrt(2 r d_{wv} E_b/N_0)),      (1)

where the second form organizes the codewords by the weight w of the encoder input, with d_{wv} denoting the weight of the vth codeword produced by a weight-w input.
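Individual terms of the union bound (1) are easy to evaluate with the standard Gaussian tail function Q(x) = (1/2) erfc(x/√2). The sketch below uses illustrative parameter values and assumes E_b/N_0 is given in dB.

```python
# Sketch: evaluating one term (w_k/N) Q(sqrt(2 r d_k Eb/N0)) of the union bound (1).
import math

def Q(x):
    """Gaussian tail function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bound_term(w_k, d_k, N, r, EbN0_dB):
    ebn0 = 10.0 ** (EbN0_dB / 10.0)          # convert dB to a linear ratio
    return (w_k / N) * Q(math.sqrt(2.0 * r * d_k * ebn0))

# e.g., a weight-2 data word producing a codeword of weight d_k = 10,
# with N = 1000 and rate r = 1/2, at Eb/N0 = 2 dB (illustrative values):
print(bound_term(2, 10, 1000, 0.5, 2.0))
```

As expected, the term decays rapidly as the codeword weight d_k grows, which is why the minimum-distance terms dominate the bound.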
w = 2: For weight-two inputs, the inner sum in (1) may be approximated as

Σ_{v=1}^{(N choose 2)} (2/N) Q(sqrt(2 r d_{2v} E_b/N_0)) ≈ (2 n_2/N) Q(sqrt(2 r d^TC_{2,min} E_b/N_0)),      (2)

where d^TC_{2,min} is the minimum turbo codeword weight over all weight-two inputs and n_2 is the number of weight-two inputs attaining it.

w = 3: Similarly, for weight-three inputs, the inner sum in (1) may be approximated as

Σ_{v=1}^{(N choose 3)} (3/N) Q(sqrt(2 r d_{3v} E_b/N_0)) ≈ (3 n_3/N) Q(sqrt(2 r d^TC_{3,min} E_b/N_0)),      (3)

where d^TC_{3,min} and n_3 are defined analogously.
Given a weight-three input u(D) divisible by g1(D) (e.g., g1(D) itself in the above example), it becomes very unlikely that the permuted input u′(D) seen by the second encoder will also be divisible by g1(D). For example, suppose u(D) = g1(D) = 1 + D + D^4. Then the permuter output will be a multiple of g1(D) if the three input 1's become the jth, (j+1)th, and (j+4)th bits out of the permuter, for some j. If we imagine that the
permuter acts in a purely random fashion so that the probability that one of the 1's lands in a given position is 1/N, the permuter output will be D^j g1(D) = D^j (1 + D + D^4) with probability 3!/N^3. For comparison, for w = 2 inputs, a given permuter output pattern occurs with probability 2!/N^2. Thus, we would expect the number of weight-three inputs, n_3, resulting in remergent paths in both encoders to be much less than n_2,

n_3 << n_2,

with the result being that the inner sum in (1) for w = 3 is negligible relative to that for w = 2.
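For concreteness, a quick computation (illustrative, not from the paper) comparing these matching probabilities at N = 1000:

```python
# Sketch: the pattern-matching probabilities quoted above, at N = 1000.
from math import factorial

N = 1000
p2 = factorial(2) / N**2    # weight-2 pattern probability, 2!/N^2
p3 = factorial(3) / N**3    # weight-3 pattern probability, 3!/N^3
print(p2, p3, p3 / p2)      # p3 is smaller than p2 by a factor of 3/N
```

The ratio p3/p2 = 3/N shrinks as the block length grows, which is the quantitative reason weight-three inputs contribute so little for large N.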
w ≥ 4: Again, we can approximate the inner sum in (1) for w = 4 in the same manner as in (2) and (3).
III. The Decoder
In the symbol-by-symbol MAP decoder, the decoder decides u_k = +1 if P(u_k = +1 | y) > P(u_k = −1 | y), and it decides u_k = −1 otherwise. More succinctly, the decision û_k is given by

û_k = sign[ L(u_k) ],

where L(u_k) is the log a posteriori probability (LAPP) ratio defined as

L(u_k) ≜ log [ P(u_k = +1 | y) / P(u_k = −1 | y) ].
Incorporating the code's trellis, this may be written as

L(u_k) = log [ ( Σ_{S+} p(s_{k−1} = s′, s_k = s, y)/p(y) ) / ( Σ_{S−} p(s_{k−1} = s′, s_k = s, y)/p(y) ) ],      (5)

where S+ is the set of state-transition pairs (s′, s) corresponding to the input u_k = +1, and S− is the corresponding set for u_k = −1.
" Unfortunately, dividing by simply p(O ) to obtain p(s ; s j O ) also
leads to an unstable algorithm. Obtaining p(s ; s j O)p(yk ) instead
of the APP p(s ; s j O) presents no problem since an APP ratio is
computed so that the unwanted factor p(yk ) cancels; see equation
(16) below.
2
L(u_k) = log [ p(y | u_k = +1) / p(y | u_k = −1) ] + log [ P(u_k = +1) / P(u_k = −1) ],

with the second term representing a priori information.
Since P(u_k = +1) = P(u_k = −1) typically, the a priori term is usually zero for conventional decoders. However, for iterative decoders, D1 receives extrinsic or soft
information for each uk from D2 which serves as a priori information. Similarly, D2 receives extrinsic information from D1 and the decoding iteration proceeds as
D1→D2→D1→D2→..., with the previous decoder passing soft information along to the next decoder at each half-iteration except for the first. The idea behind extrinsic information is that D2 provides soft information to D1 for each u_k, using only information not available to D1 (i.e., E2 parity); D1 does likewise for D2.
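The half-iteration schedule and the permuter/de-permuter wiring just described can be sketched structurally as follows. Here `map_decode` is an assumed stand-in for a full BCJR half-iteration (it returns extrinsic LLRs from systematic samples, parity samples, and incoming a priori LLRs), and the value of Lc is an assumed normalization; none of these names come from the paper.

```python
# Structural sketch of the D1 -> D2 -> D1 -> ... decoding schedule.
def turbo_decode(ys, y1p, y2p, P, Pinv, map_decode, n_iters=8):
    N = len(ys)
    Le21 = [0.0] * N                         # extrinsic from D2, natural order
    for _ in range(n_iters):                 # n_iters >= 1
        # D1 works in natural order with E1 parity and D2's extrinsic info.
        Le12 = map_decode(ys, y1p, Le21)
        # D2 works in permuted order: permute systematic and extrinsic inputs.
        ys_p   = [ys[P[k]] for k in range(N)]
        Le12_p = [Le12[P[k]] for k in range(N)]
        Le21_p = map_decode(ys_p, y2p, Le12_p)
        Le21 = [Le21_p[Pinv[k]] for k in range(N)]   # back to natural order
    Lc = 2.0   # channel reliability; assumed normalization for this sketch
    L = [Lc * ys[k] + Le21[k] + Le12[k] for k in range(N)]
    return [1 if l > 0 else -1 for l in L]   # final hard decisions
```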
An iterative decoder using component BCJR-MAP decoders is shown in Fig. 4. Observe how permuters and
de-permuters are involved in arranging systematic, parity,
and extrinsic information in the proper sequence for each
decoder.
We now show how extrinsic information is extracted from the modified-BCJR version of the LAPP ratio embodied in (16). We first observe that γ_k(s′, s) may be written as (cf. equation (9))

γ_k(s′, s) = P(s | s′) p(y_k | s′, s)
           = P(u_k) p(y_k | u_k),

where the event u_k corresponds to the event s′ → s. Defining

Le(u_k) ≜ log [ P(u_k = +1) / P(u_k = −1) ],
observe that we may write

P(u_k) = [ exp(−Le(u_k)/2) / (1 + exp(−Le(u_k))) ] exp(u_k Le(u_k)/2)
       = A_k exp(u_k Le(u_k)/2),      (17)

where the first equality follows since the right-hand side equals

[ sqrt(P−/P+) / (1 + P−/P+) ] sqrt(P+/P−) = P+ when u_k = +1

and

[ sqrt(P−/P+) / (1 + P−/P+) ] sqrt(P−/P+) = (P−/P+) / (1 + P−/P+) = P− when u_k = −1,

where we have defined P+ ≜ P(u_k = +1) and P− ≜ P(u_k = −1) for convenience. As for p(y_k | u_k), we may write (recall y_k = (y_k^s, y_k^p) and x_k = (x_k^s, x_k^p) = (u_k, x_k^p))

p(y_k | u_k) ∝ exp[ −(y_k^s − u_k)^2 / 2σ^2 ] exp[ −(y_k^p − x_k^p)^2 / 2σ^2 ]
            = exp[ −( (y_k^s)^2 + u_k^2 + (y_k^p)^2 + (x_k^p)^2 ) / 2σ^2 ] exp[ (u_k y_k^s + x_k^p y_k^p) / σ^2 ]
            = B_k exp[ (u_k y_k^s + x_k^p y_k^p) / σ^2 ],

so that, with Lc ≜ 2/σ^2,

γ_k(s′, s) ∝ A_k B_k exp[ u_k Le(u_k)/2 ] exp[ Lc (u_k y_k^s + x_k^p y_k^p) / 2 ].      (18)

Since A_k and B_k do not depend on the particular transition, combining the exponents gives

γ_k(s′, s) ∝ exp[ (1/2) u_k (Le(u_k) + Lc y_k^s) ] exp[ (1/2) Lc y_k^p x_k^p ],      (19)
where C_k ≜ exp[ (1/2) u_k (Le(u_k) + Lc y_k^s) ]. Substituting (19) into the LAPP ratio of (16), the factor C_k(u_k = +1) can be factored out of the summation in the numerator and C_k(u_k = −1) out of the summation in the denominator, giving

L(u_k) = Lc y_k^s + Le(u_k) + log [ Σ_{S+} α̃_{k−1}(s′) γ^e_k(s′, s) β̃_k(s) / Σ_{S−} α̃_{k−1}(s′) γ^e_k(s′, s) β̃_k(s) ],      (20)

where γ^e_k(s′, s) ≜ exp[ (1/2) Lc y_k^p x_k^p ]. The first term in (20) is sometimes called the channel value, the second term represents any a priori information about u_k provided by a previous decoder, and the third term represents extrinsic information that can be passed on to a subsequent decoder. Thus, for example, on any given iteration, D1 computes

L(u_k) = Lc y_k^s + Le_21(u_k) + Le_12(u_k),

where Le_21(u_k) is extrinsic information passed from D2 to D1, and Le_12(u_k) is the third term in (20), which is to be used as extrinsic information from D1 to D2.
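The factorization in (17) can be checked numerically; the check below is illustrative and not part of the paper:

```python
# Numeric check of (17): P(u_k) = A_k * exp(u_k * Le/2),
# with A_k = exp(-Le/2) / (1 + exp(-Le)).
import math

P_plus, P_minus = 0.7, 0.3                     # example a priori probabilities
Le = math.log(P_plus / P_minus)                # a priori LLR Le(u_k)
A = math.exp(-Le / 2) / (1 + math.exp(-Le))    # the factor A_k (= sqrt(P+ P-))
print(abs(A * math.exp(+Le / 2) - P_plus) < 1e-12)    # True: recovers P+
print(abs(A * math.exp(-Le / 2) - P_minus) < 1e-12)   # True: recovers P-
```

Because A_k does not depend on the sign of u_k, it cancels in any APP ratio, which is exactly why (17) is a convenient form for the LAPP computation.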
C. Pseudo-Code for the Iterative Decoder
We do not give pseudo-code for the encoder here since it is much more straightforward. However, it must be emphasized that at least E1 must be terminated correctly to avoid serious degradation. That is, the last m bits of the N-bit information word to be encoded must force E1 to the zero state by the Nth bit.
The pseudo-code given below for iterative decoding of
a turbo code follows directly from the development above.
Implicit is the fact that each decoder must have full knowledge of the trellis of the constituent encoders. For example,
each decoder must have a table (array) containing the input bits and parity bits for all possible state transitions
s′ → s. Also required are permutation and de-permutation
functions (arrays) since D1 and D2 will be sharing reliability information about each u_k, but D2's information is permuted relative to D1's. We denote these arrays by P[·] and Pinv[·], respectively. For example, the permuted word u′ is obtained from the original word u via the pseudo-code statement: for k = 1 : N, u′_k = u_{P[k]}, end. We next point out that, due to the presence of Lc in L(u_k), knowledge of the noise variance N_0/2 by each MAP decoder is necessary. Finally, we mention that a simple way to simulate puncturing is, in the computation of γ_k(s′, s), to set to zero the received parity samples, y_k^{1p} or y_k^{2p}, corresponding to the punctured parity bits, x_k^{1p} or x_k^{2p}. Thus, puncturing need not be performed at the encoder.
===== Initialization =====
D1:
- α̃_0^(1)(s) = 1 for s = 0, α̃_0^(1)(s) = 0 for s ≠ 0
- β̃_N^(1)(s) = 1 for s = 0, β̃_N^(1)(s) = 0 for s ≠ 0 (E1 is terminated)
- Le_21(u_k) = 0 for k = 1, 2, ..., N (first iteration)
D2:
- α̃_0^(2)(s) = 1 for s = 0, α̃_0^(2)(s) = 0 for s ≠ 0
- β̃_N^(2)(s) = α̃_N^(2)(s) for all s (set after computation of {α̃_N^(2)(s)} in the first iteration)
- Le_12(u_k) is to be determined from D1 after the first half-iteration and so need not be initialized
=======================
===== The nth iteration =====
D1:
for k = 1 : N
- get y_k = (y_k^s, y_k^{1p})
- compute γ_k(s′, s) from (19) for all allowable state transitions s′ → s (Le(u_k) in (19) is Le_21(u_{Pinv[k]}), the de-permuted extrinsic information from the previous D2 iteration)
- compute α̃_k(s) for all s using (14)
end
for k = N : −1 : 2
- compute β̃_{k−1}(s) for all s using (15)
end
for k = 1 : N
- compute Le_12(u_k) using

Le_12(u_k) = log [ Σ_{S+} α̃_{k−1}(s′) γ^e_k(s′, s) β̃_k(s) / Σ_{S−} α̃_{k−1}(s′) γ^e_k(s′, s) β̃_k(s) ]

end
D2:
for k = 1 : N
- get y_k = (y^s_{P[k]}, y_k^{2p})
- compute γ_k(s′, s) from (19) for all allowable state transitions s′ → s (u_k in (19) is set to the value of the encoder input which caused the transition s′ → s; Le(u_k) is Le_12(u_{P[k]}), the permuted extrinsic information from the previous D1 iteration; y_k^s is the permuted systematic value, y^s_{P[k]})
- compute α̃_k(s) for all s using (14)
end
for k = N : −1 : 2
- compute β̃_{k−1}(s) for all s using (15)
end
for k = 1 : N
- compute Le_21(u_k) using

Le_21(u_k) = log [ Σ_{S+} α̃_{k−1}(s′) γ^e_k(s′, s) β̃_k(s) / Σ_{S−} α̃_{k−1}(s′) γ^e_k(s′, s) β̃_k(s) ]

end
==========================
===== After the last iteration =====
for k = 1 : N
- compute

L(u_k) = Lc y_k^s + Le_21(u_{Pinv[k]}) + Le_12(u_k)

- if L(u_k) > 0
decide u_k = +1
else
decide u_k = −1
end
==========================
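The normalized forward (alpha) recursion invoked in the pseudo-code as equation (14) can be sketched as follows; this is an illustrative implementation, not the paper's code, and it assumes the branch metrics γ_k(s′, s) are supplied as one dictionary per trellis stage, keyed by the transition pair (s′, s).

```python
# Sketch: normalized forward recursion over a trellis with n_states states.
# gamma[k][(s_prev, s)] holds the branch metric for stage k (0-based),
# and the encoder is assumed to start in the zero state.
def forward_recursion(gamma, n_states, N):
    alpha = [[0.0] * n_states for _ in range(N + 1)]
    alpha[0][0] = 1.0                          # start in the zero state
    for k in range(1, N + 1):
        for (sp, s), g in gamma[k - 1].items():
            alpha[k][s] += g * alpha[k - 1][sp]    # sum over predecessors
        total = sum(alpha[k])
        alpha[k] = [a / total for a in alpha[k]]   # normalize (avoids underflow)
    return alpha
```

The normalization step is what distinguishes the stable α̃ quantities of the pseudo-code from raw forward probabilities; the backward (β̃) recursion of (15) is the mirror image, run from k = N down to 1.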
Acknowledgment
References
Fig. 3. Simulated turbo code performance: bit error rate P_b versus E_b/N_0 (dB) for N = 1000.
Fig. 1. The turbo encoder: two RSC encoders (each specified by g1(D) and g2(D)), an N-bit interleaver preceding RSC 2, and a puncturing mechanism acting on the systematic and parity outputs u, x1p, x2p.
Fig. 4. The iterative (turbo) decoder: MAP Decoder 1 and MAP Decoder 2 exchange extrinsic information Le12 and Le21 through N-bit interleavers and a de-interleaver, operating on the received values ys, y1p, and y2p.
Fig. 2. An encoder realization for the RSC code with (g1, g2) = (31, 27): feedback taps (g11, g12 = 0, g13 = 0, g14) and feedforward taps (g20, g21 = 0, g22, g23, g24) generate the parity pk from the input uk.