What is the trellis diagram for a linear block code?

This question is studied very extensively in Alex Vardy's chapter in Handbook of Coding Theory. I describe the idea below.

To get a trellis for a linear block code $C$ we need to define something like a time axis for the positions of the individual symbols of all the codewords. So basically an ordering of the symbols: this is the first symbol, this is the second,..., this is the last. Later on we may want to change the ordering, but let's fix one for now.

The trellis collects information about which beginnings of codewords can be matched with which endings in such a way that the end result will be a valid codeword. In the case of a linear code this translates into the following algebraic formulation.

Fix a point $i,0<i<n,$ on the time axis. Consider the so called initially trivial subspace $I_i(C)$ and finally trivial subspace $F_i(C)$ of $C$ defined by $$ I_i(C)=\{\vec{c}=(c_1,c_2,\ldots,c_n)\in C\mid c_j=0\ \text{for all}\ j\le i\}. $$ and $$ F_i(C)=\{\vec{c}=(c_1,c_2,\ldots,c_n)\in C\mid c_j=0\ \text{for all}\ j> i\}. $$ Let's extend these definitions by declaring that $F_0(C)=I_n(C)=\{0\}$ and $F_n(C)=I_0(C)=C$. The key observations are that:

  1. The sum of subspaces $I_i(C)+F_i(C)$ is direct, i.e. they intersect trivially.
  2. If $\vec{c}\in C$, $\vec{c}_I\in I_i(C)$, $\vec{c}_F\in F_i(C)$ are arbitrary, then the vector $\vec{c}+\vec{c}_I$ is a legal codeword that agrees with $\vec{c}$ up to and including the component $c_i$, and $\vec{c}+\vec{c}_F$ is a legal codeword that agrees with $\vec{c}$ from that point on, i.e. at positions $>i$. Furthermore, all the codewords with these properties are gotten in this way, because if two codewords agree up to and including (resp. from this point on), then their difference is in the initially trivial (resp. finally trivial) subcode.

So we can partition the code $C$ into cosets of the subspace $I_i(C)+F_i(C)$. Next we select a sequence of split points $0=i_0<i_1<i_2<\cdots<i_k=n$. Note that we usually will not use all the points as split points (that is $k<n$). With this data at hand we build a trellis for $C$ as follows. At split point $i_t$ we have the state-space $$V_t(C)=C/(I_{i_t}(C)\oplus F_{i_t}(C).$$ We observe $V_0(C)$ and $V_k(C)$ are both trivial 0-dimensional spaces. These will give us a unique initial and final state respectively.

The trellis diagram is a directed graph. It has a vertex for each ordered pair $(t,a)$, with $0\le t\le k$ and $a\in V_t(C).$ For each interval $t,t+1$, we define the subspace $T_t=I_{i_{t+1}}(C)\oplus F_{i_t}(C)$. We then define a set of edges connecting the vertex $(t,a)$ and the vertex $(t+1,a')$. These are in a bijective correspondence with cosets $\vec{c}+T_t$ such that $\vec{c}\in a\cap a'$. The number of such edges is thus $|a\cap a'|/|T_t|.$ The edges of the graph are labelled, and the above edge is labelled with the sequence of symbols $c_{i_t+1},c_{i_t+2},\ldots,c_{i_{t+1}}$. Observe that 1) if $a\cap a'=\emptyset$ there will be no edges from $(t,a)$ to $(t+1,a')$, 2) all the codewords in the coset $\vec{c}+T_t$ share the same sequence as labels.

What this amounts to is that for each codeword $\vec{c}\in C$ there is a unique path from $(0,V_0(C))$ to $(k,V_k(C))$ such that at each split point $i_t$ the path passes thru the vertex $(t,\vec{c}+I_{i_t}(C)\oplus F_{i_t}(C))$, and that it follows the edge associated with $\vec{c}+T_t$ between those two vertices. We can read the components of the codeword $\vec{c}$ by concatenating the labels of the edges along the path. Furthermore, it is not to hard to show that all the paths through the trellis represent valid codewords.

We can then decode using the Viterbi algorithm taking advantage of partial codewords shared by many words – exactly as in the case of convolutional codes.

The catch is that the number of either states or edges may easily be prohibitively large. For example, if systematic encoding is used with a $k$-dimensional code, then there are no shared sequences of $k$ first symbols among any codewords (because all those symbols are payload, and can be freely chosen by the user). This means that at such early split points the initially trivial subspaces are all trivial, and consequently the state space may be very large (up to $k$-dimensional, possibly smaller).

We can try and play the game of permuting the symbol positions (using non-systematic encoding). This is kind of a "black art", and it is not guaranteed to reduce the size of the trellis significantly. Therefore this is not a magic tool allowing us to efficiently decode any linear code with the Viterbi algorithm.

An example

The four section trellis of (16,11,4) extended Hamming code

Sorry about the residual garbage on the background. The figure shows a trellis of the extended binary linear Hamming code of length 16, rank 11 and minimum distance 4. The split points are at $i_0=0,i_1=4,i_2=8,i_3=12$ and $i_4=16$. The code is defined by the parity check matrix $$ H=\pmatrix{1111&1111&1111&1111\cr 1111&1111&0000&0000\cr 1111&0000&1111&0000\cr 1100&1100&1100&1100\cr 1010&1010&1010&1010\cr}. $$ As in all trellises the time axis runs from left to right. The vertices associated with a given split point are stacked on top of each other. Here all the edges are actually double edges, i.e. two parallel edges going from one vertex to another, the labels of the two parallel edges are always bitwise complements of each other. The finally trivial subcodes are spanned by $$ F_4=\langle 1111 0000 0000 0000\rangle, $$ $$ F_8=F_4\oplus \langle 0000 1111 0000 0000, 0011 0011 0000 0000, 0101 0101 0000 0000\rangle, $$ $$ F_{12}=F_8\oplus\langle 0000 0000 1111 0000, 1100 1010 0110 0000, 1010 0110 1100 0000\rangle, $$ and have respective dimensions 1,4,7. The initially trivial subcodes are $$ I_{12}=\langle 0000 0000 0000 1111\rangle, $$ $$ I_8=I_{12}\oplus\langle 0000 0000 1111 0000, 0000 0000 1100 1100, 0000 0000 1010 1010\rangle, $$ $$ I_4=I_8\oplus\langle 0000 1111 0000 0000, 0000 0110 0101 0011, 0000 0011 0110 0101\rangle, $$ with complementary dimension such that $\dim I_i\oplus F_i=8$ for all $i\in\{4,8,12\}$. The words of this code are of the form $(a|b|c|d|)$, where each letter stands for a vector in $\mathbb{F}_2^4$, i.e. a group of 4 bits. The constraints are: $a+b+c+d$ must be either $0000$ or $1111$. The parities of the weights of the groups are either all odd or all even.