Find the probability that a word with 15 letters (selected from P,T,I,N) does not contain TINT

EDIT: I believe this answer is a slight simplification of the first answer, where we use the fact that we don't need to keep track of words that do contain TINT when we're trying to see how an $n+1$-letter word not containing TINT can be obtained from an $n$-letter one.

Let $a_n$ be the number of $n$-letter words not containing TINT and ending with TIN. Let $b_n$ be the number of $n$-letter words not containing TINT and ending with TI. And let $c_n$ be the number of $n$-letter words not containing TINT ending with T. Finally let $d_n$ be the number of $n$-letter words not containing TINT that don't fall in any of the above categories. Then we have \begin{align*} a_n = b_{n-1}, \quad b_n = c_{n-1}, \quad c_n = b_{n-1} + c_{n-1} + d_{n-1} \end{align*} and \begin{align*} d_n = 3a_{n-1} + 2b_{n-1} + 2c_{n-1} + 3d_{n-1} \end{align*} So if we have the column vector $(a_{n-1},b_{n-1},c_{n-1},d_{n-1})^T$, the column vector of the new guys $(a_n,b_n,c_n,d_n)^T$ is obtained by multiplying on the left by the matrix \begin{align*} A= \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 3 & 2 & 2 & 3 \end{pmatrix} \end{align*} For $n=1$, the vector is $v=(0,0,1,3)^T$. So we need to compute the sum of the entries of $A^{14}v$, which gives us the answer others have already given in the comments.


Let's name our alphabet $A := \{P, T, I, N\}$.

Let's look at the set $L := A^*$ of all possible words using the letters $P, T, I, N$. First, let's dissect it into five sets:

$$L_5 := \{w \in L : TINT\text{ is a substring of }w\},$$ $$L_4 := \{w \in L \smallsetminus L_5 : w\text{ ends with } TIN\},$$ $$L_3 := \{w \in L \smallsetminus L_5 : w\text{ ends with } TI\},$$ $$L_2 := \{w \in L \smallsetminus L_5 : w\text{ ends with } T\},$$ $$L_1 := L \setminus (L_1 \cup L_2 \cup L_3 \cup L_4).$$

Example: The word $PTTPTIN$ would be in $L_4$.

The nice thing about our five sets $L_1, \ldots, L_5$ is that for any $a \in A$, $w \in L_i$, the class $L_j \ni wa$ is completely determined by $i$ and $a$. For example: If $w \in L_3$, then $wN$ is always in $L_4$.

We give each state a corresponding state vector $$s_1=\begin{pmatrix}1\\ 0 \\0 \\0 \\0\end{pmatrix}, s_2=\begin{pmatrix}0\\ 1 \\0 \\0 \\0\end{pmatrix}, s_3=\begin{pmatrix}0\\ 0 \\1 \\0 \\0\end{pmatrix}, s_4=\begin{pmatrix}0\\ 0 \\0 \\1 \\0\end{pmatrix}, s_5=\begin{pmatrix}0\\ 0 \\0 \\0 \\1\end{pmatrix}$$

and denote by $s(w)$ the state vector of the state $w$ is in. For example $s(TPTI) = s_3$.

Suppose now we have a random word $W$ over $L$. We can describe the expected state of $W$ by $S(W) := \mathbb{E}(s(W))$, which is a convex combination of the state vectors, a vector whose $i$-th entry denotes the probability of $W$ being in $L_i$. Now suppose $l$ is a random letter independent from $W$, uniformly chosen from $A$.

Then $S(Wl) = B \cdot S(W)$, where

$$B ~:=~ \frac{1}{4}\begin{pmatrix}3 & 2 & 2 & 3 & \\ 1 & 1 & 1 & & \\ & 1 & & & \\ & & 1 & & \\ & & & 1 & 4\end{pmatrix}$$

Suppose you have $w \in L_3$, then $wl$ is in $L_2$ with probability $\frac{1}{4}$ (in case $l = T$), in $L_4$ with probability $\frac{1}{4}$ (in case $l = N$) and in $L_1$ with probability $\frac{1}{2}$ (in case $l = I$ or $l = P$). This corresponds to the third column of $B$. The other columns are constructed likewise.

What we have done here is constructed the transition matrix of a Markov chain with $5$ states. Each entry $b_{ji}$ of $B$ corresponds to the probability of going from state $L_i$ to state $L_j$ after adding a random letter.

Now consider what happens when we add 15 random letters $l_1, \ldots, l_{15}$ to the empty word $\epsilon$. We know that $S(\epsilon) = s_1$, and now we know that $S(\epsilon l_1\cdots l_{15}) = B^{15}S(\epsilon) = B^{15}s_1$. Now the probability that $\epsilon l_1\cdots l_{15}$ contains $TINT$ is exactly the fifth entry of $S(\epsilon l_1\cdots l_{15})$, which is $s_5^\top B^{15}s_1$, which is the bottom left entry of $B^{15}$. The bottom left entry corresponds to the probability of going from state $L_1$ to state $L_5$ after adding 15 random letters.

The probability that a randomly chosen 15-letter word $W \in A^{15}$ contains the substring $TINT$ is exactly the bottom left entry of $B^{15}$.

Wolfram Alpha says this number is exactly

$$\frac{49167017}{1073741824}.$$

The probability that $W$ does not contain $TINT$ is therefore

$$ 1 - \frac{49167017}{1073741824} = \frac{1024574807}{1073741824}.$$


The following Python script confirms the answer given by @McFry:

LENGTH = 15
LETTERS = 'PTIN'
SEQUENCE = 'TINT'

def func(word):
    if len(word) < LENGTH:
        return sum(func(word+letter) for letter in LETTERS)
    else:
        return SEQUENCE not in word

count = func('')
total = len(LETTERS)**LENGTH
print '{}/{}={}%'.format(count,total,100.0*count/total)

The output is $1024574807/1073741824\approx95.42\%$.