Is there a Faa di Bruno-like formula for composition of three functions?

The Faa di Bruno formula can be well reinterpreted in terms of trees, and then the generalization is obvious. Here's how:

First of all, the $k$th derivative of $f\circ g \circ h$ depends only on the first $k$ derivatives of $f,g,h$, evaluated at the appropriate spots. So we can without loss of generality identify smooth functions with their power series --- these should be appropriately centered, so we might as well assume that $f(0) = g(0) = h(0) = 0$. So the data of the "function" $f$ is the sequence $f^{(1)},f^{(2)},f^{(3)},\dots$ of derivatives at $0$. (Any combinatorial fact about derivatives can be checked by composing polynomials.)

Second, rather than trying to guess formulas by thinking only in terms of functions of a single variable, moving to vector land. So $f,g,h$ are functions (rather, formal power series) between different vector spaces. For the Faa di Bruno formula in this generality, see M. Hardy, "Combinatorics of Partial Derivatives", the electronic journal of combinatorics, vol. 13 (R1), 2006.

Third, as soon as you're working in vectors, you realize that if you thought $f$ was a smooth vector-valued function $V \to W$, then $f^{(n)}$ is a linear function $V^{\otimes n} \to W$. And then it's natural to draw it in the Penrose notation (R. Penrose, "Applications of negative dimensional tensors". In D.J.A. Welsh, editor, Combinatorial mathematics and its applications, pages 221–244, London, 1971. Mathematical Institute, Oxford, Academic Press) as a directed tree. You should label the $n$ incoming edges with the vector space $V$, the one outgoing edge with $W$, and the vertex with $f$ --- you don't need to label it with $f^{(n)}$ because the $n$ is counted by the number of incoming edges.

Fourth, work a la Feynman. Declare that a vertex with $n$ incoming and one outgoing edge is $f^{(n)}$. But when you evaluate it, you should divide by a symmetry factor, which counts the number of automorphisms of the diagram ($n!$). Put an $x$ label on each incoming edge. Then: $$ f(x) = \sum_\Gamma \frac{\Gamma}{|{\rm Aut} \Gamma|} $$ where the sum ranges over all diagrams $\Gamma$ that can be drawn with a single vertex, which is labeled by $f$:

            x        x   x      x  x  x
            |        |   |      |  |  |
            |        |   |       \ | /
        1   |      1  \ /     1   \|/
f(x) =  -   * f  + -   * f  + -    * f  + ...
        1   |      2   |      6    |
            |          |           |
            |          |           |

Ok, then the Faa di Bruno formula says the following:

$$ f(g(x)) = \sum_\Gamma \frac{\Gamma}{|{\rm Aut} \Gamma|} $$ where the sum ranges over all diagrams with two "rows": an $f$ on the bottom, and some $g$s on the top:

              x       x   x       x   x
              |       |   |       |   |
              |        \ /        |   |
           1  * g    1  * g    1  *   *
f(g(x)) =  -  |    + -  |    + -  |   |  +
           1  |      2  |      2   \ /
              * f       * f         * f
              |         |           |

             x x x     x   x   x       x  x  x
             | | |     |   |   |       |  |  |
              \|/       \ /   /        |  |  |
            1  * g    1  *g  *g     1  *g *g *g
          + -  |    + -  |   |   +  -   \ | /   + 
            6  |      2   \ /       6    \|/
               * f         *f             *f
               |           |              |

             x x x x     x x x   x    x x   x x    x x  x  x         x  x  x  x
             \ | | /     | | |   |    | |   | |    | |  |  |         |  |  |  |
              \\ //      | | |   |    | |   | |    | |  |  |         |  |  |  |   
               \V/        \|/   /      \|   |/      \|  |  |         |  |  |  |
             1  *g      1  *g  *g    1  *g  *g    1  *g *g *g     1  *g *g *g *g
          + --  |    +  -  |   |   + -  |   |   + -   \ | /    + --   \  \ /  /  +
            24  |       6   \ /      8   \ /      4    \|/       24     \ V /
                *f           *f           *f            *f                *f
                |            |            |             |                 |

          + ...

Here I've sorted the diagrams into the degree in $x$. The numerical factor counts the number of combinatorial automorphisms of each diagram (see e.g. the 1/8 and the 1/4), and I list every diagram once up to isomorphism. Which is to say that I'm really considering the whole groupoid of diagrams, and computing a "groupoid integral" in the sense of Baez and Dolan.

Anyway, the composition of three functions can again be understood as counting labeled graphs with symmetry.

                 x       x   x      x   x       x   x
                 |        \ /       |   |       |   |
                 *h        *h       *h  *h      *h  *h
              1  |      1  |      1  \ /      1 |   |
f(g(h(x))) =  -  *g  +  -  *g  +  -   *g   +  - *g  *g  + ...
              1  |      2  |      2   |       2  \ /
                 *f        *f         *f          *f
                 |         |          |           |

I should mention one more thing. The above sums of diagrams compute the whole power series, and so they're computing the $n$th derivatives of the composition divided by $n!$. For example, $$ \frac1{4!} (f\circ g)^{(4)} = \frac1{24} f^{(1)}g^{(4)} + \frac16 f^{(2)} g^{(3)} g^{(1)} + \frac18 f^{(2)} (g^{(2)})^2 + \frac14 f^{(3)} g^{(2)} (f^{(1)})^2 + \frac1{24} f^{(4)} (g^{(1)})^4 $$ If you want to evaluate the honest derivative, that's fine too: label the $x$s so that you can distinguish between them, and demand that isomorphisms of diagrams preserve the labellings. Then no diagrams have nontrivial automorphisms (the automorphism group of any diagram is a subgroup of $S_n$ acting on the $n$ $x$s), and it's still true that you should think in terms of the sum over all isomoprhism classes of diagrams.


I would like to provide an analytically explicit answer. For simplicity, just take f,g,h smooth functions from $\mathbb R$ into itself. Then we have \begin{align*} \frac{(h\circ g\circ f)^{(p)}}{p!}=\sum_{\substack{p_1+\dots+p_q=p\\q\ge 1, p_j\ge 1}} \frac{(h\circ g)^{(q)}}{q!} \prod_{1\le j\le q}\frac{f^{(p_j)}}{p_j!} \\ = \sum_{\substack{p_1+\dots+p_q=p\\q\ge 1, p_j\ge 1}} \sum_{\substack{q_1+\dots+q_r=q\\r\ge 1, q_k\ge 1}} \frac{h^{(r)}}{r!} \prod_{1\le k\le r}\frac{g^{(q_k)}}{q_k!} \prod_{1\le j\le q}\frac{f^{(p_j)}}{p_j!}, \end{align*} where of course the derivatives of $h$ (resp. $g$) are evaluated at $(g\circ f)(x)$ (resp. $f(x)$). Alphabetical order may help the reader to complete an expression for the higher order derivatives of the composition of $N$ functions.


Similarly to Theo's answer, I think the best way to write such combinatorial formulas is in terms of trees (rather than multiindices or even set partitions). I explained that in my article "Feynman diagrams in algebraic combinatorics" in Sém Lothar. Combin. 49 (2002/04), Art. B49c. See in particular Eq. 13 for composition of $n$ functions.