Could someone explain rough path theory? More specifically, what is the higher-order "area process" and what information does it give us?
I'm not an expert on this, but I did some research and found this: Topics in Gaussian rough paths theory. I suggest you read it from the beginning, paying special attention to pages 5 and 6.
Of all the books and papers I found on Google, this paper gives the most comprehensible explanation of why you need those iterated integrals. They occur naturally when you construct the solution to an SDE by Picard's method. They also provide the necessary extra information that lets you define certain stochastic integrals.
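To make the Picard remark concrete, here is a rough sketch for a linear equation. The example is my own choice, not taken from the paper. Assume $dY_t = \sum_i A_i Y_t\, dX^i_t$ with $Y_0 = y$, where the $A_i$ are constant matrices and $X$ is smooth enough for the integrals to make sense classically. Starting from $Y^{(0)}_t = y$, two Picard steps give

$$Y^{(1)}_t = y + \sum_i A_i\, y \int_0^t dX^i_u,$$

$$Y^{(2)}_t = y + \sum_j A_j\, y \int_0^t dX^j_u + \sum_{i,j} A_j A_i\, y \int_0^t\!\!\int_0^u dX^i_v\, dX^j_u.$$

Each further step adds iterated integrals of one order higher, so the solution is a series in the iterated integrals of $X$. When $X$ is rough, the second-order term is the first one that has no classical meaning and has to be supplied separately.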
I will cite some things from the above paper to back this up:
The key insight of Lyons is that the path alone does not provide enough information in order to build up a satisfactory integration theory, but the path together with some extra information which compensates its roughness does. That this extra information is indeed encoded in the iterated integrals (which, we repeat, have to be defined since they are not intrinsically given in the path) can be seen, for example, from numerical considerations
This is on page 5.
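A standard toy example shows why the path alone is not enough. It is not from the quoted paper, and the names `levy_area` and `small_loops` are mine. Take a circle of radius $1/n$ traversed $n^2$ times: the path shrinks to a point as $n$ grows, but the area it sweeps out stays near $\pi$. A minimal sketch, assuming NumPy and a piecewise-linear sampling of the circle:

```python
import numpy as np

def levy_area(points):
    """Signed area between a piecewise-linear planar path and its chord."""
    points = np.asarray(points, dtype=float)
    rel = points - points[0]  # positions relative to the starting point
    # Half the sum of cross products of consecutive relative positions.
    cross = rel[:-1, 0] * rel[1:, 1] - rel[:-1, 1] * rel[1:, 0]
    return 0.5 * cross.sum()

def small_loops(n, samples_per_loop=64):
    """Circle of radius 1/n traversed n**2 times on the time interval [0, 1]."""
    t = np.linspace(0.0, 1.0, n * n * samples_per_loop + 1)
    angle = 2.0 * np.pi * n * n * t
    return np.column_stack([np.cos(angle), np.sin(angle)]) / n

results = []
for n in (1, 4, 16):
    path = small_loops(n)
    # Largest distance of the path from its starting point.
    sup_dist = np.max(np.linalg.norm(path - path[0], axis=1))
    area = levy_area(path)
    results.append((n, sup_dist, area))
    print(n, sup_dist, area)
```

The distance column goes to zero while the area column does not, so a limit taken on the path alone loses the area.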
He begins with the definition of iterated integrals in an abstract setting. Of course, they cannot be just the limit of Riemann sums, but are defined to be objects which behave like iterated integrals in an algebraic and analytic way. One could say that these objects mimic the iterated integrals of a path x
That abstract setting is what makes Lyons's work difficult to understand. Here is the book, by the way: Book
There are other papers which give other interpretations e.g.
Rough path analysis : An Introduction
The iterated integral X naturally appears when we approximate the integral I using Taylor’s expansion
Rough path theory
Look at page 7. There is also a note on Chen's relation:
Such objects are defined by algebraic axioms called Chen (or multiplicative) and shuffle (or geometric) properties, which always hold true for the iterated integrals of a smooth path
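Both properties can be checked numerically at the second level. This is a minimal sketch under my own assumptions: the path is piecewise linear, so the iterated integrals can be computed exactly, and `signature_level2` is a hypothetical helper name. At level two, Chen's relation reads $\mathbb{X}_{s,t} = \mathbb{X}_{s,u} + \mathbb{X}_{u,t} + X_{s,u} \otimes X_{u,t}$, and the shuffle property reads $\mathbb{X}^{ij}_{s,t} + \mathbb{X}^{ji}_{s,t} = X^i_{s,t} X^j_{s,t}$.

```python
import numpy as np

def signature_level2(points):
    """Increment and second iterated integral of a piecewise-linear path."""
    points = np.asarray(points, dtype=float)
    increments = np.diff(points, axis=0)   # Delta_k = x_{k+1} - x_k
    offsets = points[:-1] - points[0]      # x_k - x_0
    level1 = points[-1] - points[0]
    # On segment k the integral of (X_u - x_0) (x) dX_u equals
    # (x_k - x_0) (x) Delta_k + 0.5 * Delta_k (x) Delta_k.
    level2 = offsets.T @ increments + 0.5 * increments.T @ increments
    return level1, level2

rng = np.random.default_rng(0)
path = np.cumsum(rng.normal(size=(200, 2)), axis=0)
m = 80  # index of the intermediate time u

x1_full, x2_full = signature_level2(path)
x1_a, x2_a = signature_level2(path[: m + 1])   # piece on [s, u]
x1_b, x2_b = signature_level2(path[m:])        # piece on [u, t]

chen_ok = np.allclose(x2_full, x2_a + x2_b + np.outer(x1_a, x1_b))
shuffle_ok = np.allclose(x2_full + x2_full.T, np.outer(x1_full, x1_full))
print(chen_ok, shuffle_ok)
```

The antisymmetric part `0.5 * (x2_full - x2_full.T)` is what the shuffle property leaves undetermined, and that is the area.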
Differential Equations Driven by Rough Paths Front matter
The iterated integrals of X over an interval [s, t] are thus extremely efficient statistics of X, in the sense that they determine very accurately the response of any linear system driven by X
This is on page XII of the Introduction; also note the Picard iteration just above it.
System Control and Rough Paths
The basic viewpoint of this book lies in the interpretation of the differential dX. The fundamental idea is that the ‘full differential’ dX is the collection of all iterated path integrals, namely...
DIFFERENTIAL EQUATIONS DRIVEN BY ROUGH PATHS: AN APPROACH VIA DISCRETE APPROXIMATION
His approach is to suppose these latter integrals are given, subject to natural consistency conditions, and then to develop an integration theory which suffices to treat the differential equation
Concerning Chen's relation, from Lyons's books you'll see that it arises naturally with the introduction of a Lie algebra for those iterated integrals.
Here is also something concerning the computational aspect of the theory:
A Distributed Procedure for Computing Stochastic Expansions with Mathematica
And Mathematica programs to play with: Christophe Ladroue
I suggest starting with Lyons's books. Refreshing the basics of stochastic analysis would also help a lot. It would also be great if Mr. Lyons answered this question :)
I know this is not worth the bounty but you got me interested.
UPDATE:
Here is an exposition by Lyons, explaining the iterated integral area process (p. 118): The interpretation and solution of ODE driven by rough signals
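For reference, here is the usual formula behind the words "area process". The notation is my own, so treat the symbols as assumptions and not as the notation of the linked exposition. For a smooth path $X = (X^1, X^2)$ write $\mathbb{X}^{ij}_{s,t} = \int_s^t (X^i_u - X^i_s)\, dX^j_u$. Integration by parts gives

$$\mathbb{X}^{ij}_{s,t} = \tfrac12\,(X^i_t - X^i_s)(X^j_t - X^j_s) + A^{ij}_{s,t}, \qquad A^{ij}_{s,t} = \tfrac12\left(\mathbb{X}^{ij}_{s,t} - \mathbb{X}^{ji}_{s,t}\right).$$

The symmetric part is determined by the increments of the path. The antisymmetric part $A^{12}_{s,t}$ is the signed area enclosed between the path on $[s,t]$ and the chord joining its endpoints, and it is the new information that the increments do not contain.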
UPDATE2:
Here is a great book for seeing how iterated integrals emerge from Taylor expansions of ODE solutions: Taylor Approximations. Note that most of these authors examine linear ODEs with constant coefficients (the operator $L$ is linear), so it is easy to get iterated integrals. Look at Lyons's book (Book), page 13 of the PDF. For linear ODEs with variable coefficients, look at Christophe Ladroue's paper that I posted above. In that case you use integration by parts to compute powers etc. of iterated integrals (page 3, eq. 8).
Consider an equation like \begin{equation}\tag{1}dY_t = f(Y_t) dW_t\end{equation} where $Y_t$ is an unknown function and $W_t$ is a continuous, but not differentiable, function.
If $W_t$ is Brownian motion then there is a classical theory (Itô calculus) for how to understand (1). Brownian sample paths almost surely have finite $p$-variation only for $p > 2$. As you may know, Brownian motion on the interval $[0,T]$ has quadratic variation equal to $T$.
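Here is a quick numerical illustration of the role of $p$. It is a sketch under my own assumptions: a simulated Brownian path on a uniform grid, and sums along that fixed grid instead of the supremum over all partitions that defines $p$-variation. The helper name `variation_sum` is mine.

```python
import numpy as np

def variation_sum(path, p):
    """Sum of |increment|**p along the sampled path (one fixed partition)."""
    return np.sum(np.abs(np.diff(path)) ** p)

T = 1.0
n_steps = 2**16
rng = np.random.default_rng(42)
increments = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)
brownian = np.concatenate([[0.0], np.cumsum(increments)])

# The same path sampled on a coarser grid (2**10 steps).
coarse = brownian[:: 2**6]

results = {}
for p in (1, 2, 3):
    results[p] = (variation_sum(coarse, p), variation_sum(brownian, p))
    print(p, results[p])
```

As the grid is refined, the $p = 1$ sum grows, the $p = 2$ sum stays close to $T$, and the $p = 3$ sum shrinks.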
What if $W_t$ is not Brownian motion, but just some function with finite $p$-variation for some other $p$? Then the approach is to just fix a path of $W_t$ (so it's not a random process anymore, just a single non-differentiable function). Then Lyons shows how to make sense of (1) in terms of $$\tag{2}\int W_s dW_s$$ We don't know what (2) means, so we write down some axioms for what (2) should be like (those are Chen's relations, I guess), and use that as a starting point for the theory.
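Roughly, this is how (2) enters. The notation is my own and is not copied from the linked review, so treat it as an assumption. For a vector-valued $W$, write $\mathbb{W}^{jk}_{s,t}$ for the quantity playing the role of $\int_s^t (W^j_u - W^j_s)\, dW^k_u$. A second-order Taylor expansion of (1) over a short interval $[s,t]$ suggests

$$Y^i_t \approx Y^i_s + \sum_j f^i_j(Y_s)\,(W^j_t - W^j_s) + \sum_{j,k,l} \partial_l f^i_k(Y_s)\, f^l_j(Y_s)\, \mathbb{W}^{jk}_{s,t}.$$

When $W$ has finite $p$-variation with $p < 3$ and $f$ is smooth enough, the error in this step is small enough that the steps can be summed over a partition. So once $\mathbb{W}$ is given, the equation can be solved.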
(This is based on my reading of the first page of http://www.maths.ed.ac.uk/~adavie/rpathrev.pdf)