What is a cumulant really?
Cumulants have many other names depending on the context (statistics, quantum field theory, statistical mechanics, ...): semi-invariants, truncated correlation functions, connected correlation functions, Ursell functions... I would say that the $n$-th cumulant $\langle X_1,\ldots,X_n\rangle^{T}$ of random variables $X_1,\ldots,X_n$ measures the part of the interaction of the variables which is genuinely of $n$-body type. By interaction I mean the opposite of independence. Denoting the expectation by $\langle\cdot\rangle$ as in statistical mechanics, independence implies the factorization
$$ \langle X_1\cdots X_n\rangle=\langle X_1\rangle\cdots\langle X_n\rangle\ . $$
If the variables are Gaussian and centered, then for instance
$$ \langle X_1 X_2 X_3 X_4\rangle=\langle X_1 X_2\rangle\langle X_3 X_4\rangle +\langle X_1 X_3\rangle\langle X_2 X_4\rangle +\langle X_1 X_4\rangle\langle X_2 X_3\rangle\ , $$
so the lack of factorization is due to $2$-body interactions: namely the absence of factorization for $\langle X_i X_j\rangle$. The $4$-th cumulant, for variables with vanishing moments of odd order, would be the difference $\mathrm{LHS}-\mathrm{RHS}$ of the previous equation. Thus it measures the "interaction" between the four variables which is due to their conspiring all together, instead of being a consequence of conspiring in groups of two at a time. For higher cumulants, the idea is the same.
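To make the $4$-th cumulant statement concrete, here is a small numerical sketch (simulated centered, correlated Gaussians; the covariance matrix is just an arbitrary positive definite example): the Monte Carlo estimate of $\mathrm{LHS}-\mathrm{RHS}$ above vanishes up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)

# four centered, correlated Gaussian variables (an arbitrary positive definite covariance)
cov = np.array([[1.0, 0.5, 0.3, 0.2],
                [0.5, 1.0, 0.4, 0.1],
                [0.3, 0.4, 1.0, 0.6],
                [0.2, 0.1, 0.6, 1.0]])
x1, x2, x3, x4 = rng.multivariate_normal(np.zeros(4), cov, size=10**6).T

lhs = np.mean(x1 * x2 * x3 * x4)
rhs = (np.mean(x1 * x2) * np.mean(x3 * x4)
       + np.mean(x1 * x3) * np.mean(x2 * x4)
       + np.mean(x1 * x4) * np.mean(x2 * x3))

print(lhs - rhs)  # the 4th joint cumulant: ~0, up to Monte Carlo error, for Gaussians
```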
Cumulants are definitely related to connectedness. For instance, for variables whose joint probability density is a multiple of a Gaussian by a factor $\exp(-V)$, where $V$ is quartic, one can (at least formally) write moments as a sum of Feynman diagrams. Cumulants are given by similar sums, with the additional requirement that these diagrams or graphs must be connected.
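The same connected/disconnected dichotomy is visible at the purely combinatorial level, without any diagrams: joint moments are sums over all set partitions of products of joint cumulants, and Möbius inversion on the partition lattice isolates the "connected" part. A short sketch (the function names are mine, and the expectations are plain Monte Carlo averages):

```python
from math import factorial
import numpy as np

def partitions(s):
    """Yield every set partition of the tuple s as a list of blocks."""
    if not s:
        yield []
        return
    first, rest = s[0], s[1:]
    for part in partitions(rest):
        yield [[first]] + part                          # `first` in a block of its own
        for i in range(len(part)):                      # or merged into an existing block
            yield part[:i] + [part[i] + [first]] + part[i + 1:]

def joint_cumulant(samples):
    """Estimate the joint cumulant <X_1,...,X_n>^T from samples (a list of 1-D arrays), via
    kappa = sum over partitions pi of (-1)^(|pi|-1) (|pi|-1)! prod_{B in pi} E[prod_{i in B} X_i]."""
    idx = tuple(range(len(samples)))
    total = 0.0
    for part in partitions(idx):
        k = len(part)
        prod = 1.0
        for block in part:
            prod *= np.mean(np.prod([samples[i] for i in block], axis=0))
        total += (-1) ** (k - 1) * factorial(k - 1) * prod
    return total
```

For jointly Gaussian samples this estimate is, up to Monte Carlo error, zero as soon as $n\ge 3$, in line with only the pair "propagators" surviving.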
Some references:
Chapters 1 and 4 of the book "Path Integrals in Quantum Mechanics" by Zinn-Justin.
My article "Feynman diagrams in algebraic combinatorics". Although it applies to a somewhat different context, it explains the combinatorics of Feynman diagrams in terms of Joyal's theory of species.
The review "Feynman diagrams for pedestrians and mathematicians" by Polyak. However the discussion (e.g. in Section 4.5) of the issue of convergence does not give an accurate idea of this research area.
It might help to take a broader perspective: in some contexts (notably quantum optics) the emphasis is not on cumulants but on factorial cumulants, with generating function $h(t)=\log E(t^X)$. While cumulants tell you how close a distribution is to a normal distribution, the factorial cumulants tell you how close it is to a Poisson distribution (since factorial cumulants of order two and higher vanish for the Poisson distribution, just as cumulants of order three and higher vanish for the normal).
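For example, if $X$ is Poisson with mean $\lambda$, then
$$ E(t^X)=\sum_{k\ge 0}e^{-\lambda}\frac{(\lambda t)^k}{k!}=e^{\lambda(t-1)}\ , $$
so $h(t)=\lambda(t-1)$ is linear in $t$ and every factorial cumulant of order two or higher vanishes; nonzero higher factorial cumulants therefore quantify the departure from Poisson statistics.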
So I would think that any privileged role of cumulants is linked to the prevalence of normal distributions.
A nice question, with probably many possible answers. I'll give it a shot. I think three phenomena should be noted.
i) The cumulant generating function is the logarithm of the moment generating function $t \mapsto \mathbb E[e^{tX}]$, which is (up to a sign convention) the Laplace transform of the probability distribution. Uniqueness of Laplace transforms then tells you that the cumulant function can be used to fully characterize your probability distribution (and in particular properties like its connectivity or cohesion, whatever these might be). Since a probability distribution is essentially a measure, and it is often more convenient to work with functions, the Laplace transform is useful. As an example, all moments may be computed from the cumulant function, and probability distributions whose moments coincide are the same (under some extra conditions). The idea of transforming a probability distribution into a function is also exemplified by the Fourier transform of a probability distribution, i.e. the characteristic function $u \mapsto \mathbb E[e^{i u X} ]$, with $u \in \mathbb R$. For this transform there is the well-known result that pointwise convergence of characteristic functions is equivalent to weak convergence (narrow convergence, from the analysis point of view) of the corresponding probability measures. See [Williams, Probability with Martingales].
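As a small illustration of "all moments/cumulants may be computed from the cumulant function": differentiating the log of the moment generating function at zero yields the cumulants. A sympy sketch for the normal distribution, starting from the standard closed form of its moment generating function:

```python
import sympy as sp

t, mu, sigma = sp.symbols('t mu sigma', real=True)

# moment generating function of a normal N(mu, sigma^2) (standard closed form)
mgf = sp.exp(mu * t + sigma**2 * t**2 / 2)

# cumulant generating function = log of the moment generating function
cgf = sp.log(mgf)

# the n-th cumulant is the n-th derivative of the cgf at t = 0
cumulants = [sp.simplify(sp.diff(cgf, t, n).subs(t, 0)) for n in range(1, 6)]
print(cumulants)  # [mu, sigma**2, 0, 0, 0]: only the mean and the variance survive
```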
ii) Sums of independent random variables. Their probability distributions are given by convolutions, and thus hard to work with. In the Laplace/Fourier domain this difficulty disappears: the transform of the sum is the product of the transforms, so the cumulant generating functions, and hence the cumulants, of independent variables simply add.
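This additivity is easy to check numerically; a sketch with two arbitrary independent samples, using the fact that the third cumulant equals the third central moment:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
x = rng.exponential(scale=2.0, size=n)       # third cumulant of Exp(scale 2) is 2 * 2**3 = 16
y = rng.gamma(shape=3.0, scale=1.0, size=n)  # third cumulant of Gamma(3, 1) is 2 * 3 = 6

def third_cumulant(sample):
    # the third cumulant equals the third central moment
    return np.mean((sample - sample.mean()) ** 3)

print(third_cumulant(x) + third_cumulant(y))  # ~22
print(third_cumulant(x + y))                  # ~22 as well: cumulants of independent sums add
```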
iii) The soft-max principle. This idea plays a key role in large deviations theory. Note that $\frac 1 t \log \mathbb E[e^{t X}] \rightarrow \operatorname{ess\,sup} X$ as $t \rightarrow \infty$. Related terminology is the 'Laplace approximation of an integral' in physics (see here). Extensions of this idea, combined with a little convex optimization theory (in particular Legendre-Fenchel transforms), allow one to deduce estimates on the distribution of sums of (not necessarily independent) random variables. Consult e.g. the Gärtner-Ellis theorem in any textbook on large deviations theory (recommended are [Varadhan], [den Hollander] or [Dembo and Zeitouni]), or here. Again, this explains mostly why the cumulant is useful, but not really what it is.
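A toy numerical illustration of the soft-max limit (a rough sketch; for large $t$ the Monte Carlo average is dominated by the largest samples, so the estimate sits slightly below the exact value): for $X$ uniform on $[0,1]$, $\frac1t\log\mathbb E[e^{tX}]$ creeps up to $\operatorname{ess\,sup} X = 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=10**6)  # ess sup X = 1

for t in [1, 10, 100, 1000]:
    # (1/t) * log E[e^{tX}], computed with a log-sum-exp shift for numerical stability
    m = (t * x).max()
    print(t, (m + np.log(np.mean(np.exp(t * x - m)))) / t)
# the printed values increase toward 1 as t grows
```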
The somewhat disappointing summary is that, judging from the above observations, the cumulant generating function seems to be mostly a technical device. But a very useful one.
Hopefully somebody else has a suggestion on how the cumulant function may be given a more intuitive meaning, perhaps even related to your suggestion of the cumulant function measuring cohesion of probability measures. I would certainly be interested in such an explanation.