Connectivity of the Erdős–Rényi random graph
A nice question. Here's a strategy that occurs to me, though it could fail miserably.
The basic problem seems to be what you said about variance: the appearances of different spanning trees are far from independent, since it is possible to make local modifications to a spanning tree and get another one. (For example, if x is a leaf joined to y, which is joined only to z, then we can replace the path zyx by the path zxy.)
One way we might try to defeat this is to choose a random set $\Sigma$ of spanning trees, where each spanning tree is chosen independently with probability $\alpha^{n-1}$ for some carefully chosen $\alpha$ (which I imagine as a small negative power of $n$). Then the expected number of trees from $\Sigma$ in a $p$-random graph is $(\alpha p)^{n-1}n^{n-2}$, which is pretty large even when $p$ is pretty close to $n^{-1}$. But now we might expect that any two trees in $\Sigma$ are quite well-separated, so perhaps it is possible to get a decent estimate for the variance.
Actually, it's not clear to me what passing to the random set really achieves here: maybe a simpler method (but not wholly simple) is to work out the expected number of pairs of spanning trees by carefully classifying what they can look like. The hope would be that if you pick one tree at random, then the proportion of trees that overlap with it to any great extent is usually so small that the expected number of pairs is not significantly bigger than the square of the expected number of spanning trees. With $p=n^{-1+\epsilon}$ something like this might work, but you've probably already thought about this.
Here is an easy proof for $C>2$, and a fairly easy proof for $C>1$.
Define a cut of a graph $G$ to be a partition of the vertices of $G$ into two sets which are crossed by no edges. So a graph has a nontrivial cut if and only if it is disconnected. We will show that the expected number of cuts of $G$ goes to $0$, so that, with probability $1$, our graph is connected.
For any particular partition of the vertices of $G$ into two sets, of size $k$ and $n-k$, the probability that this partition is a cut is $(1-p)^{k(n-k)} \binom{n}{k}$. So the expected number of cuts is $$\sum_{k=1}^{n/2} (1-p)^{k(n-k)} \binom{n}{k}.$$ We only have to go up to half way, because we can always take $k$ to be the smaller half of the cut. (I'm going to be sloppy and write non-integer bounds for my summations, as I've done here. You can fix it, if you like.)
For $C>2$, we have the following crude bound $$\sum_{k=1}^{n/2} (1-p)^{k(n-k)} \binom{n}{k} \leq \sum_{k=1}^{n/2} e^{-p k(n-k)} n^k.$$
The $\log$ of the summand is $$-k (n-k) \frac{C \log n}{n} + k \log n = k \log n \left( 1-C(1-k/n) \right) \leq k \log n (1-C/2)$$
So, if $C>2$, we are bounded by $\sum_{k=1}^{n/2} e^{(1-C/2) k \log n}$. This is a geometric series, whose sum is easily seen to be bounded by a constant multiple of its leading term; namely $n^{(1-C/2)}$. So the sum goes to $0$ and we are done.
Now, what if $C>1$, but not as large as $2$? Let $a$ be a real number such that $1-C(1-a) < 0$. The preceeding argument shows that the contribution of the terms with $k<an$ is negligible. (If $C\leq 1$, there is no such number and this proof breaks.)
We now consider the remaining terms, and use a different crude bound: $$\sum_{k=an}^{n/2} (1-p)^{k(n-k)} \binom{n}{k} \leq \sum_{k=an}^{n/2} e^{-pk(n-k)} 2^n$$
Again, the log of the summand is $$-k(n-k) \frac{C \log n}{n} + n \log 2 = -(k/n) (1-k/n) C n \log n + n \log 2 \leq - a (1-a) C n \log n + n \log 2$$ This is the log of an individual summand; we have to add up $(1/2 -a) n < n$ of them. So the sum is bounded by $$n e^{-a(1-a) C n \log n + n \log 2} =e^{-a(1-a) C n \log n + n (\log 2+1)}.$$
The $n \log n$ overwhelms the $n$, so we are done.
Dear Mathew, this is not really an answer to your question but just a related matter. As you point out the expected number of trees in a random graph is 1 already when p=c/n and is very large when p=logn/n so the hope is that this can be used to show that with a large probability the random graph contains a tree. There is a collection of conjectured by Jeff Kahn and me trying to suggest a very general connection of this type. These conjectures are presented in the pape Thresholds and expectation thresholds by Kahn and me and are mentioned in this MO question. If true these conjectures will imply that the threshold for connectivity will be below logn/n (of course, we do not need it for this case...), and the proof will probably will at best be much much more complicated then existing proofs.
I should mention that the sharp threshold property which was proved by Erdos and Renyi for connectivity can be proved (with harder proofs) from more general principles: One is the Margulis-Talagrand theorem which applies to the threshold for random subgraphs of highly edge connected graphs and one is Friedgut's result which identify graph properties with coarse thresholds.