Sylow Subgroups
Victor, you should check out Sylow's paper. It's in Math. Annalen 5 (1872), 584--594. I am looking at it as I write this. He states Cauchy's theorem in the first sentence and then says "This important theorem is contained in another more general theorem: if the order is divisible by a prime power then the group contains a subgroup of that size." (In particular, notice Sylow's literal first theorem is more general than the traditional formulation.) Thus he was perhaps in part inspired by knowledge of Cauchy's theorem.
Sylow also includes in his paper a theorem of Mathieu on transitive groups acting on sets of prime-power order (see p. 590), which is given a new proof by the work in this paper. Theorems like Mathieu's may have led him to investigate subgroups of prime-power order in a general finite group (of substitutions).
The Sylow theorems are finite group analogues of a bunch of results about "maximal unipotent subgroups" in algebraic groups. Basically, the Sylow subgroups play a role analogous to the role played by the maximal unipotent subgroups.
In the case where the group is the general linear group, a maximal unipotent subgroup can be taken to be the group of upper triangular matrices with 1s on the diagonal, for instance. There are existence, conjugacy, and domination results for these, analogous to the existence, conjugacy, and domination parts of Sylow's theorems: maximal unipotents exist, every unipotent is contained in a maximal unipotent, all maximal unipotents are conjugate. The role analogous to "order" is now played by "dimension".
The normalizer of the Sylow subgroup plays the role of the maximal connected solvable subgroup, also called the Borel subgroup (see Borel fixed-point theorem and Lie-Kolchin theorem). In the case of the general linear group, this is the group of upper triangular invertible matrices.
There are similar results for Lie algebras too, basically arising from Engel's theorem and Lie's theorem.
In fact, much of the study of simple groups and their geometry relies on this geometric interpretation of Sylow subgroups, p-subgroups, and their normalizers. This deeper study of the geometry/combinatorics of simple groups is called local analysis in group theory and is closely related to the recently popular topic of "fusion systems", which essentially study the conjugation action of a group on the subgroups of a particular Sylow subgroup.
ADDED BASED ON COMMENT BELOW: For a finite field $F_q$ where $q$ is a power of $p$, a maximal unipotent subgroup of $GL_n(F_q)$ is a $p$-Sylow subgroup. I had originally intended to mention this, but forgot.
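One way to see this is to compare orders. Writing $U_n(F_q)$ for the group of upper triangular matrices with 1s on the diagonal (the notation $U_n$ is introduced here only for this computation), the standard count of ordered bases gives

$$|GL_n(F_q)| = \prod_{i=0}^{n-1} (q^n - q^i) = q^{n(n-1)/2} \prod_{i=1}^{n} (q^i - 1), \qquad |U_n(F_q)| = q^{n(n-1)/2}.$$

Each factor $q^i - 1$ is congruent to $-1$ modulo $p$, hence prime to $p$, so $q^{n(n-1)/2}$ is the full power of $p$ dividing $|GL_n(F_q)|$ and $U_n(F_q)$ has exactly that order.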
An extension of Vipul's ideas can be found in the following article (I couldn't find a link to the PDF with Google):
Subgroup complexes by Peter Webb, pp. 349-365 in: ed. P. Fong, The Arcata Conference on Representations of Finite Groups, AMS Proceedings of Symposia in Pure Mathematics 47 (1987).
But as Mariano already commented, the analogy to the maximal unipotent subgroups of the general linear group was probably not Sylow's motivation. As commented before, he was maybe looking for maximal $p$-subgroups (i.e., maximal with respect to being a $p$-subgroup).
This is also the leitmotif of my favorite proof of the Sylow theorems, given by Michael Aschbacher in his book Finite Group Theory. It is based on Cauchy’s theorem (best proved using J. H. McKay’s trick of letting $Z_p$ act, by rotating the entries, on the set of all $(x_1, \dots, x_p) \in G^p$ whose product is $1$) and goes essentially like this:
The group $G$ acts by conjugation on the set $\mathrm{Syl}_p(G)$ of its maximal $p$-subgroups. Let $\Omega$ be a (nontrivial) orbit with $S\in\Omega$. If $P$ is a fixed point of the action restricted to $S$, then $S$ normalizes $P$, so $PS=SP$ is a $p$-group. Hence $P=S$ by maximality of both $P$ and $S$, and $S$ has a unique fixed point, namely $S$ itself. As $S$ is a $p$-group, all its orbits have size $1$ or a multiple of $p$; in particular $|\mathrm{Syl}_p(G)| \equiv 1 \pmod p$. Every orbit of $G$ is a disjoint union of orbits of $S$, so $|\Omega| \equiv 1 \pmod p$ and $|\Omega'| \equiv 0 \pmod p$ for every other orbit $\Omega'$ of $G$. Since $\Omega$ was an arbitrary nontrivial orbit of $G$, the same argument applied to $\Omega'$ would give $|\Omega'| \equiv 1 \pmod p$, so no other orbit exists: $\Omega = \mathrm{Syl}_p(G)$, and the action of $G$ is transitive.

The stabilizer of $S$ in $G$ is its normalizer $N_G(S)$, and as the action is transitive, $|G:N_G(S)| = |\mathrm{Syl}_p(G)| \equiv 1 \pmod p$. It remains to show that $p$ does not divide $|N_G(S):S|=|N_G(S)/S|$. Otherwise, by Cauchy’s theorem there exists a nontrivial $p$-subgroup of $N_G(S)/S$, whose preimage under the projection $N_G(S) \to N_G(S)/S$ is a $p$-subgroup properly containing $S$, contradicting the maximality of $S$.
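Both ingredients can be sanity-checked by brute force on a small group. The following is only a sketch, not part of Aschbacher's proof: it takes $G = S_4$ with permutations stored as tuples, and every function name (`mckay_counts`, `maximal_p_subgroups`, and so on) is made up for this illustration. It also assumes that every subgroup is generated by at most two elements, which is true for $S_4$ but not for groups in general.

```python
from itertools import permutations, product

def compose(a, b):
    """Product of permutations given as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(a):
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

def mckay_counts(group, p):
    """Count tuples (x_1..x_p) with product 1, and those fixed by rotation."""
    identity = tuple(range(len(group[0])))
    total = fixed = 0
    for t in product(group, repeat=p):
        prod = identity
        for x in t:
            prod = compose(prod, x)
        if prod == identity:
            total += 1
            # A tuple is fixed by rotation exactly when all entries are equal.
            if all(x == t[0] for x in t):
                fixed += 1
    return total, fixed

def generated(gens, identity):
    """Closure of gens under multiplication (a subgroup, as G is finite)."""
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def is_power_of(n, p):
    while n % p == 0:
        n //= p
    return n == 1

def maximal_p_subgroups(group, p):
    """Maximal p-subgroups, ASSUMING all subgroups are 2-generated."""
    identity = tuple(range(len(group[0])))
    subgroups = {generated((a, b), identity) for a in group for b in group}
    p_subs = [H for H in subgroups if is_power_of(len(H), p)]
    return [H for H in p_subs if not any(H < K for K in p_subs)]

def conjugate(H, g):
    g_inv = inverse(g)
    return frozenset(compose(compose(g, h), g_inv) for h in H)

def is_transitive(group, sylows):
    """Does the conjugation orbit of one maximal p-subgroup contain them all?"""
    return {conjugate(sylows[0], g) for g in group} == set(sylows)

S4 = list(permutations(range(4)))
for p in (2, 3):
    total, fixed = mckay_counts(S4, p)
    syl = maximal_p_subgroups(S4, p)
    print(p, total, fixed, len(syl), is_transitive(S4, syl))
```

For $S_4$ this finds three maximal $2$-subgroups of order $8$ and four maximal $3$-subgroups of order $3$, each family forming a single conjugacy class, and in McKay's count the number of rotation-fixed tuples is divisible by $p$ because the total $|G|^{p-1}$ is.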