Importance of separability vs. second-countability
Separability can be used to study the Stone-Čech compactification of a countable discrete space. Recall that if $X$ is a discrete space, then the Stone-Čech compactification $\beta X$ of $X$ can be identified with the set of ultrafilters on $X$, with each point of $X$ corresponding to a principal ultrafilter. We can therefore use separability to prove facts about ultrafilters without mentioning ultrafilters.
First, we use separability to observe that $|\beta\mathbb{N}|=2^{2^{\aleph_{0}}}$. Since $\beta\mathbb{N}\subseteq P(P(\mathbb{N}))$ as the set of ultrafilters on $\mathbb{N}$, we have $|\beta\mathbb{N}|\leq|P(P(\mathbb{N}))|=2^{2^{\aleph_{0}}}$. For the other direction, let $I$ be a set of cardinality continuum. Since a product of at most continuum many separable spaces is separable, the product space $\{0,1\}^{I}$ is separable, so let $A\subseteq\{0,1\}^{I}$ be a countable dense subset and let $f:\mathbb{N}\rightarrow A$ be a surjective function. Because $\mathbb{N}$ is discrete and $\{0,1\}^{I}$ is compact Hausdorff, the function $f$ extends to a continuous function $\overline{f}:\beta\mathbb{N}\rightarrow\{0,1\}^{I}$. The image $\overline{f}[\beta\mathbb{N}]$ is compact, hence closed in the Hausdorff space $\{0,1\}^{I}$, and it contains the dense set $A$, so $\overline{f}[\beta\mathbb{N}]=\{0,1\}^{I}$. Since $\overline{f}:\beta\mathbb{N}\rightarrow\{0,1\}^{I}$ is surjective, we have $2^{2^{\aleph_{0}}}=|\{0,1\}^{I}|\leq|\beta\mathbb{N}|$, so $|\beta\mathbb{N}|=2^{2^{\aleph_{0}}}$.
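To make the separability step concrete, here is a sketch of one standard countable dense subset, taking $I=\mathbb{R}$ (an assumption made only for illustration; any index set of cardinality continuum can be identified with $\mathbb{R}$). Let $D\subseteq\{0,1\}^{\mathbb{R}}$ be the set of indicator functions of finite (possibly empty) unions of open intervals with rational endpoints:

$$D=\{\chi_{J_{1}\cup\dots\cup J_{n}} : n\geq 0,\ J_{1},\dots,J_{n}\text{ open intervals with rational endpoints}\}.$$

The set $D$ is countable. A basic open subset of $\{0,1\}^{\mathbb{R}}$ is determined by finitely many distinct reals $r_{1},\dots,r_{m}$ and prescribed values $\varepsilon_{1},\dots,\varepsilon_{m}\in\{0,1\}$:

$$U=\{h\in\{0,1\}^{\mathbb{R}} : h(r_{k})=\varepsilon_{k}\text{ for }k=1,\dots,m\}.$$

Choose pairwise disjoint open intervals $J_{k}\ni r_{k}$ with rational endpoints. Then the indicator function of $\bigcup\{J_{k} : \varepsilon_{k}=1\}$ belongs to $D\cap U$, so $D$ is dense.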
Separability may also be used to prove facts about the Rudin-Keisler ordering. The Rudin-Keisler ordering is the preordering $\leq_{RK}$ on the class of ultrafilters defined as follows: if $\mathcal{U}\in\beta X$ and $\mathcal{V}\in\beta Y$ are ultrafilters, then $\mathcal{U}\leq_{RK}\mathcal{V}$ if and only if there is a continuous $f:\beta Y\rightarrow\beta X$ with $f(\mathcal{V})=\mathcal{U}$ and $f[Y]\subseteq X$. The motivation for this notion is that the Rudin-Keisler ordering measures the size of an ultrapower. More precisely, $\mathcal{U}\leq_{RK}\mathcal{V}$ if and only if $\mathcal{A}^{\mathcal{U}}$ is elementarily embeddable in $\mathcal{A}^{\mathcal{V}}$ for each first order structure $\mathcal{A}$.
In particular, the Rudin-Keisler ordering restricts to a preordering on $\beta\mathbb{N}$. One can use separability to show that every subset of $\beta\mathbb{N}$ of size at most continuum has an upper bound in $\beta\mathbb{N}$. Assume that $I$ is an index set of cardinality at most continuum and $x_{i}\in\beta\mathbb{N}$ for $i\in I$. Then $\mathbb{N}^{I}$ is separable since a product of at most continuum many separable spaces is separable, so there is a countable dense subset $A\subseteq\mathbb{N}^{I}$. Let $f:\mathbb{N}\rightarrow A$ be a surjective function. Then $f$ extends to a unique continuous function $\overline{f}:\beta\mathbb{N}\rightarrow(\beta\mathbb{N})^{I}$. The function $\overline{f}$ is surjective: since $\mathbb{N}$ is dense in $\beta\mathbb{N}$, the set $\mathbb{N}^{I}$ is dense in $(\beta\mathbb{N})^{I}$, so $A$ is dense in $(\beta\mathbb{N})^{I}$, and the image $\overline{f}[\beta\mathbb{N}]$ is a compact, hence closed, subset of $(\beta\mathbb{N})^{I}$ containing $A$. So there is some $x\in\beta\mathbb{N}$ with $\overline{f}(x)=(x_{i})_{i\in I}$. If $\pi_{i}:(\beta\mathbb{N})^{I}\rightarrow\beta\mathbb{N}$ is the projection mapping, then $\pi_{i}\overline{f}$ is continuous, maps $\mathbb{N}$ into $\mathbb{N}$ because $f[\mathbb{N}]\subseteq\mathbb{N}^{I}$, and satisfies $\pi_{i}\overline{f}(x)=x_{i}$ for $i\in I$, so $x_{i}\leq_{RK}x$ for $i\in I$.
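In ultrafilter language, the last step can be unpacked as follows (a sketch; the name $g_{i}$ is introduced here only for illustration). Write $g_{i}:\mathbb{N}\rightarrow\mathbb{N}$, $g_{i}(n)=f(n)_{i}$, for the $i$-th coordinate of $f$. Then $\pi_{i}\overline{f}$ is the unique continuous extension of $g_{i}$ to $\beta\mathbb{N}$, and viewing $x$ and $x_{i}$ as ultrafilters on $\mathbb{N}$ we get

$$x_{i}=\pi_{i}\overline{f}(x)=\{B\subseteq\mathbb{N} : g_{i}^{-1}[B]\in x\}.$$

So the single ultrafilter $x$, together with the countably-valued maps $g_{i}$, recovers every $x_{i}$ as a pushforward, which is the usual combinatorial form of the statement $x_{i}\leq_{RK}x$.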
It should be noted that there are very similar proofs of the above two results using independent sets and independent partitions; the proofs using independent sets and independent partitions are essentially the same proof (see Andreas Blass's comment below). Furthermore, the two above results can be generalized to larger cardinals with the same proofs. In particular, if $X$ is an infinite discrete space, then $|\beta X|=2^{2^{|X|}}$. Furthermore, every subset of $\beta X$ of cardinality at most $2^{|X|}$ has an upper bound in $\beta X$. To prove these facts, one uses a generalization of the notion of separability called the density of a space, and the generalized proof is very similar to the original proof. If I remember correctly, the book The Theory of Ultrafilters by Comfort and Negrepontis also gives generalizations of these facts to large cardinals such as compact cardinals.
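As a sketch of how the generalized cardinality argument goes, assume $X$ is an infinite discrete space with $|X|=\kappa$, and write $d(Y)$ for the density of a space $Y$, that is, the least cardinality of a dense subset of $Y$ (so $Y$ is separable exactly when $d(Y)\leq\aleph_{0}$). The Hewitt-Marczewski-Pondiczery theorem says that a product of at most $2^{\kappa}$ spaces, each of density at most $\kappa$, has density at most $\kappa$. Hence $d(\{0,1\}^{2^{\kappa}})\leq\kappa$, a surjection from $X$ onto a dense subset extends to a continuous surjection $\beta X\rightarrow\{0,1\}^{2^{\kappa}}$, and

$$2^{2^{\kappa}}=|\{0,1\}^{2^{\kappa}}|\leq|\beta X|\leq|P(P(X))|=2^{2^{\kappa}}.$$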
I hope the above results clear up any confusion about the importance of separability in non-metrizable spaces.
An arbitrary product of separable spaces satisfies Suslin's condition (i.e. any pairwise disjoint family of nonempty open sets is countable). I find this result remarkable since separability is not preserved under (large) products while Suslin's condition might or might not be preserved under (even finite) products, depending on the underlying axioms of set theory.
I second the idea that second countability is more fundamental than separability: topologies are defined in terms of open sets, not points, and second countability is the natural "countability" condition on the family of open sets. It just says that the topology is countably generated.
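Important theorems where separability is crucial: the basic example is that a continuous image of any separable space is separable. By contrast, even a quotient of a second countable space need not be second countable. For example, the quotient of $\mathbb{R}$ obtained by collapsing $\mathbb{Z}$ to a single point is not even first countable.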
Is the popularity of the word/concept of "separability" just due to the special case of metric spaces? Yes, I think so. But that's a pretty important special case! I just finished writing a book on measure theory and functional analysis, and I found that by restricting attention to separable Banach spaces and their duals I was able to get by just fine without mentioning generalized convergence (nets/filters). For instance, the weak* topology is metrizable on the unit ball of the dual of a separable Banach space, which is good enough for most purposes by the Krein-Smulian theorem.