Is the category of categories a topos?
A (Grothendieck) 2-topos (à la Street) is a 2-category equivalent to the 2-category of 2-sheaves on a 2-site. Such 2-sheaves are 2-functors from a small 2-category to the 2-category of small categories. A 2-sheaf is defined analogously to a sheaf, but the definition takes a few more steps: the limit in question is more complex, both because we are working in a 2-category and because there are three levels of separation between presheaf and sheaf, instead of two as in the 1-categorical case.
Then Cat is simply the 2-category of 2-sheaves on a point with its canonical topology. Note that this says nothing about the underlying 1-category.
I thought I would add here an elementary proof that $\mathbf{Cat}$ has no subobject classifier. (This would be as a complement to the other answers which redirect to an MO post, which further redirect to literature references.)
(I'm also posting this as a community wiki since it has minimal changes from my answer to the corresponding question for the category of groupoids, so it didn't seem appropriate to get additional reputation points from this copy.)
So, first note that the object functor $\operatorname{Ob} : \mathbf{Cat} \to \mathbf{Set}$ is representable by the category $1$ with one object and one (identity) morphism. Similarly, the arrows functor $\operatorname{Arr} : \mathbf{Cat} \to \mathbf{Set}$ is representable by the category $A$ with two objects $s,t$, and three morphisms (one identity morphism $s \to s$, one morphism $s \to t$, one identity morphism $t \to t$) with the unique composition. In particular, given any monomorphism $F : C \to D$ in $\mathbf{Cat}$, this implies that $\operatorname{Ob}(F)$ and $\operatorname{Arr}(F)$ are injective functions, from which it is straightforward to conclude that $F$ is a composition of an isomorphism from $C$ to a subcategory of $D$ with the inclusion functor into $D$. This implies that $\mathbf{Cat}$ is well-powered, with $\operatorname{Sub}(C)$ being the set of subcategories of $C$.
Now, suppose that $\mathbf{Cat}$ had a subobject classifier $\Omega$. Then we would have to have: $$\operatorname{Ob}(\Omega) \simeq \operatorname{Hom}(1, \Omega) \simeq \operatorname{Sub}(1) = \{ 1, \emptyset \}$$ and $$\operatorname{Arr}(\Omega) \simeq \operatorname{Hom}(A, \Omega) \simeq \operatorname{Sub}(A) = \{ A, A_d, \{ s \}, \{ t \}, \emptyset \}.$$ Here $A_d$ represents the subcategory with objects $s$ and $t$, and only the identity morphisms.
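The two counts above can be checked by brute force. The following is only a minimal sketch: the names `OBJECTS`, `MORPHISMS` and `subcategories` are made up for illustration, and the enumeration relies on the assumption that the category has no nontrivial composites (true for $1$ and $A$, not in general).

```python
from itertools import combinations

# The arrow category A: objects s, t; morphisms id_s, id_t, and f : s -> t.
# (Hypothetical encoding: each morphism name maps to its (source, target).)
OBJECTS = ("s", "t")
MORPHISMS = {"id_s": ("s", "s"), "id_t": ("t", "t"), "f": ("s", "t")}


def subcategories(objects, morphisms):
    """List subcategories as (object set, non-identity morphism set) pairs.

    Assumption: the category has no nontrivial composites, so any choice of
    non-identity morphisms between the chosen objects is closed under
    composition. Identities are implied by the chosen objects.
    """
    non_identity = {m: st for m, st in morphisms.items() if not m.startswith("id_")}
    result = []
    for k in range(len(objects) + 1):
        for obs in combinations(objects, k):
            obs = frozenset(obs)
            # Non-identity morphisms whose source and target are both chosen.
            available = [m for m, (a, b) in non_identity.items()
                         if a in obs and b in obs]
            for j in range(len(available) + 1):
                for ms in combinations(available, j):
                    result.append((obs, frozenset(ms)))
    return result


subs_of_1 = subcategories(("*",), {"id_*": ("*", "*")})
subs_of_A = subcategories(OBJECTS, MORPHISMS)
print(len(subs_of_1), len(subs_of_A))  # prints: 2 5
```

The five entries for $A$ are $\emptyset$, $\{ s \}$, $\{ t \}$, $A_d$ (both objects, no $f$) and $A$ itself.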
Also, the source and target morphisms $s, t : \operatorname{Arr} \to \operatorname{Ob}$ are induced by the functors $1 \to A$ corresponding to $s, t \in \operatorname{Ob}(A)$, respectively. From this, we can see that: $$\begin{aligned} A &\in \operatorname{Hom}_{\Omega}(1, 1), \\ A_d &\in \operatorname{Hom}_{\Omega}(1, 1), \\ \{ s \} &\in \operatorname{Hom}_{\Omega}(1, \emptyset), \\ \{ t \} &\in \operatorname{Hom}_{\Omega}(\emptyset, 1), \\ \emptyset &\in \operatorname{Hom}_{\Omega}(\emptyset, \emptyset). \end{aligned}$$
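To see where these hom-sets come from: the source (respectively target) of an arrow of $\Omega$ is obtained by pulling the corresponding subcategory of $A$ back along the functor $1 \to A$ that picks out $s$ (respectively $t$). Here is a small sketch of that bookkeeping, where the dictionary `SUBCATEGORY_OBJECTS` and the helper `endpoint` are illustrative names only.

```python
# Object sets of the five subcategories of A, named as in the text.
# (Hypothetical encoding; A and A_d have the same objects and differ only
# in whether the non-identity morphism s -> t is present.)
SUBCATEGORY_OBJECTS = {
    "A": {"s", "t"},
    "A_d": {"s", "t"},
    "{s}": {"s"},
    "{t}": {"t"},
    "empty": set(),
}


def endpoint(sub_name, obj):
    """Pull the subcategory back along the functor 1 -> A picking `obj`.

    The pullback is the subobject 1 of 1 if `obj` lies in the subcategory,
    and the empty subobject otherwise.
    """
    return "1" if obj in SUBCATEGORY_OBJECTS[sub_name] else "empty"


# Group the arrows of Omega by (source, target).
hom_sets = {}
for name in SUBCATEGORY_OBJECTS:
    key = (endpoint(name, "s"), endpoint(name, "t"))
    hom_sets.setdefault(key, []).append(name)

for (src, tgt), arrows in hom_sets.items():
    print(f"Hom({src}, {tgt}) = {arrows}")
```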
We now consider the category corresponding to the poset $\{ 0, 1, 2 \}$ with the induced order, and the full subcategory on the objects $\{ 0, 2 \}$, which contains the morphism $0 \le 2$. We then want to find a morphism $F : \{ 0, 1, 2 \} \to \Omega$ such that $\{ 0, 2 \}$ is the pullback of $\top : 1 \to \Omega$ along $F$ (where $\top$ must correspond to the object $1$ of $\Omega$). We must have $F(0) = 1, F(1) = \emptyset, F(2) = 1$, so $F(0 \le 1) = \{ s \}$ and $F(1 \le 2) = \{ t \}$. Thus, by functoriality, $F(0 \le 2) = \{ t \} \circ \{ s \}$.
But if we repeat the same argument with the subcategory on the objects $\{ 0, 2 \}$ that has no morphism $0 \to 2$, we find that the corresponding functor $F' : \{ 0, 1, 2 \} \to \Omega$ takes the same values as $F$ on objects and on the morphisms $0 \le 1$ and $1 \le 2$, and hence also on the composite $0 \le 2$, so $F' = F$. This contradicts the fact that $F'$ and $F$ must give different pullbacks.
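The clash can be made concrete by computing, for each of the two subcategories, the element of $\operatorname{Arr}(\Omega)$ that each arrow of $\{ 0, 1, 2 \}$ is forced to hit (the pullback of the subcategory along the functor from $A$ picking out that arrow). This is just a sketch with made-up names (`classify_arrow`, `full`, `discrete`); it does not construct $\Omega$, it only records the forced values.

```python
def classify_arrow(arrow, sub_objects, sub_arrows):
    """Forced value in Arr(Omega) for `arrow` = (a, b) of the poset category.

    `sub_objects` are the objects of the subcategory and `sub_arrows` its
    non-identity morphisms. Labels follow the text: A, A_d, {s}, {t}, empty.
    """
    a, b = arrow
    if a in sub_objects and b in sub_objects:
        return "A" if arrow in sub_arrows else "A_d"
    if a in sub_objects:
        return "{s}"
    if b in sub_objects:
        return "{t}"
    return "empty"


# Two different subcategories on the objects {0, 2}.
full = ({0, 2}, {(0, 2)})      # contains the morphism 0 <= 2
discrete = ({0, 2}, set())     # identities only

generators = [(0, 1), (1, 2)]
on_generators_full = [classify_arrow(g, *full) for g in generators]
on_generators_discrete = [classify_arrow(g, *discrete) for g in generators]

composite_full = classify_arrow((0, 2), *full)
composite_discrete = classify_arrow((0, 2), *discrete)

print(on_generators_full, on_generators_discrete)
print(composite_full, composite_discrete)
```

Both classifying functors are forced to agree on $0 \le 1$ and $1 \le 2$, so functoriality gives them the same value $\{ t \} \circ \{ s \}$ on $0 \le 2$, yet the two subcategories require different values there ($A$ versus $A_d$).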
Note that the construction of $\Omega$ above gives a perfectly good subobject classifier for the category of directed graphs. (In fact, this category is equivalent to the category of functors $AOst \to \mathbf{Set}$ where $AOst$ is the category with two objects $A,O$ generated by two morphisms $s, t : A \to O$, so it is a topos.) Given a subgraph $G'$ of $G$, to get the corresponding morphism $G \to \Omega$, we map objects of $G$ to 1 if they are in $G'$ and to $\emptyset$ otherwise. As for the edges of $G$: if the source or target is not in $G'$, then the edge also cannot be in $G'$; correspondingly, the relevant $\operatorname{Hom}_{\Omega}$ set has exactly one element, and the edge is sent to it. Otherwise, if the source and target are both in $G'$, we send the edge to $A$ if the edge is in $G'$, and to $A_d$ otherwise.
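As an illustration of the recipe for graphs, here is a minimal sketch that builds the classifying map of a subgraph and then recovers the subgraph as the pullback of $\top$ (the vertices sent to $1$ and the edges sent to $A$). The function names and the example graph are invented for this sketch.

```python
def characteristic_map(vertices, edges, sub_vertices, sub_edges):
    """Classifying map G -> Omega of a subgraph, following the recipe above.

    `edges` maps an edge name to its (source, target); parallel edges and
    loops are allowed. Returns the map on vertices and the map on edges.
    """
    on_vertices = {v: ("1" if v in sub_vertices else "empty") for v in vertices}
    on_edges = {}
    for name, (a, b) in edges.items():
        if a in sub_vertices and b in sub_vertices:
            on_edges[name] = "A" if name in sub_edges else "A_d"
        elif a in sub_vertices:
            on_edges[name] = "{s}"    # only the source is in the subgraph
        elif b in sub_vertices:
            on_edges[name] = "{t}"    # only the target is in the subgraph
        else:
            on_edges[name] = "empty"  # neither endpoint is in the subgraph
    return on_vertices, on_edges


def pullback_of_true(on_vertices, on_edges):
    """Recover the subgraph: vertices sent to 1 and edges sent to A."""
    return ({v for v, x in on_vertices.items() if x == "1"},
            {e for e, x in on_edges.items() if x == "A"})


# Example graph with two parallel edges a -> b and a loop at c.
vertices = {"a", "b", "c"}
edges = {"e1": ("a", "b"), "e2": ("a", "b"), "e3": ("b", "c"), "e4": ("c", "c")}
sub_vertices, sub_edges = {"a", "b"}, {"e1"}

chi_v, chi_e = characteristic_map(vertices, edges, sub_vertices, sub_edges)
recovered = pullback_of_true(chi_v, chi_e)
print(chi_e)
print(recovered == (sub_vertices, sub_edges))
```

Unlike in $\mathbf{Cat}$, there is no composition to respect here, so the value on each edge can be chosen independently and the round trip always succeeds.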
What the argument above says is essentially this: if we try to extend the construction to categories, the problem is that if neither $f$ nor $g$ is in the subcategory and neither is the common target of $f$ and source of $g$, but the source of $f$ and the target of $g$ are, then that is not enough information to determine whether $g \circ f$ is in the subcategory. However, if there were a subcategory classifier $\Omega$, it would have to agree with the subgraph classifier at the graph level, and then the composition law within that $\Omega$ would have to determine that $g \circ f$ is either always in the subcategory or never in it. (And we could make a similar argument for the case where the common target of $f$ and source of $g$ is in the subcategory.)