What is the homotopy theory of categories?
I am not knowledgeable enough to have much to say that I have not written in my answer to a previous question of yours, and I think that David Roberts's answer (or, rather immodestly, my previous one) provides what you were looking for as regards your first question. Just a few additional small points:
Pursuing Stacks is not a letter. See Tim Porter's comment.
As regards Grothendieck's opinion of Thomason's model structure, I do not know. Actually, I am unsure he knew of Thomason's model structure when writing Pursuing Stacks [EDIT: see Tim Porter's comment below]. What he knew for sure was that the localization of $Cat$ with respect to classical weak equivalences (functors between small categories whose nerves are simplicial weak equivalences) is equivalent to the classical homotopy category. The first proof is due to Quillen, and Illusie "wrote the details" (his words) in his thesis. (And there is a considerably simpler proof, by the way.) Model structures crop up in Pursuing Stacks at some point, but I am pretty sure the idea is not developed in the beginning, which is much more concerned with mere models for homotopy types. Here is a citation from Chapter 75: "the notion of asphericity structure — which, together with the closely related notion of contractibility structure, tentatively dealt with before, and the various "test notions" (e.g. test categories and test functors) seems to me the main payoff so far of our effort to come to a grasp of a general formalism of "homotopy models"." (Beware: these asphericity structures are not what Maltsiniotis called "asphericity structures" in his own work.)
Another fact Grothendieck knew was, of course, Quillen's Theorem A. It seems he did not write a detailed proof of the relative version, but he gave a sketch of a toposic proof of it, though, and took it as an axiom for what he called basic localizer.
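For reference, here is a sketch of the two statements in their usual modern form. The notation is an assumption of this sketch, not Grothendieck's: $W$ is the class of classical weak equivalences in $Cat$, $N$ is the nerve, and for a functor $u : A \to B$ and an object $b$ of $B$, $u/b$ is the comma category of pairs $(a, u(a) \to b)$.

$$\text{(Theorem A)} \qquad N(u/b) \simeq * \ \text{ for every object } b \text{ of } B \quad \Longrightarrow \quad u \in W.$$

For the relative version, take functors $u : A \to B$, $v : A \to C$, $w : B \to C$ with $w \circ u = v$, and write $A/c = v/c$, $B/c = w/c$, and $u/c : A/c \to B/c$ for the functor induced by $u$:

$$\text{(relative Theorem A)} \qquad u/c \in W \ \text{ for every object } c \text{ of } C \quad \Longrightarrow \quad u \in W.$$

Taking $C = B$ and $w = 1_B$, and using that $B/b$ has a terminal object, recovers the absolute statement from the relative one.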
As for your second question, I do not know, but it seems to me that Grothendieck was not that interested in simplicial sets and thus did not work extensively with them. In a 1991 letter to Thomason, he wrote (translated from the French): "On the other hand, for me the "original paradise" for topological algebra is by no means the everlasting semi-simplicial category ∆∧, however useful it may be, and still less that of topological spaces (both of which map into the 2-category of toposes, which is a kind of common envelope for them), but rather the category Cat of small categories, seen with a geometer's eye through the astonishingly rich body of intuition coming from toposes. Indeed, the toposes whose categories of sheaves of sets are the C∧, with C in Cat, are by far the simplest of the known toposes, and it is because I sensed this that I insist so much on the example of these ("categorical") toposes in SGA 4 IV". (See here.)
To conclude, let me mention that, if one takes Grothendieck's viewpoint of homotopical algebra, there should exist not only a homotopy theory of categories, but a homotopy theory of $n$-categories. In this respect, there should be a "relative Theorem A" for every $n$, which should allow one to define a workable notion of "basic $n$-localizer". (Actually, this is already done for $n=2$: see this paper by Bullejos and Cegarra for Theorem A.) And then one should work out a theory of test $n$-categories, whose $(n-1)\text{-}Cat$-valued presheaves should be models for homotopy types, and so on. To sum up, what Grothendieck wanted to do amounts to giving new foundations for homotopical algebra, and this is still a work in progress.
David Roberts gives the two most useful available references in his answer. If you want to read Grothendieck's words (and in English), just wait for the upcoming annotated version of Pursuing Stacks.
EDIT (2013/10/29): Rereading this answer, I realize that I should add something of which I was not aware at the time of writing, still regarding Grothendieck's knowledge of Thomason's model category structure (see also Tim Porter's comment and David Roberts's answer). An annotated version of section 69 of Pursuing Stacks is available at http://www.math.jussieu.fr/~maltsin/groth/ps/ps-69.pdf. On page 4, Grothendieck writes that "it appears very doubtful still that (Cat) is a “model category” in Quillen’s sense, in any reasonable way (with W of course as the set of “weak equivalences”)". Thus, he was not aware of the existence of Thomason's structure then. See also note 6 on that same page: Grothendieck learnt of the existence of Thomason's model structure between the writing of Sections 69 and 87.
This is just to answer your first question. The second one I don't know about.
The homotopy theory of categories is not quite as you envisage it. Really, Grothendieck is thinking of the Thomason model structure on $Cat$ (the category of small categories), which is Quillen equivalent to the Quillen model structure on $sSet$ via the nerve functor. Then Grothendieck considered pairs $(Cat,W)$ where $W$ is a class of functors acting as weak equivalences. Such a class he called a basic localizer (nLab). Grothendieck conjectured, and Cisinski proved, that the class of weak equivalences in the Thomason model structure is the smallest basic localizer.
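To make "acting as weak equivalences" concrete, here is a sketch of the axioms for a basic localizer as I recall them from Maltsiniotis's text; the labels (L1) to (L3) are mine, $e$ denotes the terminal category, and the references should be consulted for the precise formulation. A class $W$ of functors between small categories is a basic localizer if:

$$\text{(L1)} \quad W \text{ contains the identities, satisfies 2-out-of-3, and } \ r \circ i = 1,\ i \circ r \in W \ \Longrightarrow\ r \in W;$$

$$\text{(L2)} \quad A \text{ has a terminal object} \ \Longrightarrow\ (A \to e) \in W;$$

$$\text{(L3)} \quad \text{for } u : A \to B \text{ over } C: \quad u/c : A/c \to B/c \ \text{ in } W \text{ for every object } c \text{ of } C \ \Longrightarrow\ u \in W.$$

Condition (L3) is the relative form of Quillen's Theorem A, taken as an axiom. In this language, the minimality result says that the functors whose nerves are simplicial weak equivalences form the intersection of all classes $W$ satisfying (L1) to (L3).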
From there Grothendieck moved to considering pairs $(C,W)$ for any category $C$ and class $W$ of arrows such that $C[W^{-1}]$ was equivalent to the homotopy category of CW-complexes, or even the homotopy category of some basic localizer, and in particular he was interested in when $C = Pre(S) = Cat(S^{op},Set)$, presheaves on some small category $S$. In particular, we know that $S=\Delta$ can be used to recover the homotopy theory of CW-complexes. The question was to characterise those $S$ such that $(Pre(S),W')$, where $W'$ was inherited from a basic localizer (consult Cisinski's or Maltsiniotis' work for details), can be used to model the same homotopy types as $Cat$. Such categories $S$ were called [weak/strict] test categories.
D. Cisinski, Les préfaisceaux comme modèles des types d’homotopie, Astérisque 308 (2006)
and
G. Maltsiniotis, La théorie de l’homotopie de Grothendieck, Astérisque, 301 (2005)
are central resources in this area.
The first question has already been answered by David Roberts and Jonathan Chiche. Let me address the second one. It's not reasonable to expect that such a model structure exists. We can ask instead whether there is a model structure on simplicial sets in which nerves of fibrant categories (in Thomason's model structure) are fibrant. And, in fact, such a model structure does exist. We can just transfer the Quillen model structure on simplicial sets along the double subdivision functor. Note that Thomason's model structure can now be transferred from this new model structure. Also, this model structure on simplicial sets presents the homotopy theory of spaces, since the weak equivalences in it are just the ordinary weak equivalences.
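Here is a sketch of what the transfer amounts to, assuming it is the right transfer along Kan's adjunction $Sd^2 \dashv Ex^2$ (the functor $Ex$ and this reading are my assumptions, not spelled out above). A map $f$ of simplicial sets is declared a weak equivalence or a fibration in the new structure according to

$$f \ \text{is a weak equivalence (resp. fibration)} \iff Ex^2 f \ \text{is a weak equivalence (resp. Kan fibration)}.$$

Thomason's structure is defined in the same way, with $N$ the nerve:

$$F \ \text{is a weak equivalence (resp. fibration) in } Cat \iff Ex^2 N F \ \text{is a weak equivalence (resp. Kan fibration)}.$$

Comparing the two conditions for the map to the terminal object gives, for a small category $C$,

$$N C \ \text{fibrant in the new structure} \iff Ex^2 N C \ \text{is a Kan complex} \iff C \ \text{fibrant in Thomason's structure}.$$

Since the natural map $X \to Ex\, X$ is a weak equivalence, $Ex^2 f$ is a weak equivalence exactly when $f$ is, which is why the weak equivalences do not change.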
Now, just for completeness, let me answer the question in the post. Of course, there are a lot of model structures on simplicial sets in which the nerves of categories are fibrant. For example, we can take the left Bousfield localization of the Joyal model structure with respect to the maps $\Delta^n \amalg_{\Lambda^n_k} \Delta^n \to \Delta^n$. The trivial model structure also satisfies the conditions. These model structures do not present the homotopy theory of spaces. If we want to keep the class of weak equivalences the same as in the Quillen model structure, then I can show that there is no model structure in which nerves of categories are fibrant. Actually, I will prove a stronger statement:
If there is a model structure on simplicial sets in which $\Delta^1$ is fibrant and contractible, then its homotopy category is thin.
First, note that if $f : X \to Y$ is a weak equivalence between fibrant objects and $A$ is cofibrant, then $f$ has the weak right lifting property with respect to $A$. That is, for every map $g : A \to Y$, there is a map $g' : A \to X$ such that $f \circ g'$ is homotopic to $g$.
Now, let $A$ be a cofibrant simplicial set in the hypothetical model structure. Consider the inclusion of the left endpoint $f : \Delta^0 \to \Delta^1$. It is a weak equivalence between fibrant objects by the assumptions. Let $g : A \to \Delta^1$ be the constant map at the other endpoint. Then the observation in the previous paragraph implies that $g$ is homotopic to $g'$, the constant map at the left endpoint. Let $A \amalg A \to C(A)$ be a cylinder object for $A$. The previous observation implies that there are no 1-simplices in $C(A)$ between the two components (if there were one, then $g$ and $g'$ could not be homotopic).
Thus, any cylinder object $A \amalg A \to C(A)$ is of the form $s_1 \amalg s_2 : A \amalg A \to A_1 \amalg A_2$ for some maps $s_1 : A \to A_1$ and $s_2 : A \to A_2$. Moreover, there are retractions $r_1 : A_1 \to A$ and $r_2 : A_2 \to A$ of $s_1$ and $s_2$, respectively. Thus, two maps $f,g : A \to B$ are homotopic if and only if $f$ factors through $s_1$ and $g$ factors through $s_2$. But, since $s_1$ and $s_2$ have retractions, all maps factor through them, so any two maps are homotopic. This implies that any two parallel maps in the homotopy category are equal.
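The last step can be spelled out as follows; $H$ is a name introduced here for the homotopy, and $[x, y]$ denotes the map out of a coproduct with components $x$ and $y$. Given any $f, g : A \to B$, set

$$H = [\, f \circ r_1,\ g \circ r_2 \,] : A_1 \amalg A_2 \to B.$$

Then, using $r_1 \circ s_1 = 1_A$ and $r_2 \circ s_2 = 1_A$,

$$H \circ (s_1 \amalg s_2) = [\, f \circ r_1 \circ s_1,\ g \circ r_2 \circ s_2 \,] = [\, f,\ g \,] : A \amalg A \to B,$$

so $H$ is a left homotopy from $f$ to $g$ through the cylinder $A_1 \amalg A_2$.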