Looking for a complete exposition of the Burali-Forti paradox
I have now found a textbook that provides a complete proof of the Burali-Forti paradox without making use of von Neumann's definition of ordinals: "Basic Set Theory" by Azriel Levy. Before providing von Neumann's definition, he works just on the assumption that some "order types" can be defined such that the order types of two well-ordered sets are identical iff the sets are order-isomorphic. Based only on this assumption, and, importantly for my concern, without using the Axiom of Foundation, he shows that the class of ordinals cannot be a set.
So, an awkward admission. I've never actually read a basic intro to ZFC, nor taken a course on the subject. So, while I suspect this can all be found in any basic text, I don't know which one to refer you to.
But it isn't hard to do all of this by hand, just tedious. I'll get you to the point of showing that ordinals, by your definition, are totally ordered. I think that's the part which is most different from the von Neumann ordinal case. The rest is not too much more difficult, and I assume someone will be recommending a textbook soon anyway.
We'll write $X \preceq Y$ if $X$ and $Y$ are well ordered sets and there is an order preserving bijection between $X$ and an initial segment of $Y$.
Lemma 1 If $X$ and $Y$ are well ordered sets, there is at most one order preserving injection $X \to Y$ whose image is an initial segment.
Proof: Suppose there were two, call them $\phi_1$ and $\phi_2$. Since they are not the same, there is some smallest $x \in X$ such that $\phi_1(x) \neq \phi_2(x)$ (using that $X$ is well ordered). Let $Y' = \{ \phi_1(x'): x' < x \}$; by minimality of $x$, this is also $\{ \phi_2(x'): x' < x \}$, which is what justifies the WLOG below. Since $\phi_1(x) \not \in Y'$, the set $Y \setminus Y'$ is not empty; let its least member be $y$. Then either $\phi_1(x)$ or $\phi_2(x)$ is not $y$; say WLOG $\phi_1(x) \neq y$. Since $\phi_1(x) \not \in Y'$, we deduce that $\phi_1(x) > y$. Now, consider any $x' \in X$. If $x' < x$, then $\phi_1(x') \in Y'$ and $\phi_1(x') \neq y$; if $x' \geq x$ then $\phi_1(x') \geq \phi_1(x) > y$. So $y$ is not in the image of $\phi_1$, but $\phi_1(x) >y$ is. This contradicts that the image of $\phi_1$ is an initial segment. QED
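For finite well orders, Lemma 1 can be sanity-checked by brute force. The following is only a sketch under the assumption that $X$ and $Y$ are finite and given as increasingly sorted Python lists; the names `is_initial_segment` and `initial_embeddings` are made up for this illustration. It says nothing about the infinite case, which is where the proof above does the real work.

```python
from itertools import product

def is_initial_segment(image, Y):
    """True if `image` is a prefix of Y (Y is listed in increasing order)."""
    image = set(image)
    # A subset with k elements is downward closed iff it contains
    # the k smallest elements of Y.
    return all(y in image for y in Y[:len(image)])

def initial_embeddings(X, Y):
    """List every order preserving injection X -> Y with initial-segment image.

    X and Y are assumed to be finite lists sorted in increasing order.
    """
    found = []
    for values in product(Y, repeat=len(X)):
        # Strictly increasing values means order preserving and injective.
        strictly_increasing = all(a < b for a, b in zip(values, values[1:]))
        if strictly_increasing and is_initial_segment(values, Y):
            found.append(dict(zip(X, values)))
    return found

# Lemma 1 predicts that this list never has more than one entry.
print(initial_embeddings([0, 1, 2], [10, 20, 30, 40]))
```

When $X$ is longer than $Y$ the list is empty, and otherwise it contains exactly one map, namely the one sending the $i$-th element of $X$ to the $i$-th element of $Y$.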
We'll write $X \preceq_{\phi} Y$ to mean that $\phi$ is an order preserving map $X \to Y$ whose image is an initial segment. So $X \preceq Y$ if and only if $X \preceq_{\phi} Y$ for some $\phi$.
Corollary: If $X \preceq Y$ and $Y \preceq X$ then $X$ and $Y$ are isomorphic posets.
Proof: Let $X \preceq_{\phi} Y$ and $Y \preceq_{\psi} X$. Then $X \preceq_{\psi \circ \phi} X$. But also $X \preceq_{\mathrm{Id}} X$. So $\psi \circ \phi = \mathrm{Id}$. Similarly, $\phi \circ \psi = \mathrm{Id}$. So $\phi$ and $\psi$ are mutually inverse order preserving maps. QED
Thus, we see that the ordinals are partially ordered under $\preceq$. (Of course, expressing this concept takes us out of the language of ZFC, since it is a statement about classes.) We will next show that this partial order is total.
Prop 2 Let $X$ and $Y$ be well ordered sets. Then either $X \preceq Y$ or $Y \preceq X$.
Proof: Consider $X' := \{ x \in X : X_{\leq x} \preceq Y \}$.
Consider the following subset of $X' \times Y$:
$$\Phi = \{ (x,y) : \ \exists x' \in X,\ x \leq x',\ \exists \phi:\ X_{\leq x'} \preceq_{\phi} Y \ \mbox{and} \ y=\phi(x) \}$$
We claim that $\Phi$ is a function. In other words, we claim that, for each $x \in X'$, there is exactly one $y$ such that $(x, y) \in \Phi$. There is at least one such $y$ because we can take $x'=x$ and, by the definition of $X'$, there will be a map $\phi$ such that $X_{\leq x} \preceq_{\phi} Y$; take $y= \phi(x)$. To see that there is not more than one $y$ above $x$, suppose that there were $y_1$ and $y_2$. They would correspond to some $(x'_1, \phi_1)$, $(x'_2, \phi_2)$. (No axiom of choice here, I'm only making finitely many choices!) WLOG, say $x'_1 \leq x'_2$. Let $\phi'_2$ be the restriction of $\phi_2$ to $X_{\leq x'_1}$. Then $X_{\leq x'_1} \preceq_{\phi_1} Y$ and $X_{\leq x'_1} \preceq_{\phi'_2} Y$. So, by Lemma 1, $\phi_1=\phi'_2$. Then $\phi_1(x) = \phi'_2(x)$, which is to say, $y_1=y_2$.
So, $\Phi$ is a function. It is now easy to check (details left to you) that $X' \preceq_{\Phi} Y$. If $X=X'$, we are done. If not, let $Y' = \Phi(X')$. If $Y=Y'$, then $\Phi$ is injective and surjective, so its inverse is a function and we have $Y \preceq_{\Phi^{-1}} X$. If $X \neq X'$ and $Y \neq Y'$, then let $x$ and $y$ be the minimal elements of $X \setminus X'$ and $Y \setminus Y'$ (these exist since $X$ and $Y$ are well ordered). Define $\phi'$ on $X_{\leq x}$ to be $\Phi$ on $X_{<x} = X'$ and set $\phi'(x)=y$. Then $\phi'$ is easily checked to be order preserving and to have image an initial segment, so $X_{\leq x} \preceq Y$. This contradicts that we took $x$ not to be in $X'$, and we are done. QED
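To make the shape of this argument concrete, here is a small sketch of the same construction for finite well orders only, assuming $X$ and $Y$ are given as increasingly sorted Python lists. The function name `compare_well_orders` and the string labels it returns are invented for this illustration. In the finite case one can simply walk through $X$ and pair each element with the least unused element of $Y$; the proof above replaces that walk with the set $X'$ and the minimality argument, since an infinite well order cannot be traversed step by step.

```python
def compare_well_orders(X, Y):
    """Compare two finite well orders given as increasingly sorted lists.

    Returns ("X<=Y", phi) if X embeds onto an initial segment of Y,
    otherwise ("Y<=X", psi) with psi embedding Y onto an initial segment of X.
    """
    phi = {}                  # plays the role of Phi, defined so far on X'
    remaining_y = list(Y)     # plays the role of Y \ Y'
    for x in X:               # x is the least element of X \ X'
        if not remaining_y:
            # Y' = Y: Phi is a bijection X' -> Y, so invert it.
            psi = {y: x0 for x0, y in phi.items()}
            return "Y<=X", psi
        # Extend by sending x to the least element of Y \ Y'.
        phi[x] = remaining_y.pop(0)
    return "X<=Y", phi        # X' = X

print(compare_well_orders([1, 2, 3], ["a", "b"]))
print(compare_well_orders([1, 2], ["a", "b", "c"]))
```

The two branches of the function correspond to the cases $Y = Y'$ and $X = X'$ in the proof; the third case, where both are proper, is exactly the extension step inside the loop.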
At this point, we see that equivalence classes of well-ordered sets form a total order under $\preceq$.