Why did no student correctly find a pair of $2\times 2$ matrices with the same determinant and trace that are not similar?
If $A$ is a $2\times 2$ matrix with determinant $d$ and trace $t$, then the characteristic polynomial of $A$ is $x^2-tx+d$. If this polynomial has distinct roots (over $\mathbb{C}$), then $A$ has distinct eigenvalues and hence is diagonalizable (over $\mathbb{C}$). In particular, if $d$ and $t$ are such that the characteristic polynomial has distinct roots, then any other $B$ with the same determinant and trace is similar to $A$, since they are diagonalizable with the same eigenvalues.
So to give a correct example in part (2), you need $x^2-tx+d$ to have a double root, which happens only when the discriminant $t^2-4d$ is $0$. If you choose the matrix $A$ (or the values of $t$ and $d$) "at random" in any reasonable way, then $t^2-4d$ will usually not be $0$. (For instance, if you choose $A$'s entries uniformly from some interval, then $t^2-4d$ will be nonzero with probability $1$, since the vanishing set in $\mathbb{R}^n$ of any nonzero polynomial in $n$ variables has Lebesgue measure $0$.) Assuming that students did something like pick $A$ "at random" and then built $B$ to have the same trace and discriminant, this would explain why none of them found a correct example.
Note that this is very much special to $2\times 2$ matrices. In higher dimensions, the determinant and trace do not determine the characteristic polynomial (they just give two of the coefficients), and so if you pick two matrices with the same determinant and trace they will typically have different characteristic polynomials and not be similar.
As Eric points out, such $2\times2$ matrices are special. In fact, there are only two such pairs of matrices. The number depends on how you count, but the point is that such matrices have a very special form.
Eric proved that the two matrices must have a double eigenvalue. Let the eigenvalue be $\lambda$. It is a little exercise1 to show that $2\times2$ matrices with double eigenvalue $\lambda$ are similar to a matrix of the form $$ C_{\lambda,\mu} = \begin{pmatrix} \lambda&\mu\\ 0&\lambda \end{pmatrix}. $$ Using suitable diagonal matrices shows that $C_{\lambda,\mu}$ is similar to $C_{\lambda,1}$ if $\mu\neq0$. On the other hand, $C_{\lambda,0}$ and $C_{\lambda,1}$ are not similar; one is a scaling and the other one is not.
Therefore, up to similarity transformations, the only possible example is $A=C_{\lambda,0}$ and $B=C_{\lambda,1}$ (or vice versa). Since scaling doesn't really change anything, the only examples (up to similarity, scaling, and swapping the two matrices) are $$ A = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}, \quad B = \begin{pmatrix} 1&1\\ 0&1 \end{pmatrix} $$ and $$ A = \begin{pmatrix} 0&0\\ 0&0 \end{pmatrix}, \quad B = \begin{pmatrix} 0&1\\ 0&0 \end{pmatrix}. $$ If adding multiples of the identity is added to the list of symmetries (then scaling can be removed), then there is only one matrix pair up to the symmetries.
If you are familiar with the Jordan normal form, it gives a different way to see it. Once the eigenvalues are fixed to be equal, the only free property (up to similarity) is whether there are one or two blocks in the normal form. The Jordan normal form is invariant under similarity transformations, so it gives a very quick way to solve problems like this.
1 You only need to show that any matrix is similar to an upper triangular matrix. The eigenvalues (which now coincide) are on the diagonal. You can skip this exercise if you have Jordan normal forms at your disposal.
I'm going to answer your title question—Why did no student get the correct answer?—, rather than what's in your post, because while it's worthwhile to expound on why what they did failed, it completely overlooks the possibility that they were not taught how to handle such a problem in the first place.
Finding examples is a technique and art all of its own, and needs some instruction to reliably execute on exams (or anywhere else). So you may be passing the buck. Certainly sometimes you are a perfect teacher for a problem and the class still fails to get it, because sometimes by random (mis)fortune your class just happens to be full of students who have difficulties for reasons independent of you. But it's worthwhile to wonder if maybe there's something about how they were taught that needs work. You don't seem to address this possibility in your post, so here's a bunch of possibilities to mull over...
The basic line of thought that should ideally occur to them is "dang, figuring out whether two matrices are similar or not is usually a pretty involved process. I don't have the time for that. Maybe there are matrices which aren't similar to very many others...maybe ones that are only similar to themselves...well $PAP^{-1}$ is always easy to compute if $A$...", at which point hopefully they recognize they want the identity matrix, or a zero matrix, or more generally a scalar multiple of the identity (aka: the center of the matrix ring).
Students usually need to be taught the "usual suspects" for (counter)examples. In most fields of math, when wondering if something is true or not, there tends to be a certain object or set of objects which are known to often provide (counter)examples, or for which computations are relatively simple, and the first litmus test for a proposed theorem is to check it against the usual suspects to make sure it holds up. When introducing similar matrices, did you specifically point out (or assign to them as work in some fashion) what the identity and zero matrices are similar to? What scalar multiples of the identity are similar to?
When discussing similarity, did you make it clear that it is very hard to tell "at a glance" if two matrices are similar or not? That "very different looking" matrices can turn out to be similar? That determining if two fixed, small matrices are similar or not can be reasonable (for an exam), but with more matrices (in this case: literally every matrix they can think of) and more dimensions it becomes increasingly difficult to just brute force things?
Did you ever give examples of non-similar (2x2) matrices with identical trace and determinant in class? They might have been able to memorize such examples, and if they did then hopefully while doing so they incidentally picked up something about "why" these examples have the desired properties. Especially so if they were informed ahead of time that they are expected to show work, and not simply write down answers, to get full credit.
I'm assuming this question came from an exam or quiz. Is this question something you came up long before the exam/quiz date came around? Many instructors I know find it beneficial to write up exams (very) far in advance of when the actual exam is held, so that they can clearly identify: "Obviously I want them to know X, so I better make it a point to teach them how to do X." An exam/quiz created too close to the exam date runs the risk of you asking something you never actually taught them how to do; and, quite importantly, runs the risk of you not noticing this until it's too late. It's easy to think, "Yeah, they know everything they need to know for this, it'll be an easy problem" soon after writing the problem, but a week or two later you might realize you're wrong. If you've written well in advance, you can adjust the problem or lectures to fix this. If you haven't, well, then you've got an entire chunk of points on your exam that were exactly meaningless and a class that has a mistaken impression of their own abilities and a dissatisfaction with your ability to write a fair exam.