Relation of trace and determinant
Hint: $A^2=I$ implies the minimal polynomial of $A$ divides $(X-1)(X+1)$; since $A$ is not diagonal, $A \ne \pm I$, so the minimal polynomial is exactly $(X-1)(X+1)$. This limits the characteristic polynomial to $(X-1)^2(X+1)$ or $(X-1)(X+1)^2$.
Since $(A-I)(A+I) = 0$ we know all eigenvalues are $\pm 1$.
If all the eigenvalues were the same, then $A$ would be $\pm I$, which is disallowed; hence at least one eigenvalue must differ in sign from the others. Up to order, the only two possibilities are $1, 1, -1$ and $-1, -1, 1$.
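As a concrete illustration of this hint (the matrix below is my own example, not taken from the original problem), here is a quick sympy check of a non-diagonal involution with spectrum $1, 1, -1$:

```python
import sympy as sp

# A hypothetical example: the involution swapping the first two
# coordinates; it is non-diagonal and squares to the identity.
A = sp.Matrix([[0, 1, 0],
               [1, 0, 0],
               [0, 0, 1]])

assert A**2 == sp.eye(3)      # A = A^{-1}
print(A.eigenvals())          # {1: 2, -1: 1} -- the pattern 1, 1, -1
print(A.trace(), A.det())     # 1, -1
```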
Since
$A = A^{-1}, \tag 1$
it follows that
$A^2 = I, \tag 2$
or
$A^2 - I = 0; \tag 3$
if $\lambda$ is an eigenvalue of $A$ with eigenvector $v$, so that $Av = \lambda v$ with $v \ne 0$, we have
$(\lambda^2 - 1)v = \lambda^2 v - v = A^2 v - Iv = (A^2 - I)v = 0, \tag 4$
so $v \ne 0$ yields
$\lambda^2 - 1 = 0; \tag 5$
it follows that every eigenvalue of $A$ is $\pm 1$. Since
$A^2 = I, \tag 6$
we also have
$(\det A)^2 = \det A^2 = \det I = 1, \tag 7$
so
$\det A = \pm 1. \tag 8$
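A quick computational sanity check of (6)-(8) (a sketch only; the particular $P$ below is an arbitrary invertible matrix of my own choosing): any matrix of the form $A = PDP^{-1}$ with $D = \operatorname{diag}(1, 1, -1)$ is an involution with $\det A = -1$.

```python
import sympy as sp

# Sketch: conjugating D = diag(1, 1, -1) by any invertible P yields an
# involution; P here is an arbitrary choice, not from the original text.
P = sp.Matrix([[1, 2, 0],
               [0, 1, 3],
               [1, 0, 1]])
D = sp.diag(1, 1, -1)
A = P * D * P.inv()

assert A**2 == sp.eye(3)         # eq. (6)
assert A.det()**2 == 1           # eq. (7)
assert A.det() == -1             # eq. (8), with this choice of D
```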
We next invoke the hypothesis that $A$ is not a diagonal matrix to show that the eigenvalues of $A$ cannot all be the same; this will rule out the cases in which the eigenvalues are $1, 1, 1$ or $-1, -1, -1$. In so doing we will formally validate copper.hat's statement, made without proof in his answer, that "If all eigenvalues were the same then $A$ would be $\pm I$," and also lhf's assertion that "the minimal polynomial is $(X-1)(X+1)$ because $A$ is not diagonal."
So let $\lambda = \pm 1$, and assume the eigenvalues of $A$ are all $\lambda$. Consider then the characteristic polynomial of the matrix $A$,
$p_A(x) = \det (xI - A); \tag 9$
we know that
$p_A(\lambda) = \det(\lambda I - A) = 0; \tag{10}$
in addition, the Cayley-Hamilton theorem gives us
$p_A(A) = 0; \tag{11}$
we also have that $p_A(x)$ splits into $3$ linear factors $x - \lambda$, whence
$p_A(x) = (x - \lambda)(x - \lambda)(x - \lambda) = (x - \lambda)^3; \tag{12}$
combining (11) and (12) we find
$(A - \lambda I)^3 = p_A(A) = 0; \tag{13}$
if we set
$N = A - \lambda I, \tag{14}$
since $A$ is not diagonal, it follows that
$N \ne 0; \tag{15}$
from (13), we see that $N$ is nilpotent:
$N^3 = 0. \tag{16}$
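To make (14)-(16) concrete (a hypothetical illustration; the Jordan block below is my own choice of a non-diagonal $A$ with a triple eigenvalue), here is such an $N$ in sympy:

```python
import sympy as sp

# Hypothetical illustration: a non-diagonal A with triple eigenvalue 1
# (a 3x3 Jordan block); then N = A - I is nonzero and nilpotent.
lam = 1
A = sp.Matrix([[lam, 1, 0],
               [0, lam, 1],
               [0, 0, lam]])
N = A - lam * sp.eye(3)

assert N != sp.zeros(3, 3)       # (15): N is nonzero
assert N**3 == sp.zeros(3, 3)    # (16): N^3 = 0
assert A**2 != sp.eye(3)         # such an A cannot satisfy A^2 = I
```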
We do not have enough information to determine directly whether $N^2 = 0$ or not in the above, but we can prove that
$N^2 = 0 \Longrightarrow I, N \; \text{are linearly independent}; \tag{17}$
$N^2 \ne 0 \Longrightarrow I, N, N^2 \; \text{are linearly independent}; \tag{18}$
for instance, to see that (18) holds we assume
$\alpha I + \beta N + \gamma N^2 = 0, \tag{19}$
and multiply by $N^2$,
$\alpha N^2 + \beta N^3 + \gamma N^4 = 0; \tag {20}$
now using (16) we see that
$\alpha N^2 = 0 \Longrightarrow \alpha = 0, \tag{21}$
so that (19) becomes
$\beta N + \gamma N^2 = 0, \tag{22}$
which if multiplied by $N$ yields
$\beta N^2 + \gamma N^3 = \beta N^2 = 0, \tag{23}$
whence
$\beta = 0; \tag{24}$
now inserting $\alpha = \beta = 0$ in (19) we see that
$\gamma = 0 \tag{25}$
as well; thus $I$, $N$, and $N^2$ are linearly independent as asserted. In the case $N^2 = 0$ we start from the assumption that
$\alpha I + \beta N = 0, \tag{26}$
multiply by $N$,
$\alpha N = \alpha N + \beta N^2 = 0, \tag{27}$
which again implies $\alpha = 0$; $\beta = 0$ is then immediate from (26). So we see that (17)-(18) hold.
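The independence claims (17)-(18) can also be checked numerically (a sketch using the Jordan-block $N$ from above: flatten each matrix to a row vector and compute the rank of the stack):

```python
import sympy as sp

# Check (18) for the nilpotent N above: flatten I, N, N^2 into row
# vectors; linear independence means the stacked matrix has rank 3.
N = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])
M = sp.Matrix.vstack(sp.eye(3).vec().T, N.vec().T, (N**2).vec().T)
assert M.rank() == 3             # I, N, N^2 are linearly independent
```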
We return to (14) and write
$A = \lambda I + N; \tag{28}$
then with (2) we have
$I = A^2 = (\lambda I + N)^2 = \lambda^2 I^2 + 2\lambda N + N^2 = I \pm 2 N + N^2, \tag{29}$
using the fact that $\lambda = \pm 1$. (29) yields
$\pm 2 N + N^2 = 0; \tag{30}$
if $N^2 = 0$, (30) asserts $N = 0$, which is clearly not possible; if $N^2 \ne 0$, (30) contradicts the linear independence of $N$, $N^2$. Since either of these conclusions follows directly from the hypothesis that $A$ has three identical eigenvalues, that hypothesis is disallowed; thus, the eigenvalues of $A$ are either $1, 1, -1$ or $1, -1, -1$. In the former case, $\text{Tr}(A) = 1$ and $\det A = -1$; in the latter, $\text{Tr}(A) = -1$ and $\det A = 1$; in either case, we see that
$\text{Tr}(A) = \pm 1 \tag{31}$
holds, and indeed $\text{Tr}(A) = -\det A$.
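As a final end-to-end check of (31) (a sketch only; $P$ below is an arbitrary invertible matrix of my own choosing), both admissible spectra give $\text{Tr}(A) = \pm 1$, and in both cases $\text{Tr}(A) = -\det A$:

```python
import sympy as sp

# End-to-end check of both admissible spectra; P is an arbitrary
# invertible matrix, used only to produce a non-diagonal involution.
P = sp.Matrix([[1, 1, 0],
               [0, 1, 1],
               [1, 0, 1]])
for spectrum in [(1, 1, -1), (1, -1, -1)]:
    A = P * sp.diag(*spectrum) * P.inv()
    assert A**2 == sp.eye(3)             # A = A^{-1}
    assert A.trace() in (1, -1)          # eq. (31)
    assert A.trace() == -A.det()         # the trace-determinant relation
```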
Note Added in Edit, Thursday 26 October 2017 2:36 PM PST: The above argument reveals general principles which may be proved along lines similar to the above: if $A$ is an $n \times n$ non-diagonal matrix all of whose eigenvalues are equal to $\lambda$, then $N = A - \lambda I \ne 0$ is in fact a nilpotent matrix with $N^n = 0$; if $N$ also satisfies $N^m = 0$ with $1 < m \le n$, but $N^l \ne 0$ for $l < m$ (so that $m$ is the least positive integer for which $N^m = 0$), then the matrices $I, N, N^2, \ldots, N^{m - 1}$ are in fact linearly independent. Using such principles, it may be possible to address certain generalizations of the present problem, such as the case $A^k = rI$ for non-diagonal $A$. But I will leave such questions for the present time. End of Note.
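The general principle stated in this note can likewise be checked in small cases (a sketch for $n = 4$ with a full nilpotent Jordan block, my own example):

```python
import sympy as sp

# Sketch for n = 4: a non-diagonal A with single eigenvalue 2; then
# N = A - 2I satisfies N^4 = 0, and with m the least power annihilating
# N, the matrices I, N, ..., N^{m-1} are linearly independent.
n, lam = 4, 2
N = sp.zeros(n, n)
for i in range(n - 1):
    N[i, i + 1] = 1                      # a full nilpotent Jordan block
A = lam * sp.eye(n) + N

m = next(k for k in range(1, n + 1) if N**k == sp.zeros(n, n))
rows = sp.Matrix.vstack(*[(N**k).vec().T for k in range(m)])  # N^0 = I
assert N**n == sp.zeros(n, n) and rows.rank() == m
```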