How exactly are Lorentz transformations rotations?
The relevant generalized notion of "rotation" is that a rotation is a transformation that fixes one point and preserves all distances. In Euclidean space, this means that if you have two points with x-coordinates differing by $\Delta x$ and y-coordinates differing by $\Delta y$, then the value of $\Delta x^2+\Delta y^2$ is unaffected by a rotation. In Minkowski space it means that $\Delta x^2-\Delta t^2$ is unaffected.
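To make the two invariants concrete, here is a minimal numerical sketch. It assumes units where $c=1$ and the standard hyperbolic form of a boost with parameter `lam`; the function names are purely illustrative.

```python
import math

def rotate(x, y, theta):
    """Ordinary rotation of the (x, y) plane by a real angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y

def boost(x, t, lam):
    """Boost of the (x, t) plane with parameter lam (assumes c = 1)."""
    ch, sh = math.cosh(lam), math.sinh(lam)
    return ch * x + sh * t, sh * x + ch * t

def euclidean_interval(dx, dy):
    return dx**2 + dy**2

def minkowski_interval(dx, dt):
    return dx**2 - dt**2

# Separation between two points, before and after each transformation.
dx, dy = 3.0, 4.0
rx, ry = rotate(dx, dy, 0.7)
print(euclidean_interval(dx, dy), euclidean_interval(rx, ry))

dx, dt = 3.0, 4.0
bx, bt = boost(dx, dt, 0.7)
print(minkowski_interval(dx, dt), minkowski_interval(bx, bt))
```

Each printed pair should agree up to floating-point rounding: the rotation keeps $\Delta x^2+\Delta y^2$ fixed, and the boost keeps $\Delta x^2-\Delta t^2$ fixed.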
A Lorentz boost is not a rotation by a real angle. Instead, it is a strain by a real angle. The transformation of the x and t axes in which both move inward by a small angle in radians $d\lambda$ (called the Lorentz boost parameter) is well known to mechanical engineers in the x,y plane. The engineer distorts a square in the x,y plane so that both edges of the square move inward by a small angle in radians $d\epsilon$ (called the strain). The square becomes a parallelogram. The matrices which do these transformations for non-infinitesimal angles are:
$$ \begin{bmatrix} \cosh(\lambda) & \sinh(\lambda)\\ \sinh(\lambda) & \cosh(\lambda)\\ \end{bmatrix} \begin{bmatrix} x\\ ct\\ \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \cosh(\epsilon) & \sinh(\epsilon)\\ \sinh(\epsilon) & \cosh(\epsilon)\\ \end{bmatrix} \begin{bmatrix} x\\ y\\ \end{bmatrix} $$
The reason you have heard that boosts are somehow rotations is that old-time physicists made boosts look like familiar rotations by using imaginary angles and making t imaginary:
$$ \begin{bmatrix} x'\\ ict'\\ \end{bmatrix} = \begin{bmatrix} \cos(i\lambda) & -\sin(i\lambda)\\ \sin(i\lambda) & \cos(i\lambda)\\ \end{bmatrix} \begin{bmatrix} x\\ ict\\ \end{bmatrix} $$
Using $\cos(i\lambda)=\cosh(\lambda)$ and $\sin(i\lambda)=i\sinh(\lambda)$, this becomes
$$ \begin{bmatrix} x'\\ ict'\\ \end{bmatrix} = \begin{bmatrix} \cosh(\lambda) & -i\sinh(\lambda)\\ i\sinh(\lambda) & \cosh(\lambda)\\ \end{bmatrix} \begin{bmatrix} x\\ ict\\ \end{bmatrix} $$
Multiplying out and dividing the second row by $i$ gives the real form
$$ \begin{bmatrix} x'\\ ct'\\ \end{bmatrix} = \begin{bmatrix} \cosh(\lambda) & \sinh(\lambda)\\ \sinh(\lambda) & \cosh(\lambda)\\ \end{bmatrix} \begin{bmatrix} x\\ ct\\ \end{bmatrix} $$
Space-space parallelogram strains leave $x^2-y^2$ invariant. Space-time parallelogram strains leave $x^2-(ct)^2$ invariant. Rotations leave $x^2+(ict)^2$ invariant. Please see my answer to this question if you would like more math.
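The equivalence between the imaginary-angle rotation and the real boost can be checked numerically. The following is a rough sketch using Python's `cmath`; the helper names `imaginary_rotation` and `real_boost` are made up for illustration, and the sample values are arbitrary.

```python
import cmath
import math

def imaginary_rotation(x, ct, lam):
    """Rotate the pair (x, i*ct) by the imaginary angle i*lam; return (x', ct')."""
    angle = 1j * lam
    c, s = cmath.cos(angle), cmath.sin(angle)  # cosh(lam) and i*sinh(lam)
    w = 1j * ct                                # imaginary time coordinate
    x_new = c * x - s * w
    w_new = s * x + c * w                      # this is i*ct'
    return x_new, w_new / 1j

def real_boost(x, ct, lam):
    """Real hyperbolic form of the same transformation."""
    ch, sh = math.cosh(lam), math.sinh(lam)
    return ch * x + sh * ct, sh * x + ch * ct

x, ct, lam = 1.0, 2.0, 0.5
print(imaginary_rotation(x, ct, lam))  # complex numbers with negligible imaginary parts
print(real_boost(x, ct, lam))
```

The two printed pairs should match up to rounding, which is all the "boost as rotation" picture amounts to: the same linear map written with an imaginary angle and an imaginary time coordinate.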
I don't think "the reason we don't treat the time-axis like a usual space-axis is that we can't move backwards in time" is a good argument. However, if $ct>x$, a boost cannot make $ct'<x'$, because $x^2-(ct)^2$ is invariant. Thus if an event lies in the forward light cone, it lies in the forward light cone in all boosted frames too. A real rotation of x and real ct could turn $ct>x$ into $ct'<x'$ and screw up causality.
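As a quick sanity check of the causality point, here is a small sketch that boosts one event with $ct>x$ by several parameters and then applies an ordinary real rotation to the same event. The specific event, the parameter values, and the helper names are arbitrary choices for illustration.

```python
import math

def boost(x, ct, lam):
    """Boost of (x, ct) with real parameter lam."""
    ch, sh = math.cosh(lam), math.sinh(lam)
    return ch * x + sh * ct, sh * x + ch * ct

def rotate(x, ct, theta):
    """Ordinary real rotation of (x, ct), treating ct like a space axis."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * ct, s * x + c * ct

def in_forward_cone(x, ct):
    return ct > abs(x)

x, ct = 1.0, 2.0  # an event with ct > x

# Every boost keeps the event inside the forward light cone.
for lam in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(lam, in_forward_cone(*boost(x, ct, lam)))

# A real rotation by -45 degrees moves it outside: now ct' < x'.
xr, ctr = rotate(x, ct, -math.pi / 4)
print(xr, ctr, in_forward_cone(xr, ctr))
```

The boosts all report the event as still inside the forward cone, while the real rotation gives $ct'<x'$.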