Is the least-squares solution unique?
I think we get different solutions here because the two equations minimize different sums of squares.
In the first equation, we want to minimize the distance from the vector on the right side, made up of x-coordinates, to the range of the matrix on the left side, made up of y-coordinates and constants.
In the second equation, our vector is made up of y-coordinates, and we want to minimize its distance to the range of the matrix made up of x-coordinates and constants.
Yes, we're approximating the same data set with the same purpose in mind, so we get similar answers. But we're working with a different vector and a different matrix in each case, and that gives us a different least-squares metric each time. So the answers differ because the metrics differ, and they stay close because the data set is the same.
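Here is a small sketch of that effect in plain Python. The data points are made up for illustration (they are not the ones from the question), and `fit_line` is just a helper name I picked. It fits the same points twice, once with x as the vector being approximated and once with y, then rewrites the first fit in the form y = slope*x + intercept so the two lines can be compared directly.

```python
def fit_line(u, v):
    """Least-squares fit v ~ a*u + b, minimizing squared errors measured in v."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    a = s_uv / s_uu
    b = mean_v - a * mean_u
    return a, b

# Hypothetical, nearly collinear data (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.2]

# First problem: the x-coordinates are the vector, the matrix holds y and constants.
# This minimizes horizontal squared errors: x ~ c*y + d.
c, d = fit_line(ys, xs)
slope_from_x_fit = 1.0 / c        # rewrite x = c*y + d as y = (x - d) / c
intercept_from_x_fit = -d / c

# Second problem: the y-coordinates are the vector, the matrix holds x and constants.
# This minimizes vertical squared errors: y ~ a*x + b.
slope_from_y_fit, intercept_from_y_fit = fit_line(xs, ys)

print("line from fitting x:", slope_from_x_fit, intercept_from_x_fit)
print("line from fitting y:", slope_from_y_fit, intercept_from_y_fit)
```

The two printed lines should have close but unequal slopes. Each fit has exactly one solution for its own problem; they only disagree with each other because one measures errors horizontally and the other vertically. If the points were exactly collinear, the two lines would coincide.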