Stationary distribution of a Markov process defined on the space of permutations
Your conjecture is correct. In fact, provided $0 < p < 1$, the Markov process is recurrent and reversible with unique stationary distribution proportional to $$ \pi_\sigma = \Bigl(\frac{p}{1-p}\Bigr)^{\ell(\sigma)},$$ where $\ell(\sigma)$ is the Coxeter length of $\sigma$. (This is the number of 'misrankings' in your question.)
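As a concrete illustration, the weights can be tabulated directly by counting inversions (a minimal Python sketch; the helper name `coxeter_length` and the choice $n = 3$, $p = 1/3$ are ours):

```python
import itertools
from fractions import Fraction

def coxeter_length(s):
    # Coxeter length = number of inversions ("misrankings") in s
    n = len(s)
    return sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])

p = Fraction(1, 3)
weights = {s: (p / (1 - p)) ** coxeter_length(s)
           for s in itertools.permutations((1, 2, 3))}
Z = sum(weights.values())            # normalising constant
pi = {s: w / Z for s, w in weights.items()}
assert sum(pi.values()) == 1
```

Using `Fraction` keeps the arithmetic exact; for $p < 1/2$ the identity permutation carries the largest weight.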
Proof. Suppose that $0 < p < 1$. Let $p_{\sigma\tau}$ be the probability of a step from $\sigma \in S$ to $\tau\in S$. We solve the detailed balance equations $\pi_\sigma p_{\sigma\tau} = \pi_\tau p_{\tau\sigma}$. Suppose that $$\tau = \sigma_1 \ldots \sigma_{i+1} \sigma_i \ldots \sigma_n$$ where $1 \le i < n$; that is, $\tau$ is obtained from $\sigma$ by swapping the adjacent entries in positions $i$ and $i+1$. Then we step from $\sigma$ to $\tau$ with probability either $p/n$ or $(1-p)/n$. Explicitly,
\begin{align*} n \pi_\sigma p_{\sigma\tau} &= \Bigl(\frac{p}{1-p}\Bigr)^{\ell(\sigma)} \begin{cases} p & \text{if $\sigma_i < \sigma_{i+1}$} \\ 1-p & \text{if $\sigma_i > \sigma_{i+1}$} \end{cases}\\ &= \Bigl(\frac{p}{1-p}\Bigr)^{\ell(\sigma)} \begin{cases} \frac{p}{1-p} (1-p) &\text{if $\sigma_i < \sigma_{i+1}$} \\ \frac{1-p}{p} p & \text{if $\sigma_i > \sigma_{i+1}$} \end{cases} \\ &= \Bigl(\frac{p}{1-p}\Bigr)^{\ell(\tau)} \begin{cases} 1-p & \text{if $\tau_i > \tau_{i+1}$} \\ p & \text{if $\tau_i < \tau_{i+1}$} \end{cases} \\ &= n\pi_\tau p_{\tau\sigma}. \end{align*} The third equality uses the fact that swapping an adjacent pair changes the number of inversions by exactly one, so $\ell(\tau) = \ell(\sigma) + 1$ if $\sigma_i < \sigma_{i+1}$ and $\ell(\tau) = \ell(\sigma) - 1$ if $\sigma_i > \sigma_{i+1}$. If $\tau$ is not of this form and $\tau \not= \sigma$ then $p_{\sigma\tau} = p_{\tau\sigma} = 0$. Hence the detailed balance equations hold. You observed in your question that there is a single communicating class of states. The walk is aperiodic because there is a positive chance of staying put at each step. Hence the invariant distribution is unique. $\quad \Box$
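The detailed balance computation can also be checked numerically. In the sketch below (our own assumption about the step rule, matching the proof: choose one of the $n-1$ adjacent positions uniformly, swap an in-order pair with probability $p$ and an out-of-order pair with probability $1-p$, otherwise stay put), the uniform constant in the position choice cancels on both sides of detailed balance, so the check is insensitive to whether one divides by $n$ or $n-1$:

```python
import itertools

import numpy as np

p, n = 0.3, 4
perms = list(itertools.permutations(range(1, n + 1)))
idx = {s: k for k, s in enumerate(perms)}

def length(s):
    # Coxeter length = number of inversions ("misrankings")
    return sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])

# One step: pick an adjacent position uniformly; swap an in-order pair
# with probability p, an out-of-order pair with probability 1 - p;
# otherwise stay put.
P = np.zeros((len(perms), len(perms)))
for s in perms:
    for i in range(n - 1):
        t = list(s)
        t[i], t[i + 1] = t[i + 1], t[i]
        q = p if s[i] < s[i + 1] else 1 - p
        P[idx[s], idx[tuple(t)]] += q / (n - 1)
        P[idx[s], idx[s]] += (1 - q) / (n - 1)

pi = np.array([(p / (1 - p)) ** length(s) for s in perms])
pi /= pi.sum()

assert np.allclose(pi @ P, pi)        # stationarity
flow = pi[:, None] * P
assert np.allclose(flow, flow.T)      # detailed balance
```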
For completeness, suppose that $p=0$ or $p=1$ and that the walk starts at $\sigma \in S$. If $p=0$ then, since each executed swap removes exactly one inversion, the walk reaches the identity permutation after $\ell(\sigma)$ swaps; if $p = 1$ then after $\binom{n}{2} - \ell(\sigma)$ swaps the walk reaches the order-reversing permutation $1 \mapsto n$, $2 \mapsto n-1$, $\ldots$, $n\mapsto 1$ of maximum Coxeter length $\binom{n}{2}$. The only randomness arises from the order in which inversions are removed or added.
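A quick simulation sketch of the $p=0$ case (under the adjacent-swap dynamics assumed above; the starting permutation and seed are arbitrary) confirms that the number of executed swaps before reaching the identity is deterministic and equal to $\ell(\sigma)$:

```python
import random

def inversions(s):
    n = len(s)
    return sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])

random.seed(1)
n = 6
s = [4, 1, 6, 3, 5, 2]        # an arbitrary starting permutation
start_inv = inversions(s)

# p = 0: a chosen out-of-order adjacent pair is always sorted and an
# in-order pair is never swapped, so each executed swap removes
# exactly one inversion.
swaps = 0
while s != sorted(s):
    i = random.randrange(n - 1)
    if s[i] > s[i + 1]:
        s[i], s[i + 1] = s[i + 1], s[i]
        swaps += 1

assert swaps == start_inv
```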
Remark. In another version of the problem, we step by applying a general transposition $(i,j)$ chosen uniformly at random. In this case the process is not reversible. When $n \le 3$ the invariant distribution depends only on the Coxeter length: for example, if $n=3$ then, ordering the permutations $123,213,132,312,231,321$, the transition matrix is
$$\frac{1}{3} \left( \begin{matrix} 3(1-p) & p & p & 0 & 0 & p \\ 1-p & 2-p & 0 & p & p & 0 \\ 1-p & 0 & 2-p & p & p & 0 \\ 0 & 1-p & 1-p & 1+p & 0 & p \\ 0 & 1-p & 1-p & 0 & 1+p & p \\ 1-p & 0 & 0 & 1-p & 1-p & 3p \end{matrix} \right). $$
A computer algebra calculation shows that the invariant distribution is proportional to $(\alpha,\beta,\beta,\gamma,\gamma,\delta)$ where $\alpha = (1-p)(6-11p+7p^2)$, $\beta = (1-p)p(8-7p)$, $\gamma = (1-p)p(1+7p)$ and $\delta = p(2-3p+7p^2)$. For $n=4$ the invariant distribution is more complicated. For example, if $p=3/4$ then the invariant probabilities for $2134$ and $1324$ are $5325/485760$ and $8749/485760$, respectively.
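The $n=3$ calculation can be reproduced numerically (a sketch; a numerical eigen-decomposition stands in for the computer algebra step, and a particular numerical value of $p$ is assumed). The closed-form vector is keyed by permutation to avoid any ordering slips:

```python
import itertools

import numpy as np

n, p = 3, 0.3
perms = list(itertools.permutations(range(1, n + 1)))
idx = {s: k for k, s in enumerate(perms)}
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

# One step: choose a transposition (i, j) uniformly; swap an in-order
# pair with probability p, an out-of-order pair with probability 1 - p.
P = np.zeros((len(perms), len(perms)))
for s in perms:
    for i, j in pairs:
        t = list(s)
        t[i], t[j] = t[j], t[i]
        q = p if s[i] < s[j] else 1 - p
        P[idx[s], idx[tuple(t)]] += q / len(pairs)
        P[idx[s], idx[s]] += (1 - q) / len(pairs)

# Stationary distribution: left Perron eigenvector of P.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()

# Closed form from the text.
a = (1 - p) * (6 - 11 * p + 7 * p ** 2)
b = (1 - p) * p * (8 - 7 * p)
g = (1 - p) * p * (1 + 7 * p)
d = p * (2 - 3 * p + 7 * p ** 2)
closed = {(1, 2, 3): a, (2, 1, 3): b, (1, 3, 2): b,
          (3, 1, 2): g, (2, 3, 1): g, (3, 2, 1): d}
ref_vec = np.array([closed[s] for s in perms])
assert np.allclose(pi, ref_vec / ref_vec.sum())
```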
This stationary distribution is known as the Mallows measure; see e.g. the references in http://www.sc.ehu.es/ccwbayes/members/ekhine/tutorial_ranking/data/slides.pdf
For the Markov chain connection, see e.g.
https://math.uchicago.edu/~shmuel/Network-course-readings/MCMCRev.pdf
and "Sampling and Learning Mallows and Generalized Mallows Models Under the Cayley Distance", Methodology and Computing in Applied Probability, March 2018, Volume 20, Issue 1, pp. 1–35.