How to use the Extended Euclidean Algorithm manually?
Perhaps the easiest way to do it by hand is in analogy to Gaussian elimination or triangularization, except that, since the coefficient ring is not a field, one has to use the division / Euclidean algorithm to iteratively decrease the coefficients till zero. In order to compute both $\rm\,gcd(a,b)\,$ and its Bezout linear representation $\rm\,j\,a+k\,b,\,$ we keep track of such linear representations for each remainder in the Euclidean algorithm, starting with the trivial representation of the gcd arguments, e.g. $\rm\: a = 1\cdot a + 0\cdot b.\:$ In matrix terms, this is achieved by augmenting (appending) an identity matrix that accumulates the effect of the elementary row operations. Below is an example that computes the Bezout representation for $\rm\:gcd(80,62) = 2,\ $ i.e. $\ 7\cdot 80\: -\: 9\cdot 62\ =\ 2\:.\:$ See this answer for a proof and for conceptual motivation of the ideas behind the algorithm (see the Remark below if you are not familiar with row operations from linear algebra).
For example, to solve m x + n y = gcd(m,n) one begins with
two rows [m 1 0], [n 0 1], representing the two
equations m = 1m + 0n, n = 0m + 1n. Then one executes
the Euclidean algorithm on the numbers in the first column,
doing the same operations in parallel on the other columns,
Here is an example: d = x(80) + y(62) proceeds as:
in equation form | in row form
---------------------+------------
80 = 1(80) + 0(62) | 80 1 0
62 = 0(80) + 1(62) | 62 0 1
row1 - row2 -> 18 = 1(80) - 1(62) | 18 1 -1
row2 - 3 row3 -> 8 = -3(80) + 4(62) | 8 -3 4
row3 - 2 row4 -> 2 = 7(80) - 9(62) | 2 7 -9
row4 - 4 row5 -> 0 = -31(80) +40(62) | 0 -31 40
The row operations above are those resulting from applying
the Euclidean algorithm to the numbers in the first column,
row1 row2 row3 row4 row5
namely: 80, 62, 18, 8, 2 = Euclidean remainder sequence
| |
for example 62-3(18) = 8, the 2nd step in Euclidean algorithm
becomes: row2 -3 row3 = row4 when extended to all columns.
In effect we have row-reduced the first two rows to the last two.
The matrix effecting the reduction is in the bottom right corner.
It starts as 1, and is multiplied by each elementary row operation,
hence it accumulates the product of all the row operations, namely:
$$ \left[ \begin{array}{ccc} 7 & -9\\ -31 & 40\end{array}\right ] \left[ \begin{array}{ccc} 80 & 1 & 0\\ 62 & 0 & 1\end{array}\right ] \ =\ \left[ \begin{array}{ccc} 2\ & \ \ \ 7\ & -9\\ 0\ & -31\ & 40\end{array}\right ] \qquad\qquad\qquad\qquad\qquad $$
Notice row 1 is the particular solution 2 = 7(80) - 9(62)
Notice row 2 is the homogeneous solution 0 = -31(80) + 40(62),
so the general solution is any linear combination of the two:
n row1 + m row2 -> 2n = (7n-31m) 80 + (40m-9n) 62
The same row/column reduction techniques tackle arbitrary
systems of linear Diophantine equations. Such techniques
generalize easily to similar coefficient rings possessing a
Euclidean algorithm, e.g. polynomial rings F[x] over a field,
Gaussian integers Z[i]. There are many analogous interesting
methods, e.g. search on keywords: Hermite / Smith normal form,
invariant factors, lattice basis reduction, continued fractions,
Farey fractions / mediants, Stern-Brocot tree / diatomic sequence.
Remark $ $ As an optimization, we can omit one of the Bezout coefficient columns (being derivable from the others). Then the calculations have a natural interpretation as modular fractions (though the "fractions" are multi-valued), e.g. computing $\,\color{#c00}{\large \frac{-10}9}\equiv\color{#90f}{18}\pmod{\!43}\,$ as in this answer
$$ \begin{array}{rr} \bmod 43\!:\ \ \ \ \ \ \ \ [\![1]\!] &43\, x\,\equiv\ \ 0\ \\ [\![2]\!] &\ \color{#c00}{9\,x\, \equiv -10}\!\!\!\\ [\![1]\!]-5\,[\![2]\!] \rightarrow [\![3]\!] & \color{#0a0}{-2\,x\, \equiv\ \ 7}\ \\ [\![2]\!]+\color{orange}4\,[\![3]\!] \rightarrow [\![4]\!] & \color{#90f}{1\,x\, \equiv 18}\ \end{array}\qquad\qquad\qquad$$
$$\dfrac{0}{43}\ \overset{\large\frown}\equiv \underbrace{\color{#c00}{\dfrac{-10}{9}}\ \overset{\large\frown}\equiv \ \color{#0a0}{\dfrac{7}{-2}}\ \overset{\large\frown}\equiv\ \color{#90f}{\dfrac{18}{1}}} _{\!\!\!\Large \begin{align}\color{#c00}{-10}\ \ + \ \ &\!\color{orange}4\,(\color{#0a0}{\ \, 7\ \, }) \ \ \equiv \ \ \color{#90f}{18}\\ \color{#c00}{9}\ \ +\ \ &\!\color{orange}4\,(\color{#0a0}{-2} ) \ \ \equiv\ \ \ \color{#90f}{1}\end{align}}\quad $$
We also used least magnitude remainders $\,(\color{#0a0}{-2}\,$ vs. $7\bmod 9)\,$ to shorten the computations (this can halve the number of steps in the Euclidean algorithm).
Introduction to row operations (for readers unfamiliar with linear algebra).
Let $\,r_i\,$ be the Euclidean remainder sequence. Above $\, r_1,r_2,r_3\ldots = 80,62,18\ldots$ Given linear combinations $\,r_j = a_j m + b_j n\,$ for $\,r_{i-1}\,$ and $\,r_i\,$ we can calculate a linear combination for $\,r_{i+1} := r_{i-1}\bmod r_i = r_{i-1} - q_i r_i\,$ by substituting said combinations for $\,r_{i-1}\,$ and $\,r_i,\,$ i.e.
$$\begin{align} r_{i+1}\, &=\, \overbrace{a_{i-1} m + a_{i-1}n}^{\Large r_{i-1}}\, -\, q_i \overbrace{(a_i m + b_i n)}^{\Large r_i}\\[.3em] {\rm i.e.}\quad \underbrace{r_{i-1} - q_i r_i}_{\Large r_{i+1}}\, &=\, (\underbrace{a_{i-1}-q_i a_i}_{\Large a_{i+1}})\, m\, +\, (\underbrace{b_{i-1} - q_i b_i}_{\Large b_{i+1}})\, n \end{align}$$
Thus the $\,a_i,b_i\,$ satisfy the same recurrence as the remainders $\,r_i,\,$ viz. $\,f_{i+1} = f_{i-1}-q_i f_i.\,$ This implies that we can carry out the recurrence in parallel on row vectors $\,[r_i,a_i,b_i]$ representing the equation $\, r_i = a_i m + b_i n\,$ as follows
$$\begin{align} [r_{i+1},a_{i+1},b_{i+1}]\, &=\, [r_{i-1},a_{i-1},b_{i-1}] - q_i [r_i,a_i,b_i]\\ &=\, [r_{i-1},a_{i-1},b_{i-1}] - [q_i r_i,q_i a_i, q_i b_i]\\ &=\, [r_{i-1}-q_i r_i,\ a_{i-1}-q_i a_i,\ b_{i-1}-b_i r_i] \end{align}$$
which written in the tabular format employed far above becomes
$$\begin{array}{ccc} &r_{i-1}& a_{i-1} & b_{i-1}\\ &r_i& a_i &b_i\\ \rightarrow\ & \underbrace{r_{i-1}\!-q_i r_i}_{\Large r_{i+1}} &\underbrace{a_{i-1}\!-q_i a_i}_{\Large a_{i+1}}& \underbrace{b_{i-1}-q_i b_i}_{\Large b_{i+1}} \end{array}$$
Thus the extended Euclidean step is: compute the quotient $\,q_i = \lfloor r_{i-1}/r_i\rfloor$ then multiply row $i$ by $q_i$ and subtract it from row $i\!-\!1.$ Said componentwise: in each column $\,r,a,b,\,$ multiply the $i$'th entry by $q_i$ then subtract it from the $i\!-\!1$'th entry, yielding the $i\!+\!1$'th entry. If we ignore the 2nd and 3rd columns $\,a_i,b_i$ then this is the usual Euclidean algorithm. The above extends this algorithm to simultaneously compute the representation of each remainder as a linear combination of $\,m,n,\,$ starting from the obvious initial representations $\,m = 1(m)+0(n),\,$ and $\,n = 0(m)+1(n).\,$
This is more a comment on the method explained by Bill Dubuque then a proper answer in itself, but I think there is a remark so obvious that I don’t understand that it is hardly ever made in texts discussing the extended Euclidean algorithm. This is the observation that you can save yourself half of the work by computing only one of the Bezout coefficients. In other words, instead of recording for every new remainder $r_i$ a pair of coefficients $k_i,l_i$ so that $r_i=k_ia+l_ib$, you need to record only $k_i$ such that $r_i\equiv k_ia\pmod b$. Once you will have found $d=\gcd(a,b)$ and $k$ such that $d\equiv ka\pmod b$, you can then simply put $l=(d-ka)/b$ to get the other Bezout coefficient. This simplification is possible because the relation that gives the next pair of intermediate coefficients is perfectly independent for the two coefficients: say you have $$ \begin{aligned} r_i&=k_ia+l_ib\\ r_{i+1}&=k_{i+1}a+l_{i+1}b\end{aligned} $$ and Euclidean division gives $r_i=qr_{i+1}+r_{i+2}$, then in order to get $$ r_{i+2}=k_{i+2}a+l_{i+2}b $$ one can take $k_{i+2}=k_i-qk_{i+1}$ and $l_{i+2}=l_i-ql_{i+1}$, where the equation for $k_{i+2}$ does not need $l_i$ or $l_{i+1}$, so you can just forget about the $l$'s. In matrix form, the passage is from $$ \begin{pmatrix} r_i&k_i&l_i\\ r_{i+1}&k_{i+1}&l_{i+1}\end{pmatrix} \quad\text{to}\quad \begin{pmatrix} r_{i+2}&k_{i+2}&l_{i+2}\\ r_{i+1}&k_{i+1}&l_{i+1}\end{pmatrix} $$ by subtracting the second row $q$ times from the first, and it is clear that the last two columns are independent, and one might as well just keep the $r$'s and the $k$'s, passing from $$ \begin{pmatrix} r_i&k_i\\ r_{i+1}&k_{i+1}\end{pmatrix} \quad\text{to}\quad \begin{pmatrix} r_{i+2}&k_{i+2}\\ r_{i+1}&k_{i+1}\end{pmatrix} $$ instead.
A very minor drawback is that the relation $r_i=k_ia+l_ib$ that should hold for every row is maybe a wee bit easier to check by inspection than $r_i\equiv k_ia\pmod b$, so that computational errors could slip in a bit more easily. But really, I think that with some practice this method is just as safe and faster than computing both coefficients. Certainly when programming this on a computer there is no reason at all to keep track of both coefficients.
A final bonus it that in many cases where you apply the extended Euclidean algorithm you are only interested in one of the Bezout coefficients in the first place, which saves you the final step of computing the other one. One example is computing inverse modulo a prime number $p$: if you take $b=p$, and $a$ is not divisible by it, then you know beforehand that you will find $d=1$, and the coefficient $k$ such that $d\equiv ka\pmod p$ is just the inverse of $a$ modulo $p$ that you were after.
You may like to check this and this.
Also, there is a well known table method which is very easy and fast for the manual solution.