How does Cantor's diagonal argument work?
First, let me give you a proof of the following:
Let $\mathbb{N}$ be the natural numbers, $\mathbb{N}=\{1,2,3,4,5,\ldots\}$, and let $2^{\mathbb{N}}$ be the set of all binary sequences (functions from $\mathbb{N}$ to $\{0,1\}$, which can be viewed as "infinite tuples" where each entry is either $0$ or $1$).
If $f\colon\mathbb{N}\to 2^{\mathbb{N}}$ is a function, then $f$ is not surjective. That is, there exists some binary sequence $s_f$, which depends on $f$, such that $f(n)\neq s_f$ for all natural numbers $n$.
What I denote $2^{\mathbb{N}}$ is what Wikipedia calls $T$.
I will represent elements of $2^{\mathbb{N}}$ as tuples, $$(a_1,a_2,a_3,\ldots,a_n,\ldots)$$ where each $a_i$ is either $0$ or $1$; these tuples are infinite; we think of the tuple as defining a function whose value at $n$ is $a_n$, so it really corresponds to a function $\mathbb{N}\to\{0,1\}$. Two tuples are equal if and only if they are identical: that is, $$(a_1,a_2,a_3,\ldots,a_n,\ldots) = (b_1,b_2,b_3,\ldots,b_n,\ldots)\text{ if and only if } a_k=b_k\text{ for all }k.$$
Now, suppose that $f\colon\mathbb{N}\to 2^{\mathbb{N}}$ is a given function. For each natural number $n$, $f(n)$ is a tuple. Denote this tuple by $$f(n) = (a_{1n}, a_{2n}, a_{3n},\ldots,a_{kn},\ldots).$$ That is, $a_{ij}$ is the $i$th entry in $f(j)$.
I want to show that this function is not surjective. To that end, I will construct an element of $2^{\mathbb{N}}$ that is not in the image of $f$. Call this tuple $s_f = (s_1,s_2,s_3,\ldots,s_n,\ldots)$. I will now say what $s_k$ is. Define $$s_k = \left\{\begin{array}{ll} 1 &\mbox{if $a_{nn}=0$;}\\ 0 &\mbox{if $a_{nn}=1$.} \end{array}\right.$$
This defines an element of $2^{\mathbb{N}}$, because it defines an infinite tuple of $0$s and $1$s; this element depends on the $f$ we start with: if we change the $f$, the resulting $s_f$ may change; that's fine. (This is the "diagonal element").
Now, the question is whether $s_f = f(n)$ for some $n$. The answer is "no." To see this, let $n\in\mathbb{N}$ be any natural number. Then $$f(n) = (a_{1n},a_{2n},a_{3n},\ldots,a_{nn},\ldots)$$ so the $n$th entry of $f(n)$ is $a_{nn}$. If the $n$th entry of $f(n)$ is $0$, then by construction the $n$th entry of $s_f$, $s_n$ is $1$, so $f(n)\neq s_f$. If the $n$th entry of $f(n)$ is $1$, then by construction the $n$th entry of $s_f$, $s_n$, is $0$. Then $f(n)\neq s_f$ again, because they don't agree on the $n$th entry.
This means that for every $n\in\mathbb{N}$, $s_f$ cannot equal $f(n)$, because they differ in the $n$th entry. So $s_f$ is not in the image of $f$.
What we have shown is that given a function $f\colon\mathbb{N}\to 2^{\mathbb{N}}$, there is some element of $2^{\mathbb{N}}$ that is not in the image of $f$. The element depends on what $f$ is, of course; different functions will have possibly different "witnesses" to the fact that they are not surjective.
Think of the function $f$ being hauled before a judged and accused of Being Surjective; to prove its innocence, $f$ produces a witness to verify its alibi that it's not surjective; this witness is $s_f$, who can swear to the fact that $f$ is not surjective because $s_f$ demonstrates that $f$ is not surjective: $s_f$ is not in $\mathrm{Im}(f)$; if the police hauls in some other function $g$ and accuse that function of being surjective, $g$ will also have to produce a witness to verify its alibi that it isn't surjective; but that witness does not have to be the same witness that $f$ produced. The "witness" we produce will depend on who the "accused" is.
The reason this is called the "diagonal argument" or the sequence $s_f$ the "diagonal element" is that just like one can represent a function $\mathbb{N}\to \{0,1\}$ as an infinite "tuple", so one can represent a function $\mathbb{N}\to 2^{\mathbb{N}}$ as an "infinite list", by listing the image of $1$, then the image of $2$, then the image of $3$, etc: $$\begin{align*} f(1) &= (a_{11}, a_{21}, a_{31}, \ldots, a_{k1},\ldots )\\ f(2) &= (a_{12}, a_{22}, a_{32}, \ldots, a_{k2},\ldots)\\ &\vdots\\ f(m) &= (a_{1m}, a_{2m}, a_{3m},\ldots, a_{km},\ldots) \end{align*}$$ and if one imagines the function this way, then the way we construct $s_f$ is by "going down the main diagonal", looking at $a_{11}$, $a_{22}$, $a_{33}$, etc.
Now, remember the definition of "countable":
Definition. A set $X$ is said to be countable if and only if there exists a function $f\colon\mathbb{N}\to X$ that is surjective. If no such function exists, then $X$ is said to be uncountable.
That means that the theorem we proved above shows that:
Theorem. The set of all binary sequences, $2^{\mathbb{N}}$, is not countable.
Why? Because we showed that there are no surjective functions $\mathbb{N}\to 2^{\mathbb{N}}$, so it is not countable.
How does this relate to the real numbers? The real numbers are bijectable with the set $2^{\mathbb{N}}$. That is, there is a function $H\colon 2^{\mathbb{N}}$ to $\mathbb{R}$ that is both one-to-one and onto. If we had a surjection $\mathbb{N}\to\mathbb{R}$, then composing this surjection with $H$ we would get a surjection from $\mathbb{N}$ to $2^{\mathbb{N}}$, and no such surjection exists. So there can be no surjection from $\mathbb{N}$ to $\mathbb{R}$, so $\mathbb{R}$ is not countable (that is, it is uncountable).
Bijecting $\mathbb{R}$ with $2^{\mathbb{N}}$ is a bit tricky; you can first biject $\mathbb{R}$ with $[0,1]$; then you would want to use the binary representation (as in wikipedia's article), so that each sequence corresponds to a binary expansion, and each number in $[0,1]$ corresponds to a binary sequence (its digits when written in binary); the problem is that just like some numbers in decimal have two representations ($1$ and $0.999\ldots$ are equal), so do some numbers have two representations in binary (for example, $0.01$ and $0.0011111\ldots$ are equal). There is a way of fixing this problem, but it is a bit technical and may obscure the issue, so I would rather not get into it.
Instead, let me note that the set $2^{\mathbb{N}}$ can be mapped in a one-to-one way into $(0,1)$: simply take a binary sequence $$(a_1,a_2,a_3,\ldots,a_n,\ldots)$$ and map it to the decimal number that has a $5$ in the $k$th decimal position if $a_k=0$, and has a $6$ in the $k$th decimal position if $a_k=1$. Using $5$ and $6$ ensures that each number has only one decimal representation, so the map is one-to-one. Call this map $h$. Define $H\colon\mathbb{R}\to 2^{\mathbb{N}}$ as follows: given a real number $x$, if $x$ is in the image of $h$, then define $H(x)$ to be the unique sequence $s$ such that $h(s)=x$. If $x$ is not in the image of $h$, then define $H(x)$ to be the sequence $(0,0,0,\ldots,0,\ldots)$. Notice that $H$ is surjective, because $h$ is defined on all of $2^{\mathbb{N}}$.
This is enough to show that there can be no surjection from $\mathbb{N}$ to $\mathbb{R}$: suppose that $f\colon\mathbb{N}\to\mathbb{R}$ is any function. Then the function $H\circ f\colon \mathbb{N}\stackrel{f}{\to}\mathbb{R}\stackrel{H}{\to}2^{\mathbb{N}}$ is a function from $\mathbb{N}$ to $2^{\mathbb{N}}$. Since any function from $\mathbb{N}$ to $2^{\mathbb{N}}$ is not surjective, there is some $s\in 2^{\mathbb{N}}$ that is not in the image of $H\circ f$. Since $s$ is in the image of $H$, there exists some $x\in\mathbb{R}$ such that $H(x)=s$. That means that $f(n)\neq x$ for all $n$ (since $H\circ f(n)\neq s$).
Since there can be no surjection from $\mathbb{N}$ to $\mathbb{R}$, that means that $\mathbb{R}$ is uncountable.
So, as to your questions. First, you should understand that the diagonal argument is applied to a given list. You already have all of $s_1$, $s_2$, $s_3$, etc., in front of you. Nobody is allowed to change them. You construct the "diagonal number" (my $s_f$ above) on the basis of that list. Yes, if you change the list then you can put the diagonal number $s_f$ in the new list; but $s_f$ is only a witness to the fact that the original list was not a list of all sequences. If you change to a different list, then I will have to produce a different witness. The witnesses depend on the given list. You know that $s_4$ is not equal to $s_f$ because $s_f$ is constructed precisely so that it disagrees with $s_4$ in the $4$th position, and one disagreement is enough to guarantee inequality.
Wikipedia's presentation seems to argue by contradiction; I don't like to introduce that into these discussions because the argument is difficult enough to "grok" without the added complication. (The "Otherwise..." part is an argument by contradiction, arguing that if you could 'list' the elements of $T$, then you would apply the argument to show that this 'complete list' is not 'complete', etc). There's no need. Simply, there is no surjection from $\mathbb{N}$ to $T$, as discussed above.
Now, there is a common "first reaction" that this argument would apply "just as well" to the natural numbers: take a list of natural numbers listed in binary, and engineer an argument like the diagonal argument (say, by "reflecting them about the decimal point", so they go off with a tail of zeros to the right; or by writing them from left to right, with least significant digit first, instead of last) to produce a "number" not on the list. You can do that, but the problem is that natural numbers only corresponds to sequences that end with a tail of $0$s, and trying to do the diagonal argument will necessarily product a number that does not have a tail of $0$s, so that it cannot represent a natural number. The reason the diagonal argument works with binary sequences is that $s_f$ is certainly a binary sequence, as there are no restrictions on the binary sequences we are considering.
I hope this helps.
The idea, in a nutshell, is to assume by contradiction that the reals are countable. Due to that you can write them as $a_i$ for $i\in\mathbb N$. This list contains all the real numbers.
By taking the number produced by the diagonal argument you ensure at the $n$-th step that you are not any of $a_1,\ldots,a_n$. When the induction "ends" you have produced a real number which is none of the $a_i,\ i\in\mathbb N$.
This could only mean one thing: the enumeration did not capture all the real numbers like you assumed it to.
Since the enumeration was not specific, but arbitrary, we deduce that any countable enumeration of the reals will not cover the entire real numbers. This, by definition, means that the real numbers are uncountable.
Addendum:
After reading more carefully most of this thread (on its comments, and so on) I think that I should add a few words about proof by contradiction - which is a very common method in mathematics.
"Mathematics is a science of deductions", said my linear algebra prof. on the first day of my B.Sc.
"We assume A,B,C and deduce D,E,F", he continued.
The important thing about the ability to deduce things, is to remain consistent. In simple words it means not to be able to prove both something and its negation. Why is this important? Because it is simple to prove anything from contradiction, in fact according to this xkcd it is even possible to derive phone numbers (although I never managed to do that myself).
Proof by contradiction exploits the assumption that we use a consistent axiomatic system. If we assumed one thing, and derived from it contradiction then our assumption was false.
Since something is either true or false, if it is false then its negation is true.
So what did I do in the proof above? I started by assuming mathematics as we know it is consistent, then added the assumption "The reals are countable". If they are countable, then by definition we can list them as above.
The diagonal argument shows that regardless to how you are going to list them, countably many indices is not enough, and for every list we can easily manufacture a real number not present on it. From this we deduce that there are no countable lists containing all the real numbers. This is to say that the reals are not countable, which contradicts our assumption that they are countable.
I should add that some of the proofs by contradiction do not actually require the contrapositive assumption. In this case, for example, I could have said "given an arbitrary countable list of real numbers we can procure a real number not on the list" and deduce from that that the real numbers are uncountable. Sometimes, however, contradiction is slightly more essential for the proof.
The diagonal argument is best understood by first examining finite instances. Suppose $\rm\:M_{\:\!i\:j}\:$ is a $\rm\ n \times n\ $ matrix whose entries lie in a set $\rm\:T\:$ with at least two elements. Then one can construct a $\rm\ 1\times n\ $ row $\rm\:R\:$ different from any row $\rm\:M_{\:\!i}\:$ of $\rm\:M\:$ by taking the diagonal $\rm\:M_{\:\!i\: i}\:$ and changing each of its entries, viz. $\rm\:R_{\:\!i} := \neg\: A_{\:\!i\: i},\:$ where $\:\neg\:$ is any "change" function on $\rm\:T,\:$ i.e. $\rm\: \neg\: t \ne t\:$ $\rm\:\forall\:t\in T.\:$ Note that $\rm\:R\:$ is not equal to any row $\rm\:M_{\:\!i}\:$ since its $\rm\:i$'th entry differs, i.e. $\rm\:R_{\:\!i} =\: \neg\: M_{\:\!i\:i}\ne M_{\:\!i\:i}\:.\:$
Viewing each row $\rm\:M_{\:\!i}\:$ as a function $\rm\:j\mapsto M_{\:\!i\: j}\:$ from $\rm\:n = \{0\ 1\:\cdots\: n-1\}\:$ to $\rm\:T,\:$ the proof shows that there are more than $\rm\:n\:$ such functions, i.e. $\rm\:|T^{\:\!n}| > n\:$ if $\rm\:|T|\ge 2\:.\:$ Obviously the proof generalizes from $\rm\:n\:$ to any set $\rm\:S,\:$ yielding that $\rm\:|T^S| > |S|\:.\:$ This illustrates the simple essence of diagonalization - which is due not to Cantor, but to du Bois-Reymond (who used it to diagonalize on growth rates of functions, i.e. "orders of infinity").