How to show that $\det(AB) =\det(A) \det(B)$?
Let's consider the function $B\mapsto \det(AB)$ as a function of the columns of $B=\left(v_1|\cdots |v_i| \cdots | v_n\right)$. It is straightforward to verify that this map is multilinear, in the sense that $$\det\left(A\left(v_1|\cdots |v_i+av_i'| \cdots | v_n\right)\right)=\det\left(A\left(v_1|\cdots |v_i| \cdots | v_n\right)\right)+a\det\left(A\left(v_1|\cdots |v_i'| \cdots | v_n\right)\right).$$ It is also alternating, in the sense that if you swap two columns of $B$, you multiply your overall result by $-1$. Both properties follow directly from the corresponding properties of the function $A\mapsto \det(A)$.
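(Not part of the argument, but if you want a quick sanity check of these two properties, here is a small sketch using SymPy, assuming it is available; the symbol names and the choice $n=3$ are mine.)

```python
import sympy as sp

# Generic 3x3 matrix A, generic columns v1, v2, v2', v3, and a scalar t
A = sp.Matrix(3, 3, sp.symbols('a:3:3'))
v1, v2, v2p, v3 = (sp.Matrix(3, 1, sp.symbols(name + ':3'))
                   for name in ('u', 'v', 'w', 'x'))
t = sp.Symbol('t')

def cols(c1, c2, c3):
    # build B from its columns
    return sp.Matrix.hstack(c1, c2, c3)

# Multilinearity in the second column of B:
lhs = (A * cols(v1, v2 + t * v2p, v3)).det()
rhs = (A * cols(v1, v2, v3)).det() + t * (A * cols(v1, v2p, v3)).det()
print(sp.expand(lhs - rhs))                                   # 0

# Alternating: swapping two columns of B flips the sign:
print(sp.expand((A * cols(v2, v1, v3)).det()
                + (A * cols(v1, v2, v3)).det()))              # 0
```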
The determinant is completely characterized by these two properties together with the fact that $\det(I)=1$; indeed, any function of the columns that is multilinear and alternating must be a multiple of the determinant. If you have not seen this fact, you should try to prove it. I don't know of a reference online, but I know it is contained in Bretscher's linear algebra book.
In any case, because of this fact, we must have $\det(AB)=c\det(B)$ for some constant $c$ depending only on $A$, and setting $B=I$ shows that $c=\det(A)$.
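(Again just a sanity check, not a proof: a rough numerical sketch with NumPy, with randomly chosen $4\times4$ matrices of my own, showing that the constant $c=\det(AB)/\det(B)$ does not depend on $B$ and equals $\det(A)$.)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

for _ in range(3):
    B = rng.standard_normal((4, 4))
    c = np.linalg.det(A @ B) / np.linalg.det(B)   # the constant c for this B
    print(np.isclose(c, np.linalg.det(A)))        # True each time
```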
The proof using elementary matrices can be found e.g. on ProofWiki. It is basically the same proof as the one given in Jyrki Lahtonen's comment and in Chandrasekhar's link.
There is also a proof using block matrices; I googled a bit and was only able to find it in this book and this paper.
I like the approach which I learned from Sheldon Axler's Linear Algebra Done Right, Theorem 10.31. Let me try to reproduce the proof here.
We will use several results in the proof, one of which is, as far as I can tell, a little less well known. It is the theorem which says that if two matrices $A$ and $B$ differ only in the $k$-th row, and $C$ is the matrix whose $k$-th row is the sum of the $k$-th rows of $A$ and $B$ and whose other rows agree with those of $A$ and $B$, then $|C|=|A|+|B|$.
Geometrically, this corresponds to adding two parallelepipeds with the same base.
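For instance, in the $2\times 2$ case, with the two matrices differing in the first row, $$\begin{vmatrix} a+a' & b+b' \\ c & d \end{vmatrix} =(a+a')d-(b+b')c =(ad-bc)+(a'd-b'c) =\begin{vmatrix} a & b \\ c & d \end{vmatrix}+\begin{vmatrix} a' & b' \\ c & d \end{vmatrix}.$$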
Proof. Let us denote the rows of $A$ by $\vec\alpha_1,\ldots,\vec\alpha_n$. Thus $$A= \begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n}\\ a_{21} & a_{22}& \ldots & a_{2n}\\ \vdots & \vdots& \ddots & \vdots \\ a_{n1} & a_{n2}& \ldots & a_{nn} \end{pmatrix}= \begin{pmatrix} \vec\alpha_1 \\ \vec\alpha_2 \\ \vdots \\ \vec\alpha_n \end{pmatrix}$$
Directly from the definition of matrix product we can see that the rows of $A.B$ are of the form $\vec\alpha_kB$, i.e., $$A.B=\begin{pmatrix} \vec\alpha_1B \\ \vec\alpha_2B \\ \vdots \\ \vec\alpha_nB \end{pmatrix}$$ Since $\vec\alpha_k=\sum_{i=1}^n a_{ki}\vec e_i$, we can rewrite this equality as $$A.B=\begin{pmatrix} \sum_{i_1=1}^n a_{1i_1}\vec e_{i_1} B\\ \vdots\\ \sum_{i_n=1}^n a_{ni_n}\vec e_{i_n} B \end{pmatrix}$$ Using the theorem on sum of determinants multiple times we get $$ |{A.B}|= \sum_{i_1=1}^n a_{1i_1} \begin{vmatrix} \vec e_{i_1}B\\ \sum_{i_2=1}^n a_{2i_2}\vec e_{i_2} B\\ \vdots\\ \sum_{i_n=1}^n a_{ni_n}\vec e_{i_n} B \end{vmatrix}= \ldots = \sum_{i_1=1}^n \ldots \sum_{i_n=1}^n a_{1i_1} a_{2i_2} \dots a_{ni_n} \begin{vmatrix} \vec e_{i_1} B \\ \vec e_{i_2} B \\ \vdots \\ \vec e_{i_n} B \end{vmatrix} $$
Now notice that if $i_j=i_k$ for some $j\ne k$, then the corresponding determinant in the above sum is zero (it has two identical rows). Thus the only nonzero summands are those for which the $n$-tuple $(i_1,i_2,\dots,i_n)$ is a permutation of the numbers $1,\ldots,n$. Thus we get $$|{A.B}|=\sum_{\varphi\in S_n} a_{1\varphi(1)} a_{2\varphi(2)} \dots a_{n\varphi(n)} \begin{vmatrix} \vec e_{\varphi(1)} B \\ \vec e_{\varphi(2)} B \\ \vdots \\ \vec e_{\varphi(n)} B \end{vmatrix}$$ (Here $S_n$ denotes the set of all permutations of $\{1,2,\dots,n\}$.) The matrix on the RHS of the above equality is the matrix $B$ with permuted rows. Using several transpositions of rows we can recover the matrix $B$. We will show that this can be done using $i(\varphi)$ transpositions, where $i(\varphi)$ denotes the number of inversions of $\varphi$. Using this fact we get $$|{A.B}|=\sum_{\varphi\in S_n} a_{1\varphi(1)} a_{2\varphi(2)} \dots a_{n\varphi(n)} (-1)^{i(\varphi)} |{B}| =|A|.|B|.$$
It remains to show that we need $i(\varphi)$ transpositions. We can transform the "permuted matrix" into the matrix $B$ as follows: we first move the first row of $B$ into the first position by exchanging it with the preceding row until it is in the correct position. (If it is already in the first position, we make no exchanges at all.) The number of transpositions used is exactly the number of inversions of $\varphi$ that involve the number 1. Now we move the second row into the second position in the same way; this uses as many transpositions as there are inversions of $\varphi$ involving 2 but not 1 (since the first row is already in place). Continuing in this way, we obtain the matrix $B$ after exactly $i(\varphi)$ row transpositions.
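(If you want to check the final formula on a small example, here is a quick sketch of my own using SymPy, comparing the permutation sum above with $|A.B|$ for generic $3\times3$ matrices.)

```python
import itertools
import sympy as sp

n = 3
A = sp.Matrix(n, n, sp.symbols('a:3:3'))
B = sp.Matrix(n, n, sp.symbols('b:3:3'))

def inversions(phi):
    # number of pairs j < k with phi(j) > phi(k)
    return sum(1 for j in range(n) for k in range(j + 1, n) if phi[j] > phi[k])

# sum over permutations phi of a_{1,phi(1)} ... a_{n,phi(n)} (-1)^{i(phi)} |B|
rhs = sum((-1) ** inversions(phi)
          * sp.Mul(*(A[r, phi[r]] for r in range(n)))
          * B.det()
          for phi in itertools.permutations(range(n)))

print(sp.expand((A * B).det() - rhs))   # 0, i.e. |A.B| equals the sum above
```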
Let $K$ be the ground ring. The statement holds
(a) when $B$ is diagonal,
(b) when $B$ is strictly triangular,
(c) when $B$ is triangular (by (a) and (b)),
(d) when $A$ and $B$ have rational entries and $K$ is an extension of $\mathbb Q$ containing the eigenvalues of $B$ (by (c)),
(e) when $K=\mathbb Q$ (by (d)),
(f) when $K=\mathbb Z[a_{11},\dots,a_{nn},b_{11},\dots,b_{nn}]$, where the $a_{ij}$ and $b_{ij}$ are respectively the entries of $A$ and $B$, and are indeterminate (by (e)),
(g) always (by (f)).
The reader who knows what the discriminant of a polynomial in $\mathbb Q[X]$ is can skip (b) and (c).
Reference: this MathOverflow answer of Bill Dubuque.
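To see step (f) concretely for a small $n$: treating the entries of $A$ and $B$ as indeterminates, the difference $\det(AB)-\det(A)\det(B)$ expands to the zero polynomial. A minimal sketch with SymPy (the choice of tool and of $n=2$ is mine):

```python
import sympy as sp

n = 2
# Work in Z[a11, ..., a22, b11, ..., b22]: all entries are indeterminates
A = sp.Matrix(n, n, sp.symbols('a:2:2'))
B = sp.Matrix(n, n, sp.symbols('b:2:2'))

# The identity holds in this polynomial ring, hence in every commutative ring
print(sp.expand((A * B).det() - A.det() * B.det()))   # 0
```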
EDIT 1. The principle underlying the above argument has various names. Bill Dubuque calls it the "universality" principle. Michael Artin calls it "The Principle of Permanence of Identities". The section of Algebra with this title can be viewed here. I strongly suggest reading this section to those who are not familiar with it. It is an interesting coincidence that the illustration chosen by Artin is precisely the multiplicativity of determinants.
Another highly important application is the proof of the Cayley-Hamilton Theorem. I will not give it here, but I will digress on another point. That is, I will try to explain why
(*) it suffices to prove Cayley-Hamilton or the multiplicativity of determinants in the diagonal case.
Suppose we have a polynomial map $f:M_n(\mathbb Z)\to\mathbb Z$. Then $f$ is given by a unique element, again denoted $f$, of $\mathbb Z[a_{11},\dots,a_{nn}]$, where the $a_{ij}$ are indeterminates (because $\mathbb Z$ is an infinite domain). As a result, given any $A$ in $M_n(K)$ for any commutative ring $K$, we can define $f_K(A)$ by mapping the indeterminate $a_{ij}$ to the corresponding entry of $A$. That is the Principle of Permanence of Identities. The key to prove (*) will be:
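As a toy illustration of this principle (my own, with SymPy): for $n=2$ the polynomial map $f=\det$ is the element $a_{11}a_{22}-a_{12}a_{21}$ of $\mathbb Z[a_{11},\dots,a_{22}]$, and $f_K$ is obtained by simply substituting entries from $K$, say $K=\mathbb Z/6\mathbb Z$.

```python
import sympy as sp

a11, a12, a21, a22 = sp.symbols('a11 a12 a21 a22')
f = a11 * a22 - a12 * a21          # f = det as an element of Z[a11, ..., a22]

# Evaluate f_K on a matrix over K = Z/6Z by substituting entries and reducing
entries = {a11: 5, a12: 4, a21: 3, a22: 2}
print(f.subs(entries) % 6)         # 4, i.e. the determinant computed in Z/6Z
```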
LEMMA 1. Let $f:M_n(\mathbb Z)\to\mathbb Z$ be a polynomial map vanishing on the diagonalizable matrices. Then $f$ vanishes on all matrices.
There are at least two ways to prove this. The reader will perhaps prefer the first one, but (IMHO) the second one is better.
First way: It suffices to prove that the polynomial map $f_{\mathbb C}:M_n(\mathbb C)\to\mathbb C$ is zero. Thus it suffices to prove that the diagonalizable matrices are dense in $M_n(\mathbb C)$. But this is clear since any $A\in M_n(\mathbb C)$ is similar to a triangular matrix $T$, and the diagonal entries of $T$ (which are the eigenvalues of $A$) can be made all distinct by adding an arbitrarily small diagonal matrix.
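To make the density claim concrete, here is a small numerical illustration of my own with NumPy: a Jordan block is not diagonalizable, but adding an arbitrarily small diagonal matrix with distinct entries already separates the eigenvalues (I take the perturbation to be $10^{-3}$ only so that the output is readable).

```python
import numpy as np

# A 3x3 Jordan block with eigenvalue 2: not diagonalizable
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])

# Perturb the diagonal by small, pairwise distinct amounts
eps = 1e-3
print(np.linalg.eigvals(J + np.diag([0.0, eps, 2 * eps])))
# ~ [2.  2.001  2.002]  -- three distinct eigenvalues, hence diagonalizable
```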
Second way. Consider again the ring $R:=\mathbb Z[a_{11},\dots,a_{nn}]$, where the $a_{ij}$ are indeterminates. Let $A$ in $M_n(R)$ be the matrix whose $(i,j)$ entry is $a_{ij}$. Let $\chi\in R[X]$ be the characteristic polynomial of $A$, and let $u_1,\dots,u_n$ be the roots of $\chi$ (in some extension of the fraction field of $R$).
LEMMA 2. The expression $$\prod_{i < j}\ (u_i-u_j)^2$$ defines a unique nonzero element $d\in R$, called the discriminant of $\chi$.
Lemma 2 implies Lemma 1: indeed, $f$ vanishes on the diagonalizable matrices and $d$ vanishes on the non-diagonalizable ones (their characteristic polynomial has a repeated root), so $fd$ vanishes on all matrices and is therefore zero; since $R$ is a domain and $d\neq0$, we get $f=0$.
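For $n=2$, for example, $\chi=X^2-(a_{11}+a_{22})X+(a_{11}a_{22}-a_{12}a_{21})$, and $$d=(u_1-u_2)^2=(u_1+u_2)^2-4u_1u_2=(a_{11}+a_{22})^2-4(a_{11}a_{22}-a_{12}a_{21}),$$ visibly an element of $R$, which vanishes exactly when the two eigenvalues coincide.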
Lemma 2 is a particular case of a theorem which says that, given any monic polynomial $g$ in one indeterminate with coefficients in a field, any polynomial in the roots of $g$ which is invariant under permutation of the roots is a polynomial in the coefficients of $g$. More precisely:
Let $A$ be a commutative ring, let $X_1,\dots,X_n,T$ be indeterminates, and let $s_i$ be the degree $i$ elementary symmetric polynomial in $X_1,\dots,X_n$. Recall that the $s_i$ are defined by $$ f(T):=(T-X_1)\cdots(T-X_n)=T^n+\sum_{i=1}^n\ (-1)^i\ s_i\ T^{n-i}. $$ We abbreviate $X_1,\dots,X_n$ by $X_\bullet$, and $s_1,\dots,s_n$ by $s_\bullet$. Let $G$ be the group of permutations of the $X_i$, and $A[X_\bullet]^G\subset A[X_\bullet]$ the fixed ring. For $\alpha\in\mathbb N^n$ put $$ X^\alpha:=X_1^{\alpha_1}\cdots X_n^{\alpha_n},\quad s^\alpha:=s_1^{\alpha_1}\cdots s_n^{\alpha_n}. $$ Write $\Gamma$ for the set of those $\alpha\in\mathbb N^n$ which satisfy $\alpha_i<i$ for all $i$, and put $$ X^\Gamma:=\{X^\alpha\ |\ \alpha\in\Gamma\}. $$
FUNDAMENTAL THEOREM OF SYMMETRIC POLYNOMIALS. The $s_i$ generate the $A$-algebra $A[X_\bullet]^G$.
PROOF. Observe that the map $u:\mathbb N^n\to\mathbb N^n$ defined by $$ u(\alpha)_i:=\alpha_i+\cdots+\alpha_n $$ is injective. Order $\mathbb N^n$ lexicographically, note that the leading term of $s^\alpha$ is $X^{u(\alpha)}$, and argue by induction on the lexicographical ordering of $\mathbb N^n$.
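To see the induction at work, take $n=2$ and the symmetric polynomial $X_1^2+X_2^2$. Its lexicographic leading term is $X_1^2=X^{(2,0)}=X^{u(\alpha)}$ with $\alpha=(2,0)$, so we subtract $s^\alpha=s_1^2$ and get $$X_1^2+X_2^2-s_1^2=-2X_1X_2=-2s_2,$$ that is, $X_1^2+X_2^2=s_1^2-2s_2$.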
EDIT 2.
Polynomial Identities
Michael Artin writes:
It is possible to formalize the above discussion and to prove a precise theorem concerning the validity of identities in an arbitrary ring. However, even mathematicians occasionally feel that it isn't worthwhile making a precise formulation---that it is easier to consider each case as it comes along. This is one of those occasions.
I'll disobey and make a precise formulation (taken from Bourbaki). If $A$ is a commutative ring and $T_1,\dots,T_k$ are indeterminates, let us denote the obvious morphism from $\mathbb Z[T_1,\dots,T_k]$ to $A[T_1,\dots,T_k]$ by $f\mapsto\overline f$.
Let $X_1,\dots,X_m,Y_1,\dots,Y_n$ be indeterminates.
Let $f_1,\dots,f_n$ be in $\mathbb Z[X_1,\dots,X_m]$.
Let $g$ be in $\mathbb Z[Y_1,\dots,Y_n]$.
The expression $g(f_1,\dots,f_n)$ then denotes a well-defined polynomial in $\mathbb Z[X_1,\dots,X_m]$.
If this polynomial is the zero polynomial, say that $(f_1,\dots,f_n,g)$ is an $(m,n)$-polynomial identity.
The "theorem" is this:
If $(f_1,\dots,f_n,g)$ is an $(m,n)$-polynomial identity, and if $x_1,\dots,x_m$ are in $A$, where $A$ is any commutative ring, then $$g(f_1(x_1,\dots,x_m),\dots,f_n(x_1,\dots,x_m))=0.$$
Exercise: Is $$(X_1^3-X_2^3,X_1-X_2,X_1^2+X_1X_2+X_2^2,Y_1-Y_2Y_3)$$ a $(2,3)$-polynomial identity?
Clearly, the multiplicativity of determinants and the Cayley-Hamilton can be expressed in terms of polynomial identities in the above sense.
Exterior Algebras
To prove the multiplicativity of determinants, one can also proceed as follows.
Let $A$ be a commutative ring and $M$ an $A$-module. One can show that there is an $A$-algebra $\wedge(M)$, called the exterior algebra of $M$ [here "algebra" means "not necessarily commutative algebra"], and an $A$-linear map $e_M$ from $M$ to $\wedge(M)$ having the following property:
For every $A$-linear map $f$ from $M$ to an $A$-algebra $B$ satisfying $f(x)^2=0$ for all $x$ in $M$, there is a unique $A$-algebra morphism $F$ from $\wedge(M)$ to $B$ such that $F\circ e_M=f$.
One can prove $e_M(x)^2=0$ for all $x$ in $M$. This easily implies that $\wedge$ is a functor from $A$-modules to $A$-algebras.
Let $\wedge^n(M)$ be the submodule of $\wedge(M)$ generated by the $e_M(x_1)\cdots e_M(x_n)$, where the $x_i$ run over $M$. Then $\wedge^n$ is a functor from $A$-modules to $A$-modules.
One can show that the $A$-module $\wedge^n(A^n)$ is free of rank one, i.e. isomorphic to $A$. For any endomorphism $f$ of $A^n$, the induced endomorphism $\wedge^n(f)$ of $\wedge^n(A^n)\cong A$ is therefore multiplication by a unique scalar, and one defines $\det(f)$ to be this scalar. The multiplicativity is then obvious, since $\wedge^n(f\circ g)=\wedge^n(f)\circ\wedge^n(g)$.
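For example, for $n=2$, if $f$ has matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in the standard basis $\epsilon_1,\epsilon_2$ of $A^2$, then, writing $x\wedge y$ for $e_M(x)e_M(y)$, $$f(\epsilon_1)\wedge f(\epsilon_2)=(a\epsilon_1+c\epsilon_2)\wedge(b\epsilon_1+d\epsilon_2)=(ad-bc)\ \epsilon_1\wedge\epsilon_2,$$ using $\epsilon_i\wedge\epsilon_i=0$ and $\epsilon_2\wedge\epsilon_1=-\epsilon_1\wedge\epsilon_2$; so $\wedge^2(f)$ acts on the rank-one module $\wedge^2(A^2)$ as multiplication by $ad-bc$.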