Why use the Kronecker product?
It is often very useful when one is solving an equation or optimizing a function where the unknown is a matrix. This is because of the following relationship between the Kronecker product $\otimes$ and the vectorization operator $\operatorname{vec}(\cdot)$, which takes a matrix and unwinds it, column by column, into a long vector: $$\operatorname{vec}(\underbrace{AXB}_{\text{matrices}}) = \underbrace{(B^T \otimes A)}_{\text{matrix}}\underbrace{\operatorname{vec}(X)}_\text{vector}$$ For example, if you want to solve the matrix equation $$AXB + X = C,$$ you can convert it to the following linear system: $$(B^T \otimes A + I)\operatorname{vec}(X) = \operatorname{vec}(C).$$ More generally, matrices have both multiplicative (operator) and additive (vector space) structures, and the combination of Kronecker products and vectorization provides the algebraic framework for converting back and forth between these contexts.
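As a rough numerical illustration of the identity and of the linear system above, here is a minimal NumPy sketch. The helper names `vec` and `solve_axb_plus_x` are made up for this example, the matrices are random test data, and $\operatorname{vec}$ is assumed to stack columns (hence `order="F"`).

```python
import numpy as np

def vec(M):
    # Column-stacking vectorization (column-major order), the convention
    # under which vec(AXB) = (B^T kron A) vec(X) holds.
    return M.reshape(-1, order="F")

def solve_axb_plus_x(A, B, C):
    # Hypothetical helper: solve A X B + X = C by forming the dense
    # system (B^T kron A + I) vec(X) = vec(C). Assumes that system is nonsingular.
    n, m = C.shape
    K = np.kron(B.T, A) + np.eye(n * m)
    x = np.linalg.solve(K, vec(C))
    return x.reshape((n, m), order="F")

rng = np.random.default_rng(0)
# Scaled down so that B^T kron A + I stays comfortably invertible.
A = 0.1 * rng.standard_normal((3, 3))
B = 0.1 * rng.standard_normal((4, 4))
X = rng.standard_normal((3, 4))

# Check the vec/Kronecker identity numerically.
identity_holds = np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))

# Build C from a known X, then recover X from the linear system.
C = A @ X @ B + X
X_recovered = solve_axb_plus_x(A, B, C)
recovered = np.allclose(X_recovered, X)
```

Note that the Kronecker matrix has size $nm \times nm$ for an $n \times m$ unknown, so forming it explicitly like this is only sensible for small problems.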
The Kronecker product is relied on incessantly in the study of the distributions of test statistics in ANOVA and the design of experiments. It is used constantly, in a plethora of ways, in the theory of the Wishart distribution. I am rusty in those topics. I have also seen it used in biochemistry, though I never went through all the details there.