Why are non-linear optics called non-linear?
Nonlinear optical elements are called nonlinear precisely because of the behaviour you note: the optical response of the material does not depend linearly on the driving fields. The response may then have a quadratic or higher dependence on the driver, which is usually written as a power series in the field,
$$ \mathbf P =\varepsilon_0 \chi^{(1)} \mathbf E + \varepsilon_0 \overleftrightarrow\chi^{(2)}\cdot \mathbf E^{\otimes 2} + \varepsilon_0 \overleftrightarrow\chi^{(3)}\cdot \mathbf E^{\otimes 3} + \cdots. \tag 1 $$
(Note, however, that if the intensity is too high then even this perturbative expansion may break down, as is the case in high-order harmonic generation.)
The reason nonlinear optics is usually framed in terms of frequency-mixing processes is that the higher-order powers do exactly that. For example, if you have a sinusoidal driver $E=E_0\cos(\omega t)$, then a response that depends on $E^2$ will introduce other frequencies, since $$ E^2=\tfrac{E_0^2}{2}(1+\cos(2\omega t)). \tag2 $$ The first term is known as optical rectification, and the second term is second harmonic generation. Terms of higher order can produce further mixing of components.
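Equation (2) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `spectrum_peaks` and the values of `f0` and `E0` are illustrative choices, not part of the original argument. It samples a cosine driver over a whole number of periods and lists the frequency components present in $E$ and in $E^2$.

```python
import numpy as np

def spectrum_peaks(signal, dt, threshold=1e-6):
    """Return {frequency: amplitude} for the non-negligible components of a real signal."""
    n = len(signal)
    amps = np.abs(np.fft.rfft(signal)) / n
    amps[1:] *= 2  # fold the negative-frequency half into the positive one
    freqs = np.fft.rfftfreq(n, d=dt)
    return {
        round(float(f), 6): round(float(a), 6)
        for f, a in zip(freqs, amps)
        if a > threshold
    }

f0 = 5.0                 # driver frequency, arbitrary units (illustrative)
E0 = 2.0                 # driver amplitude, arbitrary units (illustrative)
n_samples = 1000
dt = 1.0 / n_samples     # total window of 1 time unit = 5 full periods
t = np.arange(n_samples) * dt
E = E0 * np.cos(2 * np.pi * f0 * t)

linear = spectrum_peaks(E, dt)        # spectrum of the driver itself
quadratic = spectrum_peaks(E**2, dt)  # spectrum of a response proportional to E^2

print("E   :", linear)
print("E^2 :", quadratic)
```

Under these assumptions the driver should show a single component at `f0`, while its square should show a constant (zero-frequency) component and a component at `2 * f0`, each with amplitude `E0**2 / 2`, matching the two terms in (2).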
It is important to contrast this with linear optics, for which each frequency component is on its own. Linear optical elements will never add a frequency component that is not already there, and they will never modify one frequency based on what's happening with another one. (You might even call linear optics boring.) Nonlinear optics allows us to break free from that, which is why so much of the field is focused on the frequency-mixing characteristics of the different processes.
So, you have two different approaches for understanding the field: in terms of the nonlinear order of the term involved, or in terms of the ways it can mix frequencies. The photon picture arises as an amalgam of these two, and it emerges by doing a Feynman-style diagrammatic expansion of the terms in the perturbation series.
It is important to note that this 'photon' picture does not require the field to be quantized to work, and it is equally applicable to a classical field. When you do go to a quantized field, however, the positive/negative frequency components in expansions of the form $e^{-i\omega t}+e^{+i\omega t}$ get replaced by quadratures of the form $\hat a + \hat a^\dagger$, each of which adds or subtracts a photon from the field. If you have a higher power of $(e^{-i\omega t}+e^{+i\omega t})$ then you have a bigger product with more operators and therefore more photons in the interaction.
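As a worked illustration of that counting, here is the classical cubic case, written out as a sketch with the shorthand $a=e^{-i\omega t}$ (a label introduced here only for brevity):
$$ (a+a^*)^3 = a^3 + 3a^2a^* + 3a\,{a^*}^2 + {a^*}^3 = e^{-3i\omega t} + 3e^{-i\omega t} + 3e^{+i\omega t} + e^{+3i\omega t}, $$
or equivalently
$$ \cos^3(\omega t) = \tfrac14\cos(3\omega t) + \tfrac34\cos(\omega t). $$
Every term is a product of three factors. The terms at $\pm3\omega$ use three factors of the same sign, which is third-harmonic generation, while the terms at $\pm\omega$ combine two factors of one sign with one of the other and so feed back into the driving frequency itself. In the quantized picture each factor becomes one creation or annihilation operator, which is where the three-photon counting comes from.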
Non-linearity means that the response of the material, for example its polarization, is no longer linear in the applied field. Linearity is an assumption which only holds true for low intensities. Almost every material shows some non-linear effects if the light source is powerful enough. The polarization, for example, becomes:
$P = P_0 + \varepsilon_0 \chi^{(1)} E + \varepsilon_0 \chi^{(2)} E^2 + \varepsilon_0 \chi^{(3)} E^3 + \cdots. $
This can lead to self-induced effects like the Kerr effect, where the refractive index becomes a function of the intensity of the light itself: $n_{\mathrm{Kerr}}=n_0+n_1(I)$.
An application for this would be mode locking in a Ti-sapphire laser. I highly recommend Chapter 19 of Saleh and Teich, Fundamentals of Photonics, on this topic.
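The intensity dependence behind the Kerr effect can be made concrete with a small numerical sketch. It assumes a scalar response $P/\varepsilon_0 = \chi^{(1)}E + \chi^{(3)}E^3$ with no $\chi^{(2)}$ term, and the values of `chi1` and `chi3` are hypothetical numbers in arbitrary units, not material data. The function `fundamental_coefficient` (a name chosen here) projects the response onto $\cos(\omega t)$ to find the effective susceptibility seen at the driving frequency.

```python
import numpy as np

def fundamental_coefficient(chi1, chi3, E0, n_samples=4096):
    """Effective susceptibility at the driving frequency for P/eps0 = chi1*E + chi3*E**3.

    The response to E = E0*cos(phase) is projected onto cos(phase) over one
    full period, and the result is divided by E0.
    """
    phase = 2 * np.pi * np.arange(n_samples) / n_samples
    E = E0 * np.cos(phase)
    P_over_eps0 = chi1 * E + chi3 * E**3
    # Fourier cosine coefficient of the response at the fundamental frequency
    coeff = 2 * np.mean(P_over_eps0 * np.cos(phase))
    return coeff / E0

chi1, chi3 = 1.0, 0.1  # hypothetical susceptibilities, arbitrary units

for E0 in (0.1, 1.0, 2.0):
    chi_eff = fundamental_coefficient(chi1, chi3, E0)
    expected = chi1 + 0.75 * chi3 * E0**2  # from cos^3 = (3/4)cos + (1/4)cos(3x)
    print(f"E0={E0}: chi_eff={chi_eff:.6f}, chi1 + (3/4) chi3 E0^2 = {expected:.6f}")
```

In this simplified model the effective susceptibility at $\omega$ grows with $E_0^2$, that is, with the intensity, which is the origin of an intensity-dependent refractive index.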