Why does light of high frequency appear violet?
In the 19th century, the physicists Young and Helmholtz proposed a trichromatic theory of color, in which the eye was modeled as three filters with overlapping ranges. This is essentially a physical model of the pigments in the eye, and it predicts the response of the nerve cells at the retina. Helmholtz did related work on sound and timbre. Hering proposed a significant modification to the trichromatic theory, called opponent processing, in the late 19th century, and Hurvich and Jameson put it on a quantitative footing in the 1950s. This models a later stage in the processing of the signals, after the retinal response but before the more sophisticated stages of processing in the brain. Both the trichromatic model and opponent processing are needed in order to describe certain phenomena in human color perception.
The complete theory can be modeled by two functions depending on wavelength. I'll call these $RG(\lambda)$ and $BY(\lambda)$. These functions are drawn here. They both oscillate between positive and negative values. For any given pure wavelength $\lambda$, the net result of pigment-filtering plus the later neurological processing produces these two numbers, which can be thought of as the final signals that go on to later processing in the brain. I'm calling them $RG$ and $BY$ for the following reasons. Let's pretend, for the sake of simplicity, that these functions oscillated between $-1$ and $+1$. Then the pair $(RG,BY)=(1,0)$ produces the sensation of red, $(-1,0)$ is green, $(0,1)$ is blue, and $(0,-1)$ is yellow. There is various psychological evidence for this model, e.g., no color is perceived as reddish-green or yellowish-blue. Roughly speaking, what seems to be happening is that the eye-brain system is taking differences between signal levels of different cone cells. This makes sense because, for example, the red and green pigments have response curves that overlap a lot, so if you want to place a pure-wavelength color on the spectrum, the difference between them is a more direct measure of what you want to know than the individual signals.
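To make the idea of "taking differences" concrete, here is a minimal Python sketch. The weights in `opponent_signals` are an assumption chosen purely for illustration (they are not measured values), and both function names are hypothetical. `hue_label` just picks whichever of the four idealized pairs above lies closest in direction to a given $(RG,BY)$ pair.

```python
# Toy model only: the weights below are illustrative assumptions,
# not physiological measurements.

# Idealized (RG, BY) pairs for the four unique hues, as in the text.
HUES = {(1, 0): "red", (-1, 0): "green", (0, 1): "blue", (0, -1): "yellow"}


def opponent_signals(l, m, s):
    """Turn three cone signals into two difference signals (assumed weights)."""
    rg = l - m                # "red" cones versus "green" cones
    by = s - 0.5 * (l + m)    # "blue" cones versus the average of the other two
    return rg, by


def hue_label(rg, by):
    """Return the unique hue whose idealized pair best aligns with (rg, by)."""
    best = max(HUES, key=lambda p: p[0] * rg + p[1] * by)
    return HUES[best]
```

For example, equal stimulation of the red and green cones with no blue-cone response, `opponent_signals(0.5, 0.5, 0.0)`, gives an $RG$ of zero and a negative $BY$, which `hue_label` reports as yellow.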
The $RG$ function actually has two different peaks, one at the red end of the spectrum and one, surprisingly, at the blue end. This implies that by mixing blue and red, you can produce an $(RG,BY)$ pair similar to what you would have gotten with monochromatic violet. If you look at other sources, e.g., this one (figure 3.3), they seem to agree on the secondary short-wavelength peak of the $RG$ function, but the details of how the two functions are drawn at the short wavelengths are different and seem to make for a less convincing explanation of the observed perceptual similarity between violet and a red-blue mixture.
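As a rough illustration of the mixing argument, suppose (and this is an assumption, not something the graphs establish) that the $(RG,BY)$ signals of a mixture are just the weighted sums of the signals of its components. Then finding a red-blue mixture that mimics violet amounts to solving two linear equations. The numerical values below are hypothetical, and `mix_weights` is a name I made up for the sketch.

```python
def mix_weights(red, blue, target):
    """Solve a*red + b*blue = target for (a, b) using Cramer's rule.

    Each argument is an (RG, BY) pair. Linear additivity of the
    signals is assumed.
    """
    det = red[0] * blue[1] - blue[0] * red[1]
    if det == 0:
        raise ValueError("the two lights are not independent in (RG, BY)")
    a = (target[0] * blue[1] - blue[0] * target[1]) / det
    b = (red[0] * target[1] - target[0] * red[1]) / det
    return a, b


# Hypothetical values: idealized red and blue, and a violet whose RG
# value is positive because of the short-wavelength RG peak.
RED = (1.0, 0.0)
BLUE = (0.0, 1.0)
VIOLET = (0.25, 0.75)

a, b = mix_weights(RED, BLUE, VIOLET)
```

Both weights come out positive only because violet's $RG$ value is positive. If the $RG$ function were zero or negative at short wavelengths, no physically realizable mixture of red and blue (with nonnegative amounts of each) could match it in this toy model.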
I don't know if there is a valid reductionist explanation of the short-wavelength peak of the $RG$ function. Like a lot of things produced by evolution, it may basically be an accident that got frozen in. However, it's possible that it serves the evolutionary purpose of helping us to distinguish different shades of blue and violet. If the $RG$ function were simply zero over the whole short-wavelength end of the spectrum, then the $BY$ function would be the only information we'd get for those wavelengths. But the $BY$ function has a maximum, simply because the eye's sensitivity to light fades out as you get into the UV. Near this maximum, the slope of the $BY$ function goes to zero, so its ability to discriminate between neighboring wavelengths vanishes. In the York University graph, it appears that the short-wavelength extrema of the $RG$ and $BY$ functions are offset from one another, which would allow some color discrimination in this region. The physical information being preserved by the $BY$ function would then be the difference in response between the blue and green cones. But the Briggs graphs don't appear to show any such offset of the extrema, so it's possible that the explanation I'm giving is a bogus "just-so story."
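The argument about the extrema can be checked numerically with toy curves. In the sketch below, the Gaussian shape, the 30 nm width, and the peak positions are all assumptions made up for illustration, and none of them are read off the actual graphs. The quantity `discrimination` is the rate at which the $(RG,BY)$ pair changes with wavelength, which I'm using as a crude stand-in for the ability to tell neighboring wavelengths apart.

```python
import math


def bump(lam, center, width=30.0):
    """Hypothetical smooth response curve peaking at `center` (nm)."""
    return math.exp(-((lam - center) / width) ** 2)


def slope(f, lam, h=1e-3):
    """Central-difference estimate of df/dlambda."""
    return (f(lam + h) - f(lam - h)) / (2 * h)


def discrimination(lam, rg_center, by_center):
    """Magnitude of the change in (RG, BY) per nm of wavelength."""
    d_rg = slope(lambda x: bump(x, rg_center), lam)
    d_by = slope(lambda x: bump(x, by_center), lam)
    return math.hypot(d_rg, d_by)


BY_PEAK = 440.0  # assumed location of the BY maximum, in nm

# Extrema coincide: both slopes vanish at the peak, so nothing changes.
aligned = discrimination(BY_PEAK, rg_center=440.0, by_center=BY_PEAK)

# Extrema offset by 20 nm: RG still has a slope where BY is flat.
offset = discrimination(BY_PEAK, rg_center=420.0, by_center=BY_PEAK)
```

With coinciding peaks, `aligned` is zero to within rounding error, while `offset` is clearly nonzero. That is the whole content of the argument: an offset between the extrema means the two functions are never flat at the same wavelength.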
There may be a good analogy here with sound. The sound spectrum is linear, but there is a psychological phenomenon of octave identification, which makes the spectrum "wrap around," so that frequencies $f$ and $2f$ are perceptually similar and can often be mistaken for one another even by trained musicians. Similarly, the predictive power of the "color wheel" model shows that to some approximation we can think of the trichromatic/opponent process model as resulting in a wrapping around of the visible segment of the EM spectrum into a circle. But in both cases, the wrap-around is only an approximation. In terms of pitch, $f$ and $2f$ are perceptually similar but not indistinguishable. For color, we have the 1976 CIELUV color diagram, which is a modification of the 1931 diagram meant to represent at least somewhat accurately the degree of perceptual similarity between different points based on the distance between them. The monochromatic spectrum constitutes part of the outer boundary of this diagram, and is more of a "V" than a circle; there is quite a large gap between monochromatic violet and monochromatic red.
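The pitch side of the analogy is easy to write down explicitly. In this sketch, position on the "pitch circle" is the fractional part of $\log_2(f/f_0)$, so that $f$ and $2f$ land on the same point, while the integer part records which octave you are in. The reference frequency of 440 Hz and the function names are arbitrary choices of mine.

```python
import math


def chroma(freq_hz, ref_hz=440.0):
    """Position on the pitch circle, in [0, 1): fractional part of log2(f/ref)."""
    return math.log2(freq_hz / ref_hz) % 1.0


def octave_number(freq_hz, ref_hz=440.0):
    """Number of whole octaves above (or below, if negative) the reference."""
    return math.floor(math.log2(freq_hz / ref_hz))
```

Here `chroma(440.0)` and `chroma(880.0)` are equal, but `octave_number` tells them apart, which mirrors the point that the wrap-around captures similarity and not identity.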
It is trivially true that any such diagram has a boundary that is a closed curve. If the diagram is not constrained to give any accurate depiction of the sizes of the perceptual differences between colors, then it can be distorted arbitrarily, and we can arbitrarily define it such that its boundary is a circle. In this sense, the success of the color wheel model is guaranteed, and it follows from nothing more than the fact that humans are trichromats, so that the color space is three-dimensional, and controlling for luminance produces a two-dimensional space. But this fails to explain why there is some degree of perceptual similarity between the red and violet ends of the monochromatic spectrum; for that you need the opponent processing model.
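The step from three dimensions to two can be made concrete using the standard CIE convention, in which tristimulus values $(X,Y,Z)$ are normalized by their sum to give chromaticity coordinates $x=X/(X+Y+Z)$ and $y=Y/(X+Y+Z)$. The following is a minimal sketch of that normalization; the function name is mine.

```python
def chromaticity(X, Y, Z):
    """Project tristimulus values onto the 2D chromaticity plane (x, y)."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("black has no defined chromaticity")
    return X / total, Y / total
```

Scaling a light's intensity leaves its chromaticity unchanged: `chromaticity(1, 2, 1)` and `chromaticity(2, 4, 2)` both give `(0.25, 0.5)`. That invariance is what "controlling for luminance" means here, and it is why every such diagram is two-dimensional and has a closed boundary.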
There is also a slight variation in the absorbance of the pigment in the red cones at the blue end of the spectrum. I don't think this is sufficient to explain the perceptual similarity between violet and red, or the even closer similarity between violet and a mixture of red and blue light, i.e., I don't think you can explain these facts using only the trichromatic theory without opponent processing. The classic direct measurements of the filter curves of cone-cell pigments were done with cone cells from carp by Tomita ca. 1965, but AFAIK the only direct measurement using human cone cells was Bowmaker 1981. Bowmaker's red-cell absorbance curve has a very slight rise at short wavelengths, but it's not very pronounced at all. You will see various other curves on the internet, often without any attribution or explanation of where they came from, and some of these show a much more pronounced bump rather than Bowmaker's slight rise. Possibly some of these are from people using the CIE 1931 curves, which were never intended to be physical models of the actual human cone-cell pigments. It should be clear, however, that the red and green pigments' curves must have some variation near the violet end of the spectrum. If they did not, then the dimensionality of the color space would be reduced there, and the human eye would be unable to distinguish different wavelengths in this region, which is contrary to fact.
Bowmaker, "Visual pigments and colour vision in man and monkeys," J R Soc Med. 1981 May; 74(5): 348, freely accessible at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1438839/