What is the difference between K-means clustering and vector quantization?
The way I understand it, K-means is one type of vector quantization.
The K-means algorithms is the specialization of the celebrated "Lloyd I" quantization algorithm to the case of empirical distributions. (cf. Lloyd)
The Lloyd I algorithm is proved to yield a sequence of quantizers with a decreasing quadratic distortion. However, except in the special case of one-dimensional log-concave distributions, it dos not always converge to a quadratic optimal quantizer. (There are local minimums for the quantization error, especially when dealing with empirical distribution i.e. for the clustering problem.)
A method that converges (always) toward an optimal quantizer is the so-called CLVQ algorithms, which also generalizes to the problem of more general L^p quantization. It is a kind of Stochastic Gradient method. (cf. Pagès)
There are also some approaches based on genetic algorithms. (cf. Hamida et al.), and/or classical optimization procedures for the one dimensional case that converge faster (Pagès, Printems).