Image convolution with even-sized kernel
If I understand your question correctly, then yes: for even-sized kernels the convention is to centre the kernel so that there is one more sample before the origin (the new zero index) than after it.
So, for a kernel of width 4, the centred indices will be -2, -1, 0, +1, as you say above.
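As a minimal sketch of that convention (the class and method names below are made up for illustration, not taken from any library), the tap offsets for a kernel of a given width can be computed like this:

```java
import java.util.Arrays;

public class KernelOffsets {
    // Offset of each kernel tap relative to the kernel origin, using the
    // convention that an even-width kernel has one more tap before the
    // origin than after it. (Hypothetical helper, for illustration only.)
    static int[] centredOffsets(int width) {
        int[] offsets = new int[width];
        int origin = width / 2; // index of the tap that sits at offset 0
        for (int i = 0; i < width; i++) {
            offsets[i] = i - origin;
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(centredOffsets(4))); // [-2, -1, 0, 1]
        System.out.println(Arrays.toString(centredOffsets(5))); // [-2, -1, 0, 1, 2]
    }
}
```

For odd widths this reduces to the usual symmetric layout, so the same helper covers both cases.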
However, this really is just a convention. An asymmetric kernel is very rarely used anyway, and the direction of the asymmetry (one extra sample to the left or to the right) has no bearing on the "correct" result; away from the borders it only shifts the output by one sample. I would imagine that most implementations behave this way so that they give comparable results for the same inputs.
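To illustrate, here is a rough 1D sketch, assuming zero padding outside the signal and a hypothetical `filter` helper that takes the index of the kernel tap to treat as the origin. It slides the kernel without flipping it, which makes no difference for the symmetric box used here. Moving the origin from index 2 to index 1 of a width-4 box simply moves the result by one sample:

```java
import java.util.Arrays;

public class AnchorShift {
    // Slide `kernel` over `signal`. `origin` is the index of the kernel tap
    // that is aligned with the output sample. Samples outside the signal are
    // treated as zero. (Illustrative helper, not a library API.)
    static double[] filter(double[] signal, double[] kernel, int origin) {
        double[] out = new double[signal.length];
        for (int i = 0; i < signal.length; i++) {
            double sum = 0;
            for (int k = 0; k < kernel.length; k++) {
                int j = i + k - origin;
                if (j >= 0 && j < signal.length) {
                    sum += signal[j] * kernel[k];
                }
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] signal = {0, 0, 0, 4, 0, 0, 0, 0};
        double[] box = {0.25, 0.25, 0.25, 0.25};
        // Taps at offsets -2 -1 0 +1 (one more sample before the origin)
        System.out.println(Arrays.toString(filter(signal, box, 2)));
        // [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
        // Taps at offsets -1 0 +1 +2 (one more sample after the origin)
        System.out.println(Arrays.toString(filter(signal, box, 1)));
        // [0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
    }
}
```

The two outputs contain the same values; only their position differs, which is why the choice is a matter of convention rather than correctness.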
When performing the convolution in the frequency domain, the kernel is padded to match the image size anyway, and you've already stated that you are performing the convolution in the spatial domain.
I'm much more intrigued as to why you need to use an even-sized kernel in the first place.
The correct answer is to return the result pixel in the upper-left corner of the window, regardless of whether your matrix is even-sized or not. Then you can perform the operation in place in a simple scanline, and it requires no extra memory.
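private static void applyBlur(int[] pixels, int stride) {
    int v0, v1, v2, r, g, b;

    // Horizontal pass: each pixel becomes the mean of itself and the two
    // pixels to its right, so the result lands at the left of the window.
    // Reads always run ahead of writes, which is why this works in place.
    // The buffer is treated as one long scanline, so the last two pixels of
    // each row also pick up the start of the next row, and the final two
    // pixels of the buffer are left unchanged.
    for (int pos = 0; pos + 2 < pixels.length; pos++) {
        v0 = pixels[pos];
        v1 = pixels[pos + 1];
        v2 = pixels[pos + 2];
        r = ((v0 >> 16) & 0xFF) + ((v1 >> 16) & 0xFF) + ((v2 >> 16) & 0xFF);
        g = ((v0 >> 8) & 0xFF) + ((v1 >> 8) & 0xFF) + ((v2 >> 8) & 0xFF);
        b = (v0 & 0xFF) + (v1 & 0xFF) + (v2 & 0xFF);
        r /= 3;
        g /= 3;
        b /= 3;
        // Only the R, G and B bytes are written; the top byte ends up as 0.
        pixels[pos] = r << 16 | g << 8 | b;
    }

    // Vertical pass: each pixel becomes the mean of itself and the two
    // pixels below it (stride = number of ints per row), so the result
    // lands at the top of the window. The last two rows are left unchanged.
    for (int pos = 0; pos + 2 * stride < pixels.length; pos++) {
        v0 = pixels[pos];
        v1 = pixels[pos + stride];
        v2 = pixels[pos + stride + stride];
        r = ((v0 >> 16) & 0xFF) + ((v1 >> 16) & 0xFF) + ((v2 >> 16) & 0xFF);
        g = ((v0 >> 8) & 0xFF) + ((v1 >> 8) & 0xFF) + ((v2 >> 8) & 0xFF);
        b = (v0 & 0xFF) + (v1 & 0xFF) + (v2 & 0xFF);
        r /= 3;
        g /= 3;
        b /= 3;
        pixels[pos] = r << 16 | g << 8 | b;
    }
}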
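The same upper-left anchoring carries over to any window width, even or odd. A rough sketch for a single row of grey values follows; the `boxBlurRow` name and the single-channel layout are assumptions made for illustration, and integer division truncates just as in the code above:

```java
public class LeftAnchoredBox {
    // In-place box blur of one row. The mean of row[pos .. pos+width-1] is
    // written to row[pos], i.e. the result is anchored at the left edge of
    // the window, so it makes no difference whether width is even or odd.
    // The last (width - 1) samples are left unchanged.
    static void boxBlurRow(int[] row, int width) {
        for (int pos = 0; pos + width <= row.length; pos++) {
            int sum = 0;
            for (int k = 0; k < width; k++) {
                sum += row[pos + k]; // reads are never behind the write position
            }
            row[pos] = sum / width;
        }
    }

    public static void main(String[] args) {
        int[] row = {8, 0, 0, 0, 8, 8};
        boxBlurRow(row, 4);
        System.out.println(java.util.Arrays.toString(row)); // [2, 2, 4, 0, 8, 8]
    }
}
```

Because every window starts at the pixel being written, no copy of the row is needed, which is the "no extra memory" property described above.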