What is meant by 'On-chip 2-cycle Multiplier' for an AVR microcontroller?
It means that the ALU in the microcontroller has a hardware multiplier, which takes two instruction clock cycles to perform a calculation.
This is faster than doing multiplication using software (e.g. adding multiple times in a for loop).
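To make the comparison concrete, here is a minimal C sketch of the "adding in a loop" approach next to a plain multiply. The function names are made up for illustration, and the remark about the compiler emitting a MUL instruction is an assumption about toolchains such as avr-gcc, not something guaranteed for every part or optimization level.

```c
#include <stdint.h>

/* Software multiply by repeated addition (hypothetical helper name).
   The loop runs b times, so the cost grows with the value of b. */
uint16_t mul_repeated_add(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    for (uint8_t i = 0; i < b; i++) {
        result += a;            /* one addition per loop iteration */
    }
    return result;
}

/* Plain C multiply. On an AVR that has the hardware multiplier, the
   compiler can usually map this to the MUL instruction (assumption
   about the toolchain); without it, a library routine is called. */
uint16_t mul_hardware(uint8_t a, uint8_t b)
{
    return (uint16_t)a * b;
}
```

Both functions return the same 16-bit product, e.g. `mul_repeated_add(7, 6)` and `mul_hardware(7, 6)` both give 42, but the first one needs up to 255 loop iterations in the worst case.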
It means it has a hardware multiplier that can complete a multiply operation in two instruction cycles.
Some processors have no multiply instruction at all; they only have addition and bitwise Boolean logic.
This website discusses how to implement multiplication on a 6502 (the CPU in the old Apple ][ computer, as well as other computers of that era): https://www.lysator.liu.se/~nisse/misc/6502-mul.html
These 8-bit x 8-bit operations take about 100 cycles per multiplication. You can see that a 2-cycle multiplication (done in hardware) is far superior. You can do simple DSP-type calculations if you can do multiplies at half the clock rate.
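For a rough idea of what such a software routine does, here is a shift-and-add multiply sketched in C. This is the general textbook technique, not a transcription of the 6502 assembly on the linked page, and the function name is made up for illustration. Each of the 8 loop passes needs a test, a conditional add and two shifts, which is why a CPU without a multiplier spends many cycles per product.

```c
#include <stdint.h>

/* 8-bit x 8-bit -> 16-bit multiply using only shifts and adds
   (hypothetical helper name, textbook shift-and-add). */
uint16_t mul_shift_add(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    uint16_t addend = a;        /* a shifted left by the current bit position */

    for (uint8_t bit = 0; bit < 8; bit++) {
        if (b & 1u) {
            product += addend;  /* this bit of b is set: add a << bit */
        }
        addend <<= 1;           /* move to the next bit position */
        b >>= 1;
    }
    return product;
}
```

For example, `mul_shift_add(13, 11)` returns 143. The loop always runs 8 times regardless of the operand values, unlike repeated addition.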
Even the best CPUs can still only start one multiplication per cycle in a given multiplier unit, because a clock cycle is the smallest step in which the hardware does work (although they can run many multiplications in parallel).
(Division is a lot harder. Many CPUs take on the order of N clock cycles to do a division, where N is the number of bits. The carries in a multiplication can be resolved quickly in hardware, while in a division each quotient bit depends on the outcome of the previous subtraction, so the steps are hard to overlap.)
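The one-step-per-bit behaviour of division can be seen in a simple restoring divider, sketched here in C. This illustrates the basic algorithm only and is not a description of how any particular CPU implements its divider; the function name is made up, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* 8-bit restoring division (hypothetical helper name).
   Assumes divisor != 0. Produces one quotient bit per loop pass,
   and each pass needs the remainder left by the previous pass. */
uint8_t div_restoring(uint8_t dividend, uint8_t divisor, uint8_t *remainder)
{
    uint16_t rem = 0;
    uint8_t quotient = 0;

    for (uint8_t i = 0; i < 8; i++) {
        /* Shift the next dividend bit (MSB first) into the remainder. */
        rem = (uint16_t)((rem << 1) | (dividend >> 7));
        dividend = (uint8_t)(dividend << 1);
        quotient = (uint8_t)(quotient << 1);

        /* Subtract only if the divisor fits; that decides this quotient bit. */
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= 1u;
        }
    }
    *remainder = (uint8_t)rem;
    return quotient;
}
```

Calling `div_restoring(200, 7, &r)` returns 28 and leaves 4 in `r`. The compare in each pass cannot begin until the previous pass has finished, which is the dependency that makes division slower than multiplication.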