What is the difference between 32-bit PAE and 64-bit kernels?
The kernel sees the physical memory and provides a virtual view of it to the processes. If you ever wondered how a process can have a 4 GB address space when your whole machine has only 512 MB of RAM, that's why. Each process has its own virtual memory space. The addresses in that address space are mapped either to physical pages or to swap space. If a page is in swap space, it has to be swapped back into physical memory before your process can access or modify it.
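To make the mapping idea concrete, here is a toy model in Python. It is only a sketch of the concept, not how any real kernel does it: the class name ToyAddressSpace, its methods and the frame and slot numbers are all made up for illustration.

```python
PAGE_SIZE = 4096  # standard 4 KB pages


class ToyAddressSpace:
    """Toy model of one process' virtual address space (hypothetical, not kernel code)."""

    def __init__(self, free_frames):
        self.free_frames = list(free_frames)  # physical frame numbers still unused
        self.table = {}  # virtual page number -> ("ram", frame) or ("swap", slot)

    def map_ram(self, vpn):
        # Back a virtual page with a free physical frame.
        self.table[vpn] = ("ram", self.free_frames.pop())

    def map_swap(self, vpn, slot):
        # Mark a virtual page as currently living in swap space.
        self.table[vpn] = ("swap", slot)

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        kind, where = self.table[vpn]
        if kind == "swap":
            # "Page fault": the page must be brought into RAM before it can be used.
            where = self.free_frames.pop()
            self.table[vpn] = ("ram", where)
        return where * PAGE_SIZE + offset


space = ToyAddressSpace(free_frames=[7, 3])
space.map_ram(0)        # virtual page 0 gets frame 3
space.map_swap(1, 42)   # virtual page 1 is swapped out
print(hex(space.translate(0x0010)))  # resolved directly from RAM
print(hex(space.translate(0x1005)))  # swapped in first, then resolved
```

Two processes would each get their own table, which is why both can use the same virtual addresses without colliding.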
The example from Torvalds in XQYZ's answer (DOS highmem) is not too far-fetched, although I disagree with his conclusion that PAE is generally a bad thing. It solved specific problems and has its merits, but all of that is debatable. For example, the implementer of a library may not find the implementation easy, while the user of that library may find it very useful and easy to use. Torvalds is an implementer, so he's bound to say what he says in that statement. For an end user PAE solves a problem, and that's what the end user cares about.
For one, PAE helps solve another legacy problem on 32bit machines. It allows the kernel to use the full 4 GB of memory by working around the BIOS memory hole that exists on many machines. That hole reserves part of the address range below 4 GB for devices and causes a pure 32bit kernel without PAE to "see" only 3.1 or 3.2 GB of memory, despite the 4 GB physically installed.
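The arithmetic behind the memory hole can be sketched like this. The function usable_ram and the hole position of 3200 MB are assumptions for illustration, and the sketch assumes the firmware remaps the RAM hidden by the hole to just above the 4 GB mark, which depends on the chipset and BIOS.

```python
MIB = 1 << 20
GIB = 1 << 30


def usable_ram(installed, hole_start, phys_addr_bits):
    """RAM a kernel can reach when device space occupies [hole_start, 4 GiB).

    Assumption: the firmware remaps the RAM hidden behind the hole
    to the range starting at 4 GiB.
    """
    limit = 1 << phys_addr_bits               # highest physical address + 1
    below_hole = min(installed, hole_start)   # RAM that stays below the hole
    hidden = installed - below_hole           # RAM displaced by the hole
    # The displaced RAM lives at [4 GiB, 4 GiB + hidden); clip it to the limit.
    remapped = max(0, min(4 * GIB + hidden, limit) - 4 * GIB)
    return below_hole + remapped


# 32bit physical addresses (no PAE): the remapped part is out of reach.
print(usable_ram(4 * GIB, 3200 * MIB, 32) // MIB, "MB")
# 36bit physical addresses (PAE): all of the installed RAM is reachable.
print(usable_ram(4 * GIB, 3200 * MIB, 36) // MIB, "MB")
```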
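Anyway, for the 64bit kernel it's a symmetrical relation between the physical pages and the virtual pages (leaving swap space and other details aside), because the virtual address space is large enough to cover all of physical memory. However, the PAE kernel maps between a 32bit pointer within the process' address space and a 36bit address in physical memory, so it can never have all of physical memory mapped at once. More book-keeping is needed here. Keyword: "page-directory-pointer table", the extra page-table level that PAE adds (not to be confused with Intel's "Extended Page Tables", which are a virtualization feature). But this is somewhat more of a programming question. This is the main difference: more book-keeping compared to a full linear address space. For PAE it's chunks of 4 GB, as you mentioned.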
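To illustrate where the extra book-keeping comes from, this sketch splits a virtual address into the page-table indices the CPU walks in each paging mode. The bit widths are the ones documented for x86 (10/10/12 for plain 32bit paging, 2/9/9/12 for PAE, 9/9/9/9/12 for 64bit mode with 48bit virtual addresses). The helper name split and the constant names are made up for this example.

```python
def split(vaddr, widths):
    """Split an address into fields, given bit widths from most to least significant."""
    fields = []
    for width in reversed(widths):
        fields.append(vaddr & ((1 << width) - 1))
        vaddr >>= width
    return tuple(reversed(fields))


# Index widths per paging mode; the last field is the offset within a 4 KB page.
LEGACY_32 = (10, 10, 12)        # page directory, page table, offset
PAE_32 = (2, 9, 9, 12)          # PDPT, page directory, page table, offset
LONG_MODE_48 = (9, 9, 9, 9, 12) # PML4, PDPT, page directory, page table, offset

vaddr = 0xC0001234
print(split(vaddr, LEGACY_32))
print(split(vaddr, PAE_32))
print(split(vaddr, LONG_MODE_48))
```

With PAE the virtual address is still only 32 bits wide. What grows is the size of each page-table entry (8 bytes instead of 4, hence 512 entries per 4 KB table and 9 index bits), which is what lets an entry point at a physical address above 4 GB.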
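Aside from that, both PAE and 64bit allow for large pages of 2 MB (and 64bit also 1 GB pages on CPUs that support them) in addition to the standard 4 KB pages. Plain 32bit paging has its own large-page variant with 4 MB pages.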
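The large-page sizes follow directly from the page-table layout: a large page simply skips the lowest table level(s), so their index bits become part of the page offset. A small sketch of that arithmetic, with page_size as a made-up helper name:

```python
MIB = 1 << 20
GIB = 1 << 30


def page_size(skipped_index_bits=(), offset_bits=12):
    """Page size when the given page-table levels are skipped (hypothetical helper)."""
    return 1 << (sum(skipped_index_bits) + offset_bits)


print(page_size())                 # 4 KB page, no level skipped
print(page_size((9,)) // MIB)      # PAE / 64bit: skip the 9-bit page table level
print(page_size((10,)) // MIB)     # plain 32bit: skip the 10-bit page table level
print(page_size((9, 9)) // GIB)    # 64bit: skip two 9-bit levels
```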
Chapter 3 of Volume 1 of the Intel Processor Manual has some overview and Chapter 3 of Volume 3A ("Protected Mode Memory Management") has more details, if you want to read up on it.
To me it seems like this is a big distinction that seems to be ignored by many people.
You're right. However, the majority of people are users, not implementers, which is why they don't care. And as long as your application doesn't require huge amounts of memory, there is little reason to (especially since there are compatibility layers).
You might want to look into what Linus Torvalds says about it here:
PAE turned that very simple fact on its head, and screwed things up royally. Whoever came up with the idea was totally incompetent, and had forgotten all the DOS HIGHMEM pains. There’s a damn good reason why we left the 286 behind, and started using 386′s, instead of having HIGHMEM crap with windows into a bigger physical space.
[...]
So repeat after me: PAE didn’t ever really fix anything. It was a mistake. It was just a total failure, and the result of hw engineers not understanding software.