Good quality sound using a microcontroller
One would think that a single channel would be enough for voice. Remember the old Sprint commercials? "So quiet you can hear a pin drop": that was only a single channel. 12 bits at 16 kHz, for a maximum bandwidth of 8 kHz, would be good. Per Wikipedia: "In telephony, the usable voice frequency band ranges from approximately 300 Hz to 3400 Hz. It is for this reason that the ultra low frequency band of the electromagnetic spectrum between 300 and 3000 Hz is also referred to as voice frequency, being the electromagnetic energy that represents acoustic energy at baseband. The bandwidth allocated for a single voice-frequency transmission channel is usually 4 kHz, including guard bands, allowing a sampling rate of 8 kHz to be used as the basis of the pulse code modulation system used for the digital PSTN."
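To put rough numbers on that, here is a small Python sketch of the arithmetic. It assumes the usual rule of thumb that the sample rate must be at least twice the bandwidth, and the textbook 6.02 dB-per-bit figure for an ideal converter; the function names are made up for illustration, and the 8-bit telephone case is my assumption, not something stated above.

```python
def min_sample_rate_hz(bandwidth_hz):
    """Nyquist rule of thumb: sample at least twice the highest frequency."""
    return 2 * bandwidth_hz


def raw_data_rate_bytes_per_s(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed data rate, assuming samples are tightly packed."""
    return sample_rate_hz * bits_per_sample * channels / 8


def ideal_snr_db(bits):
    """Theoretical best-case SNR of an ideal converter (full-scale sine)."""
    return 6.02 * bits + 1.76


# Voice at 8 kHz bandwidth, 12-bit samples, single channel
voice_rate = min_sample_rate_hz(8_000)
print(voice_rate, "Hz,", raw_data_rate_bytes_per_s(voice_rate, 12), "bytes/s")

# Telephone channel at 4 kHz bandwidth (8-bit samples assumed here)
phone_rate = min_sample_rate_hz(4_000)
print(phone_rate, "Hz,", raw_data_rate_bytes_per_s(phone_rate, 8), "bytes/s")

print("12-bit ideal SNR:", round(ideal_snr_db(12), 1), "dB")
```

Real converters and analog front ends will come in below the ideal SNR figure, so treat it as an upper bound rather than a spec.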
I think several folks, such as Adafruit.com, DFRobot.com, and others, make cards with DAC outputs. Or you could get a DAC-only card from Gravitech.us.
I started a project at Arduino.cc to sample 16 bits at 44.1 kHz, store the samples to an SD card, then play them back later. I used an external SPI ADC and DAC, and member fatlib16 there helped me code it. You can find it by searching for "sample, record, playback later", which should pull it up.
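For a feel of what that demands from the SD card side, here is a rough Python sketch of the throughput math. It assumes a single channel, 2 bytes per sample, and writes done in 512-byte SD blocks; those are my assumptions for illustration and not details taken from the actual project code.

```python
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2     # 16-bit samples
SD_BLOCK_BYTES = 512     # assumed write block size


def bytes_per_second(sample_rate_hz=SAMPLE_RATE_HZ,
                     bytes_per_sample=BYTES_PER_SAMPLE):
    """Sustained write rate the SD card has to keep up with."""
    return sample_rate_hz * bytes_per_sample


def samples_per_block(block_bytes=SD_BLOCK_BYTES,
                      bytes_per_sample=BYTES_PER_SAMPLE):
    """How many samples fit in one SD block."""
    return block_bytes // bytes_per_sample


def block_fill_time_ms(sample_rate_hz=SAMPLE_RATE_HZ):
    """Time to fill one block, i.e. the deadline for writing the previous one."""
    return samples_per_block() / sample_rate_hz * 1000


def recording_size_bytes(seconds):
    """File size for a recording of the given length."""
    return bytes_per_second() * seconds


print(bytes_per_second(), "bytes/s")
print(samples_per_block(), "samples per block")
print(round(block_fill_time_ms(), 2), "ms per block")
print(recording_size_bytes(60), "bytes per minute")
```

The per-block time is the number to watch: if a card write ever takes longer than that, you need a second buffer to keep sampling while the first one is being written, or samples get dropped.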