Understanding conversion and (de)compression of lossless audio

First:

Understand the difference between an encoding and a container format. http://en.wikipedia.org/wiki/Digital_container_format

A container format is a data format that "encapsulates" other encoded data. It often contains "meta-information" about the encoded data, or has a way to store multiple separate streams of encoded data, or something like that.

An encoding, produced by a codec, is the actual "meat" of the data stream.

The most common example I can think of is the format "Ogg/Vorbis". Ogg is the container format, and Vorbis is the encoding. So you have an Ogg-formatted file and inside of there are these little buckets that contain encoded data. Within each bucket is a Vorbis-encoded data stream and nothing else. On the bucket might be stamped the name of the artist and the song title, for instance.
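To see the container/payload split concretely, here's a small sketch using Python's standard `wave` module -- WAV instead of Ogg, since the stdlib can read WAV headers, but the idea is the same: the metadata lives in the container, and the encoded stream is the payload inside it.

```python
# A WAV file is a container: the wave module reads the "bucket" metadata
# (the header) separately from the payload (here, raw PCM frames).
import io
import wave

# Build a tiny WAV in memory: 1 channel, 16-bit, 8000 Hz, 4 samples of silence.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # 2 bytes = 16 bits per sample
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 4)

buf.seek(0)
with wave.open(buf, "rb") as r:
    # Container-level metadata, read from the header:
    print(r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
    # The payload itself -- the actual data stream:
    payload = r.readframes(r.getnframes())

print(len(payload))  # 8 bytes: 4 samples x 2 bytes each
```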

So back to tech:

  1. If you have music in a lossy format already, such as mp3 or ogg/vorbis, converting it to a lossless format will only eat up (a lot of) disk space, and WILL NOT -- absolutely WILL NOT -- improve the quality of the audio whatsoever. You can't create fidelity once it's already been lost. Unless you're writing a GUI interface in Visual Basic on some hit TV show called CSI, but that's fantasy, not reality.

  2. If you have music in other lossless formats and you want to convert it to FLAC, you can do so.

  3. Be careful with throwing around the term "WAV". WAV doesn't HAVE to be lossless; actually, WAV is just a container for various possible formats. It's kind of like AVI in that sense. You CAN have a lossless WAV if it's just raw PCM data, but you can also embed MPEG-1 Layer III data (lossy) into a WAV file.

  4. It is possible to lose data when converting from one lossless format to another, if you reduce the fidelity of the data. For example, if you convert an unsigned 16-bit PCM data stream at 48000 Hz into an 8-bit PCM data stream at 44100 Hz, you're losing fidelity in two ways: the stream is being resampled from 48000 samples per second down to only 44100 (throwing away data), and each sample has to be requantized to squeeze the information into only 8 bits instead of 16, which will dramatically hurt the quality.
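Here's a tiny sketch of the bit-depth half of point 4 (helper names are mine, and this uses plain truncation rather than proper dithering): once the low bits are gone, widening back to 16 bits can't bring them back.

```python
# Requantizing 16-bit samples to 8 bits throws away the low-order detail;
# converting back up cannot recover it.
def to_8bit(s16):
    # Map a signed 16-bit sample to signed 8-bit by dropping the low 8 bits.
    return s16 >> 8

def back_to_16bit(s8):
    # Widening just shifts back up; the lost low bits come back as zeros.
    return s8 << 8

original = 12345                     # a 16-bit sample value
narrowed = to_8bit(original)         # 48
roundtrip = back_to_16bit(narrowed)  # 12288, not 12345

print(original, narrowed, roundtrip, original - roundtrip)
```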

Every digital audio stream, even one produced by a compressing (lossy or lossless) encoder, has the following "Sample Format Properties", which describe the essential shape of the stream:

  1. Sample bit width and bit depth, e.g. 8 bit, 16 bit, etc. Bit width and depth are subtly different, and there's also little-endian/big-endian (which does not affect quality) and signed or unsigned (which also does not affect quality, but does affect how the encoder/decoder deals with the data). The key point to remember is that "more bits is better". So 32-bit is better than 16-bit, etc.

  2. Frequency, also known as sampling rate. More is better because you have more "samples" of audio being played back per second. Imagine quickly brushing your finger over a deck of cards and watching the cards go by in a blur -- that's how digital audio essentially happens. Each sample is a card, and if you have more cards flying by per second, the audio is more seamless. Like, you would really notice if you were only flipping 5 cards per second, but it would all blur together if you are flipping thousands of cards per second. So more is better, because it's more natural and closer to reality, which is analogue and infinitely divisible (well, down to the Planck units but that's debatable and off topic).
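These two properties, plus the channel count, pin down the raw data rate of an uncompressed PCM stream. A quick sketch (the function name is mine):

```python
# The raw data rate of a PCM stream follows directly from the sample
# format properties plus the channel count:
#   bytes/second = sample_rate * (bit_depth / 8) * channels
def pcm_bytes_per_second(sample_rate, bit_depth, channels):
    return sample_rate * (bit_depth // 8) * channels

# CD-quality stereo: 44100 Hz, 16-bit, 2 channels.
print(pcm_bytes_per_second(44100, 16, 2))   # 176400 bytes/s
# Double the "cards per second" and the stream doubles too:
print(pcm_bytes_per_second(88200, 16, 2))   # 352800 bytes/s
```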

"Lossless" just means that if you use the same or better sample format in the output as you used in the input, you won't lose any data.

So if you go from 16 bit to 32 bit sample format, you don't lose data. But if you go from 32 bit to 16 bit, you do lose data.
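A minimal sketch of that asymmetry, with Python integers standing in for samples: going up and back down is a perfect round trip, but a real 32-bit sample loses its low bits when narrowed.

```python
# Widening 16-bit samples to 32 bits and back is a perfect round trip;
# narrowing 32-bit samples to 16 bits is not.
samples16 = [-32768, -1, 0, 12345, 32767]

# 16 -> 32: shift up (pad with zeros); 32 -> 16: shift back down.
widened   = [s << 16 for s in samples16]
recovered = [s >> 16 for s in widened]
assert recovered == samples16          # nothing lost going up and back

# But a genuine 32-bit sample rarely has all-zero low bits:
s32 = 0x12345678
narrowed = s32 >> 16                   # 0x1234 -- the 0x5678 part is gone
assert (narrowed << 16) != s32
print(recovered == samples16, hex(narrowed))
```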

So the answer to your question of whether using FLAC makes sense depends on the source data: If you have 64-bit WAV files that were originally recorded at that sample format, with 192000 Hz (a.k.a. 192 kHz), and you convert them to a "standard" FLAC sample format of 16-bit and 44.1 KHz, you are going to lose a TON of data. But if your WAV file is 8-bit with only 22050 samples per second and you convert it to a 16-bit FLAC with 44100 samples per second, you are not going to lose data. And you may even end up increasing the file size, depending on whether the lossless compression or the smaller sample format wins out.

Sample format will affect how much space the file takes up, so the "bigger" bits and "faster" sampling rates will occupy more space.
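To put numbers on that, here's the arithmetic for one minute of uncompressed audio at the two sample formats from the example above (FLAC will then typically shave off something very roughly in the 30-60% range, depending on the material -- that ratio is an assumption, not a guarantee):

```python
# Uncompressed size of one minute of PCM at two sample formats.
def pcm_size_bytes(seconds, sample_rate, bit_depth, channels):
    return seconds * sample_rate * (bit_depth // 8) * channels

low = pcm_size_bytes(60, 22050, 8, 1)    # 1323000 bytes (~1.3 MB), 8-bit mono
cd  = pcm_size_bytes(60, 44100, 16, 2)   # 10584000 bytes (~10 MB), CD stereo
print(low, cd, cd // low)                # the CD-quality stream is 8x larger
```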

As far as practical concerns and the human ear: you will not really notice if you convert really high-fidelity originals down to 16-bit 44.1KHz FLAC. But neither will you notice an improvement if you convert an MP3 to FLAC. So you need to evaluate what sample format your source data is in before you decide what to do.

Now that I've provided you all this information, here are my direct and point-blank, zero-explanation answers to your questions:

I have a few questions concerning lossless audio. I'm considering ripping my entire music collection to lossless .flac, but I want to understand a few things about it first.

If your music collection is on CDs and you want to rip it to FLAC, that's a very good fit in my opinion. The CD quality audio will be at 44.1 KHz and at 16 bits per sample. This matches up exactly with FLAC's default settings (at least, the defaults in the encoders I use). Therefore you will not lose any data and it will be mathematically identical to the input data when decoded.

If I have a file that is .flac, and I want to make it into, say, .wav, how can I do this to not have any quality loss? If I decompress it, I know that I won't lose quality. Is converting .flac to .wav the same as decompressing?

You can convert it to a .wav file with the same or a wider sample format than the input data and you won't have any quality loss.

When a media player plays the audio in your flac file, it is essentially decoding the flac data to a PCM format prior to sending that PCM data to the sound card. It will decompress it to the exact same data that went in; so if 16-bit 44.1 KHz PCM data went in, that's what'll come out, and go to your speakers.

The only difference between this activity and converting the audio to a WAV file is that, when you convert it to a WAV file, it has to wrap the data in a WAV container with the appropriate header, etc., and it also lets you choose the sample format of the WAV file. But assuming that the sample format is the same, then the only difference between your FLAC and WAV files will be the file size: the WAV files will be substantially larger.
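You can watch this "container plus identical payload" idea in action with the stdlib `wave` module: the same PCM goes in and comes back bit-identical, and the extra bytes in the file are nothing but the container header.

```python
# A WAV file is a thin container around the PCM: reading the frames back
# gives exactly the bytes that went in, and the size difference is only
# the header.
import io
import wave

pcm = bytes(range(0, 200, 2)) * 10   # 1000 bytes of fake 16-bit mono PCM

buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(pcm)

wav_bytes = buf.getvalue()
buf.seek(0)
with wave.open(buf, "rb") as r:
    decoded = r.readframes(r.getnframes())

print(decoded == pcm)                 # True: bit-identical payload
print(len(wav_bytes) - len(pcm))      # 44: just the standard WAV header
```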

Does this also apply to .ape format as well? I have a few public domain recordings that I have downloaded in .ape, but I want to make it .flac. Would .ape to .flac be possible without using .wav as a middle man? I want to ensure that not even a bit is lost in any way.

No, it's not possible to do this without using some PCM format as a middleman. But yes, it is possible to do it without using a WAV file. Note the difference: PCM data stream vs. WAV file. If the distinction isn't clear to you, re-read the beginning of my post. If you want to ensure that "not even a bit" is lost, then you need to examine your APE files and understand what sample format they're in, and make sure that your FLAC encoder is set to encode for the same settings.

Internally, any audio converting program is going to be decoding from the source format to some sort of lossless PCM sample format, and then taking those PCM samples and re-encoding them in the destination format.
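Here's that decode-to-PCM-then-re-encode shape as a runnable sketch. It uses WAV on both ends (the Python stdlib can't read APE or FLAC), but a real converter plugs an APE decoder and a FLAC encoder into the same two slots:

```python
# Any converter follows the same shape: decode source -> raw PCM ->
# encode destination.
import io
import wave

def decode_to_pcm(wav_file):
    with wave.open(wav_file, "rb") as w:
        return w.getparams(), w.readframes(w.getnframes())

def encode_from_pcm(params, pcm, out_file):
    with wave.open(out_file, "wb") as w:
        w.setparams(params)
        w.writeframes(pcm)

# Round trip: source -> PCM -> destination, then prove nothing changed.
src = io.BytesIO()
with wave.open(src, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x01\x02\x03\x04" * 100)
src.seek(0)

params, pcm = decode_to_pcm(src)
dst = io.BytesIO()
encode_from_pcm(params, pcm, dst)
dst.seek(0)
print(decode_to_pcm(dst)[1] == pcm)   # True: not even a bit lost
```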

Also, if there are any guides that explain the world of lossless flawlessly, would anyone be willing to share a link? :)

In my opinion, the best way to learn is to do. If you take time to learn the GStreamer framework, how to use it, what the various sample-format tunables mean, and how to construct a pipeline, you will come out with a really strong understanding of digital audio. Check it out. http://gstreamer.freedesktop.org/ You can also grab the GStreamer SDK from http://code.entropywave.com/gstreamer-sdk/ (Windows is supported) and experiment with gst-launch-0.10 without having to compile anything from source.

A few revealing things:

gst-inspect-0.10 vorbisenc
gst-inspect-0.10 vorbisdec
gst-inspect-0.10 audioconvert
gst-inspect-0.10 audioresample

Then learn about caps, constructing pipelines, etc. and you'll be moving right along.

*Note: I realize I didn't explain what PCM is. Wikipedia does a better job of this than I do: http://en.wikipedia.org/wiki/Pulse-code_modulation