What does "key with length of x bits" mean?
For symmetric algorithms (symmetric encryption, Message Authentication Code), a key is a sequence of bits, such that any sequence of the right length is a possible key. For instance, AES is a symmetric encryption algorithm (specifically, a block cipher) defined over keys of 128, 192 and 256 bits: any sequence of 128, 192 or 256 bits can be used as a key. How you encode these bits is not relevant here: whether you just dump them raw (8 bits per byte), use Base64 or hexadecimal, derive them from a character string, or whatever, is up to you.
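To make this concrete, here is a minimal Python sketch showing that a symmetric key really is just raw bits: any 16-byte value is a structurally valid AES-128 key, even a degenerate one (which would of course be a terrible choice in practice, since security requires the key to be unpredictable).

```python
import os

# A symmetric key is just raw bits: any 16-byte (128-bit) value is a valid AES-128 key.
key = os.urandom(16)          # 128 random bits
assert len(key) * 8 == 128

# Even a "degenerate" value is formally a valid key (though hopeless security-wise,
# because an attacker would try it first):
all_zero_key = bytes(16)      # 128 zero bits
assert len(all_zero_key) * 8 == 128
```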
There are a few gotchas with some algorithms. The prime example is DES, a predecessor to AES. DES is defined to use a 64-bit key. However, if you look at the algorithm definition, you see that only 56 of these bits are used; the other 8 are simply ignored (if you number bits from 1 to 64, these are bits 8, 16, 24, 32, 40, 48, 56 and 64; they are supposed to be "parity bits" depending on the 56 others, but nobody really bothers with setting or checking them). So it is often said that DES has a 56-bit key. This is because key length is related to security: if an algorithm accepts keys of length n bits, then there are 2^n possible keys, and thus trying them all (an attack known as "exhaustive search" or "brute force") takes time proportional to 2^n (with n big enough, i.e. more than about 85, this is technologically infeasible). In that sense, DES offers the security of a 56-bit key (2^56 "really distinct" possible keys). Yet, if you use a library implementing DES, that library will expect DES keys as sequences of 64 bits (often provided as 8 bytes).
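A small sketch of the DES parity convention: in the usual byte layout, the ignored parity bit is the least-significant bit of each byte, and two 64-bit keys that differ only in those bits are the same key as far as DES is concerned. The helper below (a made-up name, not from any library) sets the parity bits the "proper" way, i.e. so that each byte has odd parity.

```python
def set_odd_parity(key: bytes) -> bytes:
    """Force each byte's least-significant bit so the byte has odd parity.

    In the common byte layout of a DES key, the parity bit is each byte's
    least-significant bit; DES itself ignores it entirely.
    """
    out = bytearray()
    for b in key:
        data = b & 0xFE                            # the 7 bits DES actually uses
        ones = bin(data).count("1")
        out.append(data | (0 if ones % 2 else 1))  # pad up to odd parity
    return bytes(out)

key = bytes(range(8))                              # any 8 bytes will do
fixed = set_odd_parity(key)
# Flipping only the parity bits yields a key DES treats as identical:
flipped = bytes(b ^ 0x01 for b in fixed)
assert all((a & 0xFE) == (c & 0xFE) for a, c in zip(fixed, flipped))
```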
Another algorithm with a special rule is RC2. It accepts keys of length 8 to 128 bits (multiples of 8 only); but it also has an extra parameter called the effective key length, denoted by "T1". In the middle of the processing of the key, an internal value is "reduced" to a sequence of T1 bits, which means that subsequent encryption will depend only on the values of T1 specific bits in that internal value. The resistance of RC2 against exhaustive search is then no more than that offered by a T1-bit key, because one can try all possible sequences of T1 bits for that internal value. Yet, RC2 still has an actual key length (the length of the sequence of bits which is provided as the key) which can be greater than T1.
For asymmetric algorithms (also known as public-key cryptography, encompassing asymmetric encryption, digital signatures, some key exchange protocols, and a few more esoteric algorithms), keys work in pairs consisting of a public key and a private key. These keys are mathematical objects with some heavy internal structure. The "key length" is then a conventional measure of the size of one of the involved mathematical objects.
For instance, an RSA public key contains a big integer called the modulus, as well as another integer (usually small) called the public exponent. When we say a "1024-bit RSA key", we mean that the modulus has length 1024 bits, i.e. is an integer greater than 2^1023 but lower than 2^1024. Such an integer could be encoded as a sequence of 1024 bits, i.e. 128 bytes. Yet, the public key must also contain the public exponent, so the actual encoded length will be greater. And the private key is, from a theoretical point of view, knowledge of how the modulus can be factored into prime numbers; the traditional encoding for that knowledge is that of the prime factors, along with a bunch of helper values which could be recomputed from the factors (but that would be slightly expensive) and may help in executing the algorithm faster.
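The "bit length of the modulus" idea can be illustrated with a toy modulus built from two tiny, well-known textbook primes (real RSA primes are hundreds of digits long and generated randomly):

```python
# A "1024-bit RSA key" means the modulus n satisfies 2**1023 <= n < 2**1024.
# Toy illustration with tiny primes -- by the same convention, this would
# be a "12-bit RSA key":
p = 61
q = 53
n = p * q                            # 3233
assert n.bit_length() == 12
assert 2 ** 11 <= n < 2 ** 12

# Encoding such a modulus takes bit_length/8 bytes, rounded up:
print((n.bit_length() + 7) // 8)     # 2 bytes
```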
For distinct key types which work over distinct mathematics, other "lengths" are used, so you cannot directly compare the security of algorithms by simply comparing the key lengths. A 256-bit ECDSA key is vastly more secure than a 768-bit RSA key. Also, the mathematical structure inherent to public/private key pairs allows for much faster attacks than simply trying out a bunch of random bits, and there are many subtle details. See this site for explanations and online calculators for the various sets of rules about comparing key sizes that many regulatory organizations have come up with.
Consider the simple encryption of shifting every letter one to the right in the alphabet, so that A becomes B, B becomes C, and so forth. Thus HELLO is encrypted as IFMMP.
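This shift cipher is a one-liner in Python; a quick sketch:

```python
def shift_right(text: str, k: int) -> str:
    """Shift every uppercase letter k positions to the right, wrapping Z back to A."""
    return "".join(chr((ord(c) - ord("A") + k) % 26 + ord("A")) for c in text)

print(shift_right("HELLO", 1))   # IFMMP
```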
I can shift one letter to the right, two letters to the right, or up to 25 letters to the right. Consider the "number of letters to the right" as the "key". It takes 5 bits to hold a number between 0 and 25. Therefore, the key size is "5 bits".
Or, instead of shifting to the right, let's say that we randomly re-order the alphabet, and do a one-to-one mapping:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
RJQFGSKPBTODUZLNHYAVXEMWIC

In this case, we look up the letter 'H' in the table, see that it gets transformed into the letter 'P', and that the message HELLO becomes PGDDL.
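The table lookup above is exactly what Python's `str.translate` does:

```python
PLAIN  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "RJQFGSKPBTODUZLNHYAVXEMWIC"
table = str.maketrans(PLAIN, CIPHER)

print("HELLO".translate(table))   # PGDDL
```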
Let's say that the random reordering of the alphabet is done with a mathematical sequence based on a 5 bit number, such that every number generates a different sequence. Thus, the 5-bit number is again the "key".
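One way to sketch "a mathematical sequence that derives a different alphabet from each key" is to seed a deterministic shuffle with the key. This particular construction (Python's `random` module seeded with the key value) is just an illustration of the idea, not part of any real cipher:

```python
import random

def alphabet_for_key(key: int) -> str:
    """Derive a cipher alphabet from a small integer key via a deterministic shuffle."""
    letters = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    random.Random(key).shuffle(letters)   # same key -> same shuffle, every time
    return "".join(letters)

# Same key gives the same alphabet; a different key gives a different one.
assert alphabet_for_key(7) == alphabet_for_key(7)
assert alphabet_for_key(7) != alphabet_for_key(8)
```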
The problem with a 5-bit key is that it only has 32 combinations. If I know the encryption algorithm, I can try all 32 keys until I find the right combination. The larger the key, the harder this becomes -- EXPONENTIALLY. A 6-bit key has 64 combinations, a 7-bit key has 128 combinations, and so forth. A 10-bit key has about a thousand combinations, a 20-bit key has about a million combinations, a 30-bit key has about a BILLION combinations.
Let's say that you have a computer that can test a billion keys per second trying to brute force all combinations. That means you can break a 30-bit key in about one second. But every one of the 30 extra bits in a 60-bit key doubles the work, so it will take you about a billion seconds -- 2^30 seconds, or roughly 34 years -- to break a 60-bit key.
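The arithmetic behind that, treating "a billion keys per second" as 2^30 keys per second for round numbers:

```python
SECONDS_PER_YEAR = 3600 * 24 * 365

rate = 2 ** 30                   # roughly a billion keys tested per second
t60_seconds = 2 ** 60 // rate    # = 2**30 seconds to try every 60-bit key
print(t60_seconds // SECONDS_PER_YEAR)   # 34 (years)
```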
Every 30 bits we add makes it a billion times more difficult. A spy agency like the NSA can crack 60 bit keys using supercomputers, but a 90 bit key is a billion times more difficult to crack, and a 120 bit key would be a further billion times more difficult to crack than a 90 bit key.
That's why older WEP (40 bits) and DES (56 bits) are considered obsolete: we can crack them with desktop computers by trying all combinations. That's why modern crypto, such as AES, uses 128 bits. We can't brute force crack those algorithms.
The length of a key in bits is nothing more than a specification of its size. A 128-bit key takes up 16 bytes of space -- just the raw bits that the processor uses. There is nothing special, no translation.
An encoding is a mapping of bits to something that has meaning. For example, the 8-bit sequence 01100001 (0x61) is the letter "a" in ASCII. Because keys are random data with no meaning, encoding only comes into play when translating the binary data into something printable, for example Base64. Thus, instead of asking "what sequence of bits must I have to write the letter a?" -- which can vary with different encodings -- a key is just "what's this bit? what's this bit? what's this bit?".
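In other words, encodings like hex and Base64 are just two printable views of the same raw bits; a short sketch with an arbitrary 16-byte value standing in for a key:

```python
import base64

key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")   # 16 arbitrary bytes

# The same 128 bits, rendered printable two different ways:
print(key.hex())                         # 000102030405060708090a0b0c0d0e0f
print(base64.b64encode(key).decode())    # AAECAwQFBgcICQoLDA0ODw==

# Decoding either form recovers the identical raw bits:
assert bytes.fromhex(key.hex()) == key
assert base64.b64decode("AAECAwQFBgcICQoLDA0ODw==") == key
```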