Difference between authentication and identification [Crypto and Security perspective]

In the context of communications through a network, an identity is equivalent to the knowledge of a specific piece of data: put briefly, what you can know of a given entity from the outside consists exclusively of the bytes that entity emits, i.e. what it can compute. Since everybody can buy the same kind of PC, differences in computing abilities ultimately lie in what the entities know. E.g. you are different from me, from the StackExchange point of view, only in that you know the password to the 'Jus12' account, and I do not, while I know the password of the 'Tom Leek' account, and you do not.

Identification is about making sure that a given entity is involved and somehow 'active'. For instance, the StackExchange server can make sure that I am alive and kicking by challenging me (my computer) to show my password. Note that the StackExchange server (actually, another server because they use an indirect scheme, but that's a technicality) also knows my password, so when the SE challenge is successfully responded to, the SE server only knows that, at the other end of the line, there is an entity which is either me or the SE server itself. Identification protocols must take care to avoid or at least reliably detect the occurrence of a server induced to talk to itself by a crafty ill-intentioned individual (hereafter designated by the generic term 'attacker').
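To make the "server talking to itself" problem concrete, here is a minimal sketch of a challenge-response exchange. It is an illustration under my own assumptions, not what StackExchange actually does: the names `respond` and `server_check`, the shared secret, and the role labels are all hypothetical. The server sends a fresh random nonce, and the answer is a MAC over the nonce plus a label saying which side produced it, so an answer computed by the server itself cannot be reflected back as if it came from the client.

```python
import hashlib
import hmac
import secrets


def respond(shared_secret: bytes, role: bytes, nonce: bytes) -> bytes:
    """Answer a challenge. The role label binds the answer to one side."""
    return hmac.new(shared_secret, role + b"|" + nonce, hashlib.sha256).digest()


def server_check(shared_secret: bytes, nonce: bytes, response: bytes) -> bool:
    """The server only accepts answers labelled as coming from the client."""
    expected = respond(shared_secret, b"client", nonce)
    return hmac.compare_digest(expected, response)


# Hypothetical secret known to both the user and the server.
secret = b"hypothetical shared secret"
nonce = secrets.token_bytes(16)  # fresh challenge, never reused

client_answer = respond(secret, b"client", nonce)
# What an attacker gets by tricking the server into answering its own challenge.
reflected_answer = respond(secret, b"server", nonce)

print(server_check(secret, nonce, client_answer))     # True
print(server_check(secret, nonce, reflected_answer))  # False
```

The fresh nonce is what shows that the other side is 'active' right now; the role label is one simple way to detect a reflected answer.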

Identification, by itself, is quite useless. What the SE server wants to know is not that I, Tom Leek, exist and am awake; the SE server is quite persuaded of that, and does not care. What the SE server wants is to make sure that I approve of the HTTP requests that I am going to issue. They want authentication: that's identification applied to some other data. Thus, identification is useful insofar as it can be believed to apply to a bunch of data, which then has a "verified" provenance. The link between the identity and the data must be resilient with regard to the outrages that the attacker may inflict on the data. In the case of StackExchange, the attacker is supposed to be fairly weak, because integrity of an HTTP request is assumed: the identification part becomes a cookie, as part of an HTTP request, and the SE server just assumes that the attacker cannot alter the request or the cookie, or copy the cookie and graft it upon a new phony HTTP request.
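As a rough sketch of why a cookie alone is a weak link between identity and data, compare a check that looks only at the cookie with one that requires a MAC over the request body. Everything here is hypothetical (the function names, the per-session key, the request strings); it is not how StackExchange works, and it ignores replay of a genuine request.

```python
import hashlib
import hmac
import secrets

session_cookie = secrets.token_hex(16)  # handed out after identification
session_key = secrets.token_bytes(32)   # hypothetical key, never sent along with requests


def accept_with_cookie(cookie: str, body: bytes) -> bool:
    """Cookie-only check: the request body is not examined at all."""
    return hmac.compare_digest(cookie, session_cookie)


def tag_request(key: bytes, body: bytes) -> bytes:
    """Compute a MAC that ties the session key to this exact body."""
    return hmac.new(key, body, hashlib.sha256).digest()


def accept_with_mac(body: bytes, tag: bytes) -> bool:
    """MAC check: the tag must match the body that actually arrived."""
    return hmac.compare_digest(tag_request(session_key, body), tag)


genuine = b"POST /vote id=42"
phony = b"POST /delete id=42"
tag = tag_request(session_key, genuine)

print(accept_with_cookie(session_cookie, phony))  # True: a copied cookie fits any request
print(accept_with_mac(genuine, tag))              # True
print(accept_with_mac(phony, tag))                # False: the tag does not transfer
```

With the cookie-only check, the link to the data holds only as long as the attacker really cannot copy the cookie or alter the request, which is exactly the assumption described above.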

More thorough authentication usually uses a cryptographically strong linkage, e.g. an SSL/TLS tunnel (often as part of HTTPS). The cryptographic properties of the tunnel imply that the server can be sure that it will be talking to the same entity throughout the SSL session; moreover, the server assumes that the user will not play any identification protocol related to his account password unless this happens over an SSL tunnel in which the server is duly authenticated (that is, the client is sure that it talks to the right server, which is what the server certificate is about, and that whatever data it sends will go to that server only, so it is authentication). Under these conditions, if the server can identify me within that tunnel, then the identification covers whatever data is sent through the tunnel afterwards: the tunnel is the link between identification and data, so this is authentication.

Non-repudiation is a characteristic of some authentication protocols, in which the link between identity and data can be verified not only by whoever is, interactively, at the other end of the line, but also by a later third party, for instance a judge. Password-based schemes do not normally offer that property, because whoever verifies the password must also know it more or less directly, and thus could frame the alleged emitter. Non-repudiation requires mathematics. Note that, in an SSL tunnel, the client authenticates the server through its certificate, which is full of asymmetric cryptography, but this does not grant non-repudiation: the client is sure that whatever data it receives from the server really comes from the server, but there is nothing that the client could record, which would convince a judge that the server really sent that data. To get non-repudiation, you need digital signatures. Without non-repudiation, one can get authentication with algorithms known as Message Authentication Codes, which are computationally lighter. Confusingly, there is a widespread (but incorrect) tradition of calling MACs "signatures".
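A small sketch of why a MAC cannot give non-repudiation, using Python's standard `hmac` module. The key and the messages are made up for illustration. Since the verifier holds the same key as the sender, the verifier can compute a perfectly valid tag on a message the sender never wrote, so a recorded (message, tag) pair proves nothing to a judge.

```python
import hashlib
import hmac

# Hypothetical key shared by the sender and the verifier.
shared_key = b"key known to both sender and verifier"


def mac(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(mac(key, message), tag)


# The sender really sent this message with this tag.
sent = b"pay 10 to Bob"
sent_tag = mac(shared_key, sent)
print(verify(shared_key, sent, sent_tag))  # True

# The verifier fabricates a message the sender never wrote.
framed = b"pay 1000 to Bob"
framed_tag = mac(shared_key, framed)
print(verify(shared_key, framed, framed_tag))  # True: indistinguishable from a real one
```

With a digital signature, only the holder of the private key can produce the value that everybody else verifies with the public key, which is what makes the record convincing to a third party.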

Summary:

  • Identification: the specific entity E is involved and responding.
  • Integrity: whatever piece of data was received has been sent as-is by some entity E' and was not altered.
  • Authentication: identification and integrity at the same time (E = E').
  • Non-repudiation: authentication that can convince a judge.

Identification is a way to describe the principal, e.g. username, email, First + Last name, etc. The principal is the user.

Authentication is a way to prove that the principal is who they say they are.

So for instance when I log into a system I identify myself with my username (Hi, I'm SteveS), and I authenticate myself by providing a password only I know and the system can validate (I'm SteveS because my password is "foo").
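Here is a minimal sketch of those two steps in code, assuming a toy in-memory account table (the names `accounts`, `register` and `login`, and the iteration count, are only illustrative). Looking up the username is the identification step; checking the password against the stored salted hash is the authentication step.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # illustrative work factor, not a recommendation

# Hypothetical account table: username -> (salt, password hash).
accounts: dict[str, tuple[bytes, bytes]] = {}


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str) -> None:
    salt = secrets.token_bytes(16)
    accounts[username] = (salt, hash_password(password, salt))


def login(username: str, password: str) -> bool:
    record = accounts.get(username)  # identification: which principal is claimed
    if record is None:
        return False
    salt, stored = record
    # Authentication: prove the claim by showing knowledge of the password.
    return hmac.compare_digest(stored, hash_password(password, salt))


register("SteveS", "foo")
print(login("SteveS", "foo"))  # True
print(login("SteveS", "bar"))  # False
print(login("Nobody", "foo"))  # False
```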

A good real-world example is the use of driver's licenses. I can stick a name tag on my shirt and identify myself as Joe, but if someone needed proof that I'm Joe, I show them my driver's license. Authentication is done by comparing the photo on the license to the person.


My answer is short, but this is how I always read the two.

Identifying is saying "I'm Joe Schmoe!" Authenticating is proving it with something (password, birth certificate, DNA results).

In IT, we need each person to have separate IDs so we can properly identify them and assign them rights (I'm Joe Schmoe! Me too! Me three! but we don't all need the same access). We also need people to prove they are who they claim to be by authenticating themselves (remember the three factors of authentication: something you know, have, or are).