Do sessions really violate RESTfulness?
Cookies are not for authentication. Why reinvent a wheel? HTTP has well-designed authentication mechanisms. If we use cookies, we fall into using HTTP as a transport protocol only, thus we need to create our own signaling system, for example, to tell users that they supplied wrong authentication (using HTTP 401 would be incorrect as we probably wouldn't supply Www-Authenticate
to a client, as HTTP specs require :) ). It should also be noted that Set-Cookie
is only a recommendation for client. Its contents may be or may not be saved (for example, if cookies are disabled), while Authorization
header is sent automatically on every request.
Another point is that, to obtain an authorization cookie, you'll probably want to supply your credentials somewhere first? If so, then wouldn't it be RESTless? Simple example:
- You try
GET /a
without cookie - You get an authorization request somehow
- You go and authorize somehow like
POST /auth
- You get
Set-Cookie
- You try
GET /a
with cookie. But doesGET /a
behave idempotently in this case?
To sum this up, I believe that if we access some resource and we need to authenticate, then we must authenticate on that same resource, not anywhere else.
First, let's define some terms:
RESTful:
One can characterise applications conforming to the REST constraints described in this section as "RESTful".[15] If a service violates any of the required constraints, it cannot be considered RESTful.
according to wikipedia.
stateless constraint:
We next add a constraint to the client-server interaction: communication must be stateless in nature, as in the client-stateless-server (CSS) style of Section 3.4.3 (Figure 5-3), such that each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server. Session state is therefore kept entirely on the client.
according to the Fielding dissertation.
So server side sessions violate the stateless constraint of REST, and so RESTfulness either.
As such, to the client, a session cookie is exactly the same as any other HTTP header based authentication mechanism, except that it uses the Cookie header instead of the Authorization or some other proprietary header.
By session cookies you store the client state on the server and so your request has a context. Let's try to add a load balancer and another service instance to your system. In this case you have to share the sessions between the service instances. It is hard to maintain and extend such a system, so it scales badly...
In my opinion there is nothing wrong with cookies. The cookie technology is a client side storing mechanism in where the stored data is attached automatically to cookie headers by every request. I don't know of a REST constraint which has problem with that kind of technology. So there is no problem with the technology itself, the problem is with its usage. Fielding wrote a sub-section about why he thinks HTTP cookies are bad.
From my point of view:
- authentication is not prohibited for RESTfulness (otherwise there'd be little use in RESTful services)
- authentication is done by sending an authentication token in the request, usually the header
- this authentication token needs to be obtained somehow and may be revoked, in which case it needs to be renewed
- the authentication token needs to be validated by the server (otherwise it wouldn't be authentication)
Your point of view was pretty solid. The only problem was with the concept of creating authentication token on the server. You don't need that part. What you need is storing username and password on the client and send it with every request. You don't need more to do this than HTTP basic auth and an encrypted connection:
- Figure 1. - Stateless authentication by trusted clients
You probably need an in-memory auth cache on server side to make things faster, since you have to authenticate every request.
Now this works pretty well by trusted clients written by you, but what about 3rd party clients? They cannot have the username and password and all the permissions of the users. So you have to store separately what permissions a 3rd party client can have by a specific user. So the client developers can register they 3rd party clients, and get an unique API key and the users can allow 3rd party clients to access some part of their permissions. Like reading the name and email address, or listing their friends, etc... After allowing a 3rd party client the server will generate an access token. These access token can be used by the 3rd party client to access the permissions granted by the user, like so:
- Figure 2. - Stateless authentication by 3rd party clients
So the 3rd party client can get the access token from a trusted client (or directly from the user). After that it can send a valid request with the API key and access token. This is the most basic 3rd party auth mechanism. You can read more about the implementation details in the documentation of every 3rd party auth system, e.g. OAuth. Of course this can be more complex and more secure, for example you can sign the details of every single request on server side and send the signature along with the request, and so on... The actual solution depends on your application's need.
Actually, RESTfulness only applies to RESOURCES, as indicated by a Universal Resource Identifier. So to even talk about things like headers, cookies, etc. in regards to REST is not really appropriate. REST can work over any protocol, even though it happens to be routinely done over HTTP.
The main determiner is this: if you send a REST call, which is a URI, then once the call makes it successfully to the server, does that URI return the same content, assuming no transitions have been performed (PUT, POST, DELETE)? This test would exclude errors or authentication requests being returned, because in that case, the request has not yet made it to the server, meaning the servlet or application that will return the document corresponding to the given URI.
Likewise, in the case of a POST or PUT, can you send a given URI/payload, and regardless of how many times you send the message, it will always update the same data, so that subsequent GETs will return a consistent result?
REST is about the application data, not about the low-level information required to get that data transferred about.
In the following blog post, Roy Fielding gave a nice summary of the whole REST idea:
http://groups.yahoo.com/neo/groups/rest-discuss/conversations/topics/5841
"A RESTful system progresses from one steady-state to the next, and each such steady-state is both a potential start-state and a potential end-state. I.e., a RESTful system is an unknown number of components obeying a simple set of rules such that they are always either at REST or transitioning from one RESTful state to another RESTful state. Each state can be completely understood by the representation(s) it contains and the set of transitions that it provides, with the transitions limited to a uniform set of actions to be understandable. The system may be a complex state diagram, but each user agent is only able to see one state at a time (the current steady-state) and thus each state is simple and can be analyzed independently. A user, OTOH, is able to create their own transitions at any time (e.g., enter a URL, select a bookmark, open an editor, etc.)."
Going to the issue of authentication, whether it is accomplished through cookies or headers, as long as the information isn't part of the URI and POST payload, it really has nothing to do with REST at all. So, in regards to being stateless, we are talking about the application data only.
For example, as the user enters data into a GUI screen, the client is keeping track of what fields have been entered, which have not, any required fields that are missing etc. This is all CLIENT CONTEXT, and should not be sent or tracked by the server. What does get sent to the server is the complete set of fields that need to be modified in the IDENTIFIED resource (by the URI), such that a transition occurs in that resource from one RESTful state to another.
So, the client keeps track of what the user is doing, and only sends logically complete state transitions to the server.
First of all, REST is not a religion and should not be approached as such. While there are advantages to RESTful services, you should only follow the tenets of REST as far as they make sense for your application.
That said, authentication and client side state do not violate REST principles. While REST requires that state transitions be stateless, this is referring to the server itself. At the heart, all of REST is about documents. The idea behind statelessness is that the SERVER is stateless, not the clients. Any client issuing an identical request (same headers, cookies, URI, etc) should be taken to the same place in the application. If the website stored the current location of the user and managed navigation by updating this server side navigation variable, then REST would be violated. Another client with identical request information would be taken to a different location depending on the server-side state.
Google's web services are a fantastic example of a RESTful system. They require an authentication header with the user's authentication key to be passed upon every request. This does violate REST principles slightly, because the server is tracking the state of the authentication key. The state of this key must be maintained and it has some sort of expiration date/time after which it no longer grants access. However, as I mentioned at the top of my post, sacrifices must be made to allow an application to actually work. That said, authentication tokens must be stored in a way that allows all possible clients to continue granting access during their valid times. If one server is managing the state of the authentication key to the point that another load balanced server cannot take over fulfilling requests based on that key, you have started to really violate the principles of REST. Google's services ensure that, at any time, you can take an authentication token you were using on your phone against load balance server A and hit load balance server B from your desktop and still have access to the system and be directed to the same resources if the requests were identical.
What it all boils down to is that you need to make sure your authentication tokens are validated against a backing store of some sort (database, cache, whatever) to ensure that you preserve as many of the REST properties as possible.
I hope all of that made sense. You should also check out the Constraints section of the wikipedia article on Representational State Transfer if you haven't already. It is particularly enlightening with regard to what the tenets of REST are actually arguing for and why.