Authentication for RESTful APIs

Lately I've been working on designing authentication (authN) and authorization (authZ) services for an API Gateway layer sitting on top of a collection of supposedly RESTful APIs written by a diverse and disconnected population of developers.

One of the many challenges I've faced is that "REST" means different things to different people. I've been looking for a simple way to explain to developers what a high quality RESTful API looks like and how it behaves. While I have found some good material, I felt I needed to pull together a few different concepts, so I wrote this.

Why is being fully RESTful important? It turns out that it is harder to design authentication and authorization services for poorly designed and implemented RESTful APIs. First I want to discuss RESTful APIs in general, so we can agree on what they are and are not. Then I will explain why it is harder to implement authN/Z for weakly RESTful APIs.

Reading the documentation for supposedly "RESTful" APIs could lead one to believe that all an API needs to be RESTful is to deliver JSON over HTTP. Unfortunately, this largely misses the point of REST. But it is also really all that's common between the vast majority of "RESTful" APIs.

Representational State Transfer (REST) was developed by Roy Fielding, and I'm fortunate to work with one of his students. I've been picking that colleague's brain while working with him to provide authN/Z for an API he is writing.

Definition of RESTful

REST defines 6 criteria for services:

1. Services are delivered in a client-server model, separated by API boundaries over a network. Clients handle presentation; servers handle logic and storage.
2. Communication with services is stateless, which means every request contains all the information required to service the request.
3. Messages should be cacheable by the client, server, or any proxy in between, or, if not cacheable, mark themselves accordingly.
4. Clients should not be able to tell whether they are connected directly to the source of the service. There may be any number of layers in between, to allow for load balancing, caching, etc.
5. Code for presentation can be provided to clients on demand.
6. Data and functions are accessed via a uniform interface.

HTTP and networked APIs give you criteria 1-5. The internet is client-server, HTTP is stateless and allows caching, intermediaries such as proxies and load balancers provide the layering, and services can deliver JavaScript to aid in presentation by the client.

The problem is criterion 6: the uniform interface for data and functions.

This sixth criterion can be further broken down into 4 separate sub-criteria:

Uniform identification of resources in URIs on the web, and in representational formats, like JSON
Manipulation (Create, Read, Update, Delete or "CRUD") through resource representations and their attached metadata
Messages must be self-descriptive, including enough information to know how to use them, like MIME types
Hypermedia As The Engine Of Application State (HATEOAS)

Hypermedia as the Engine of Application State
HATEOAS, or lack thereof, is where most APIs fail at being RESTful. To determine if an API is using HATEOAS, ask yourself a simple question, "What do I need to know ahead of time in order to use the service?" To be HATEOAS, the answer is: "The domain and protocol". With these two pieces of information, it should be possible to fully explore the service. Basically as soon as you feel a need to document your service, you're probably not using HATEOAS and may be rapidly diverging away from REST.

It turns out there is a really solid REST service out there for us to look at. It's called "The World Wide Web". Think about Amazon. What do you need to know to use it? You need to know the protocol, HTTP, and the domain, amazon.com. You don't need documentation. You launch a web browser which you know supports HTTP and you type in amazon.com.

The service then provides you with a list of links for books, DVDs and loads of other medium sized dry goods. You don't need documentation to know that searching is done with /search?p=stuff and books are at /products/books because each HTML page returned from amazon.com contains all the data needed to easily make a decision about what the service does and how to proceed to the next step.

How many APIs behave like this? Virtually all have documentation that lists every type of resource and every action that can be performed. But when you do a GET to the root domain for the service, the service doesn't provide hyperlinks to navigate its data and functionality. Resources don't contain URIs for related resources; instead they rely on application-specific identifiers, such as an id or username field, that must be supported by the client.
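
To make the contrast concrete, here is a minimal sketch in Python of the same order resource represented both ways. The field names, the `_links` key and the URIs are all hypothetical, chosen only for illustration and not taken from any real API.

```python
from typing import Optional

# Hypothetical representations of one order; none of these names come from a real API.

# Application-specific identifiers: the client must already know, from
# documentation, how to turn customer_id into a URL.
order_with_ids = {
    "id": 1234,
    "customer_id": 42,
    "status": "shipped",
}

# Hypermedia style: related resources and available actions are given as URIs.
order_with_links = {
    "status": "shipped",
    "_links": {
        "self": {"href": "/orders/1234"},
        "customer": {"href": "/customers/42"},
        "cancel": {"href": "/orders/1234", "method": "DELETE"},
    },
}


def link_for(representation: dict, rel: str) -> Optional[str]:
    """Return the URI for a link relation, or None if there is no such link."""
    link = representation.get("_links", {}).get(rel)
    return link["href"] if link else None
```

With the second representation a generic client can ask `link_for(order_with_links, "customer")` and get a URI back. With the first, the same call returns None and the client needs out-of-band knowledge to go any further.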

Richardson Maturity Model
The Richardson Maturity Model (RMM) for evaluating REST services is a straightforward method for scoring RESTful services. The model defines 4 levels, 0-3. I've included links to the best article I've found to date that explains the levels in a very simple manner with a clear example. The criteria for each level are as follows:

Level 0
HTTP transport and Remote Procedure Call (RPC) style transactions.

Level 1

Resources. Rather than making all requests to a single service endpoint, we now start talking to individual resources.

Level 2

HTTP verbs for Create, Read, Update, Delete (CRUD) using HTTP POST, GET, PUT, DELETE respectively.

Level 3

Hypermedia Controls (HATEOAS), or simply adding URIs for resources in responses so it's obvious what to do next.
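
Because each level builds on the one below it, scoring an API amounts to walking a short checklist and stopping at the first failure. The sketch below illustrates that idea in Python; the `ApiTraits` class and its field names are my own shorthand for the levels above, not part of the RMM itself.

```python
from dataclasses import dataclass


@dataclass
class ApiTraits:
    """Hypothetical checklist; each flag corresponds to one RMM level."""
    exposes_resources: bool     # level 1: individual resource URIs
    uses_http_verbs: bool       # level 2: GET/POST/PUT/DELETE carry the action
    has_hypermedia_links: bool  # level 3: responses link to what can be done next


def rmm_level(traits: ApiTraits) -> int:
    """Return the highest RMM level reached; a level only counts if all lower ones pass."""
    level = 0
    checks = (
        traits.exposes_resources,
        traits.uses_http_verbs,
        traits.has_hypermedia_links,
    )
    for passed in checks:
        if not passed:
            break
        level += 1
    return level
```

An API that uses HTTP verbs but hides everything behind RPC-style endpoints still scores 0 here, because it never passed level 1.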

RESTful-ish
Let's look at 3 APIs that claim to be RESTful, none of which has achieved RMM level 3.

Mandrill
This API claims to be "mostly RESTful", but it's RMM level 0.

Endpoints are RPC-style functions, not resources. To get a user, you call info, with a user ID argument, rather than performing a GET on a resource like /user/:id. This fails RMM level 1.
All requests are HTTP POST, failing level 2
There are no links, and no self-documentation in responses about actions that can be taken, failing level 3
The Mandrill API is RESTful only in that it works over the web, and therefore satisfies criteria 1-5 automatically. Nothing at the level of the application itself is RESTful. It is just an RPC API that communicates using JSON.

Digital Ocean

They claim to be "fully RESTful", even going so far as to link to the Wikipedia article for REST.

Although it uses HTTP methods properly and returns correct status codes, it lacks hypermedia controls
It has poor support for generic REST clients. This makes it level 1-2

Twitter

Endpoints are resources, passing level 1
HTTP verbs are used, although their usage doesn't follow good practices. For example, to delete a tweet, a POST request must be issued to /statuses/destroy/:id, therefore encoding the action in the URI rather than the verb. It would be better to issue a DELETE to /statuses/:id. The API arguably fails level 2
There are no links, and no self-documentation in responses, failing level 3
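
The difference between encoding the action in the URI and encoding it in the verb can be sketched in a few lines of Python. This is an illustrative handler over a made-up in-memory store, not how any real service is implemented. The point is that a level 2 handler only needs to know about the resource, with the verb selecting the operation.

```python
import re

# Hypothetical in-memory store standing in for a tweet service.
tweets = {"42": "hello world"}


def handle(method: str, path: str) -> int:
    """Dispatch on the HTTP verb for a /statuses/<id> resource; returns a status code."""
    match = re.fullmatch(r"/statuses/(\w+)", path)
    if match is None:
        return 404  # not a resource this handler knows about
    tweet_id = match.group(1)
    if tweet_id not in tweets:
        return 404
    if method == "GET":
        return 200
    if method == "DELETE":
        del tweets[tweet_id]
        return 204
    return 405  # the resource exists but does not support this verb
```

Under this scheme a path such as /statuses/destroy/42 is simply not a resource, so the sketch answers 404 for it.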

Why being fully RESTful is important

There are two major reasons to be RMM level 3 when going to the trouble of building APIs. The first major reason is to make your API users' lives better and easier. This will make them want to adopt your API over a competitor's. The second reason, while selfish, is nonetheless true: RMM level 0 APIs make it hard to design authentication and authorization systems to support them. Let's look at each of these major reasons in more detail.

The official way of consuming many APIs is to use one of a handful of client libraries created by the service provider, or build your own by reading the documentation. It's common that a library is not provided for your language of choice, and any third party versions aren't actively being developed or the documentation is poor, thus making full use of the API's abilities impossible. And when the libraries do exist, they all do essentially the same thing: they make web requests, and based on the data they receive, they figure out how to make more requests. This is all done hidden from view.

To understand how bad this is, imagine a world where each website published its own browser that you had to use to browse the site, just because it had decided to publish content in a different format, or had an obscure or proprietary way of linking to other content.

Some client libraries published by API providers claim additional features (over what your own implementation might have) such as smart caching, performance enhancements for slow connections on mobile data, and 'live' connections. But all of these are possible with open web standards, and a well written generic REST library could support all of them and more, for any service that provided full REST support.

Instead, imagine a world like this:

Writing an application that consumed multiple web services, and all the implementation specific code you had to write was the list of links you wanted to traverse.
Not having to re-write or update your library when changes were made to the target API, being able to take advantage of performance improvements in the API with no additional development of your library.
Querying a web service being no different to using an Object Relational Mapper (ORM) to query your database, even across services from multiple providers.
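
As a rough illustration of the first point, here is a sketch of a generic client in Python where the only service-specific input is the list of link relations to follow. The service is faked as an in-memory dictionary so the example runs offline. The `_links` layout, the relation names and the URIs are all assumptions made for this sketch.

```python
from typing import Callable, Sequence

# Hypothetical service, modelled as URI -> representation so the sketch runs offline.
FAKE_SERVICE = {
    "/": {"_links": {"orders": {"href": "/orders"}}},
    "/orders": {"_links": {"latest": {"href": "/orders/1234"}}},
    "/orders/1234": {
        "status": "shipped",
        "_links": {"customer": {"href": "/customers/42"}},
    },
    "/customers/42": {"name": "Ada", "_links": {}},
}


def traverse(fetch: Callable[[str], dict], rels: Sequence[str], root: str = "/") -> dict:
    """Start at the root and follow each link relation in turn."""
    resource = fetch(root)
    for rel in rels:
        # The client never builds a URI itself; it only follows what the service returned.
        resource = fetch(resource["_links"][rel]["href"])
    return resource


customer = traverse(FAKE_SERVICE.__getitem__, ["orders", "latest", "customer"])
```

In a real client `fetch` would perform an HTTP GET and parse the response. If the provider later moved customers to a different URI, the list of relations passed to `traverse` would not need to change.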

All of this is possible. But not with the generally poor quality of level 0-2 RESTful APIs out there. We need more standards and better compliance, but they aren't going to come until developers are actively using, providing, and requesting REST APIs.

Finally, what does any of this have to do with authentication and authorization? The answer is pretty simple. If a "RESTful" API is only level 0, how can authorization be tied to resources (which it most often needs to be)? The simple answer is that you can't tie authorization to resources when the resources are hidden behind RPC-style calls. Or, to be more precise, authorization has to happen in the service because the service is the only part that understands the resources.

Instead imagine a world where you can simply map a group to a resource URI. In such a world, you can easily implement authentication and authorization in your API management layer, say via an API Gateway. But without resource URIs, all you can do is send the service a list of groups for the caller's identity. This makes the service bigger and more complex, and makes it impossible to build a portfolio of microservices that share a common authentication and authorization layer.
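
Mapping groups to resource URIs at the gateway could look something like the following Python sketch. The policy table, group names and URI patterns are invented for illustration, and the matching uses the standard library's `fnmatchcase`, whose `*` also matches `/`. A real gateway would likely want stricter path matching.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: (HTTP method, URI pattern, groups allowed to call it).
POLICY = [
    ("GET", "/orders/*", {"support", "sales"}),
    ("DELETE", "/orders/*", {"sales"}),
    ("GET", "/customers/*", {"support"}),
]


def is_authorized(groups: set, method: str, path: str) -> bool:
    """Allow the request if a matching rule lists one of the caller's groups; deny by default."""
    for rule_method, pattern, allowed in POLICY:
        if rule_method == method and fnmatchcase(path, pattern):
            if groups & allowed:
                return True
    return False
```

Note that the gateway needs nothing from the service beyond the method and the URI. With an RPC-style API, where every call is a POST to the same endpoint, there is no pattern to write a rule against.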

There are four approaches to implementing authentication and authorization with microservices. Approach 1 is ideal, but is really only possible with RMM level 3.

Approach 1

Do authN/Z globally

Layer	AuthN	AuthZ
API Gateway	X	X
Microservice

pros

makes developer happy :)
fewer implementation errors
less risk of forgetting to handle it at all
centrally defined and handled
smaller micro services
less repetition in the code in the micro services

cons

service cannot have fine grained object permissions
all or nothing authorization
global auth bottleneck

Approach 2

Do authentication globally, and authorization in every microservice

Layer	AuthN	AuthZ
API Gateway	X
Microservice		X

pros

global authentication is easier to manage/control
fine grained object permissions are possible

cons

slightly more code in the micro services
needs some effort to have an overview what you can do with which permission
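
A minimal sketch of this split in Python might look like the following. The token table, the ownership table and the `X-Authenticated-User` header name are all assumptions made for illustration. A real deployment would validate signed tokens, and the microservice would have to accept that header only from the gateway.

```python
from typing import Optional

# Hypothetical data: tokens the gateway trusts, and which user owns each document.
TOKENS = {"token-abc": "alice", "token-xyz": "bob"}
DOCUMENT_OWNERS = {"/documents/1": "alice", "/documents/2": "bob"}


def gateway_authenticate(headers: dict) -> Optional[dict]:
    """Gateway step (authN): resolve the bearer token to a user, or return None to reject."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    user = TOKENS.get(token) if scheme == "Bearer" else None
    if user is None:
        return None
    # Forward the verified identity; the header name is an assumption of this sketch.
    return {"X-Authenticated-User": user}


def service_authorize(forwarded: dict, path: str) -> bool:
    """Microservice step (authZ): only the owner of a document may access it."""
    owner = DOCUMENT_OWNERS.get(path)
    return owner is not None and owner == forwarded.get("X-Authenticated-User")
```

The gateway never needs to know who owns which document, and the microservice never sees a raw credential. That is where the fine grained object permissions come from, at the cost of a little extra code in each service.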

Approach 3

Do authentication and authorization in every microservice

Layer	AuthN	AuthZ
API Gateway
Microservice	X	X

pros

fine grained object permissions are possible
different user authentication mechanisms are possible for different microservices

cons

error prone
many repetitions
bigger micro services
needs some effort to have an overview what you can do with which permission
no happy developer :-(

Approach 4

Do authN in every microservice, and authZ globally

Layer	AuthN	AuthZ
API Gateway		X
Microservice	X

pros

is listed only for completeness.

cons

it does not make sense
worst of both worlds
no fine grained object permissions
error prone
tedious repetitive authentication for consumer

Paul Ericson