What is SAML (and what has it go to do with OAuth) ?

One of the outputs of the Linkey project is an open source PHP library that implements the SAML bearer assertion extension (draft 13 at time of writing) to the OAuth 2.0 specification.

But what is SAML?

SAML, or Security Assertion Markup Language, is an XML-based framework that allows identity and security information to be shared across security domains – for example between two websites.

Information is exchanged through the concept of assertion tokens, which are XML documents that state facts about the assertion’s subject. For example, a SAML assertion about me might look like:

<saml:Response ID="_257f9d9e9fa14962c0803903a6ccad931245264310738" IssueInstant="2012-09-03T18:45:10Z" Version="2.0">

    <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">
        https://www.lincoln.ac.uk
    </saml:Issuer>

    <saml:Status>
        <saml:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
    </saml:Status>

    <saml:Assertion ID="_3c39bc0fe7b13769cab2f6f45eba801b1245264310738" IssueInstant="2012-09-03T18:45:10Z" Version="2.0">
    <saml:Issuer Format="urn:oasis:names:tc:SAML:2.0:nameid-format:entity">
        https://www.lincoln.ac.uk
    </saml:Issuer>

    <saml:Subject>
        <saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified">
            abilbie@lincoln.ac.uk
        </saml:NameID>
        <saml:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer">
            <saml:SubjectConfirmationData NotOnOrAfter="2012-09-03T18:50:10Z" Recipient="https://sso.lincoln.ac.uk"/>
        </saml:SubjectConfirmation>
    </saml:Subject>

    <saml:Conditions NotBefore="2012-09-03T18:45:10Z" NotOnOrAfter="2012-09-03T18:50:10Z">
        <saml:AudienceRestriction>
            <saml:Audience>https://sso.lincoln.ac.uk</saml:Audience>
        </saml:AudienceRestriction>
    </saml:Conditions>

    <saml:AuthnStatement AuthnInstant="2012-09-03T18:45:10Z">
        <saml:AuthnContext>
            <saml:AuthnContextClassRef>urn:oasis:names:tc:SAML:2.0:ac:classes:unspecified</saml:AuthnContextClassRef>
        </saml:AuthnContext>
    </saml:AuthnStatement>

    <saml:AttributeStatement>

        <saml:Attribute Name="name">
            <saml:AttributeValue xsi:type="xs:anyType">Alex Bilbie</saml:AttributeValue>
        </saml:Attribute>

        <saml:Attribute Name="network_id">
            <saml:AttributeValue xsi:type="xs:anyType">abilbie</saml:AttributeValue>
        </saml:Attribute>

        <saml:Attribute Name="division">
            <saml:AttributeValue xsi:type="xs:anyType">ICT Services</saml:AttributeValue>
        </saml:Attribute>

        <saml:Attribute Name="staff_directory_url">
            <saml:AttributeValue xsi:type="xs:anyType">http://staff.lncd.org/abilbie</saml:AttributeValue>
        </saml:Attribute>

    </saml:AttributeStatement>
    </saml:Assertion>
</saml:Response>

This assertion here states that my name is Alex Bilbie, my network ID is abilbie, my division is ICT Services and that my staff directory page is http://staff.lncd.org/abilbie.

Generally the assertion will be provided by an “identity provider” (a.k.a an IDP) and consumed by a “service provider” (a.ka. a SDP).

SAML is often used in single sign on environments. Let’s consider a student trying to sign in to the student union website:

  1. The student (the user) would visit student union website (the SDP)
  2. The SDP would realise the user isn’t yet signed in and so redirects the user to the single sign on endpoint (the IDP)
  3. The user authenticates with the IDP
  4. The IDP redirects the user back to the SDP with the assertion (by using a HTTP POST)
  5. The SDP consumes the assertion, validates it and signs the user in

How does SAML compare to OAuth?

For reference, this is how OAuth is used in a single sign on environment:

  1. The student (the user) would visit student union website (the client)
  2. The client would realise the user isn’t yet signed in and so redirects the user to the single sign on endpoint (the authentication server)
  3. The user authenticates with the authentication server
  4. The user grants the client access to their private data
  5. The authentication server redirects the user back to the client with an authorisation code in the URL
  6. The client exchanges the authorisation code for an access token with the authentication server by itself authenticating
  7. The client then sends a request to the resource server to request the user’s private data. The access token represents permission granted by the user for the client to act on their behalf.

Now the OAuth flow looks longer it contains one important stage which is a requirement of the specification, specifically step 4 which is where the user is explicitly asked if they want to allow the client to access their data and also what data will be accessed. You could emulate this in the SAML flow but I think the upfront honesty so to speak in the OAuth flow is a huge advantage for users. Also most OAuth authentication servers allow users to revoke access to clients as well (i.e. their access token for that client is destroyed which immediately cuts access).

In both flows, the client/SDP itself is verified. In SAML this is done through certificate exchanges, in OAuth the client’s ID is validated at step 3 and step 6.

How does the assertion extension work?

Looking at the OAuth sign-in flow above, at step 6 the client exchanges the authorisation code for an access token. There are a number of ways this can step can actually work and each different method is called an authorisation grant type. The grant described above (and the grant is the generally used in most OAuth implementations) is the authorization_code grant.

The assertion extension defines a new authorisation grant type of urn:ietf:params:oauth:client-assertion-type:saml2-bearer. There is also another parameter called client_assertion which is a base64 encoded representation of the XML assertion. The client/SDP’s own credentials are in the assertion instead of being sent as client_id and client_secret parameters as in the authorisation code grant.

An example request would look like:

POST /token.oauth2 HTTP/1.1
Host: authz.example.net
Content-Type: application/x-www-form-urlencoded

grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Asaml2-
bearer&assertion=PEFzc2VydGlvbiBJc3N1ZUluc3RhbnQ9IjIwMTEtMDU
[...omitted for brevity...]aG5TdGF0ZW1lbnQ-PC9Bc3NlcnRpb24-

The specification includes a number of elements that must be included the assertion in order to provide the authorisation server with all the details it needs about the request.

As I understand it there is also an optional scope parameter that can be passed in the request any scopes that the client/SDP requires.

Update w/c 3rd September

I’ve spent the last few weeks working away on the OAuth PHP library which now includes a resource server as well as an authentication server. I’ve also started merging in Phil Sturgeon’s OAuth 2.0 client code library, which when I’ve finished, will result in a mean, lean PHP library for working with any aspect of OAuth 2.0 (authentication, resource sharing or client side). Both server classes are now fully unit tested and I’m at 90% code coverage for all of the methods. I’ve started writing documentation for the library too and I’m going to write a tutorial on how to build an OAuth secured API server in the CodeIgniter framework.

On the UAG side, both Tim and I have been reading Mastering Microsoft Forefront UAG 2010 Customisation (Amazon link). I’ve now got some ideas about how we can easily integrate the university’s OAuth server that I developed with our UAG install. More on this soon.

Update w/c 13th August

I’ve not blogged in a few weeks so I’m just going to give a quick update on what I’m working on.

First of all, I’ve finished porting my old CodeIgniter based OAuth server code to a composer library. It follows the latest version of the specification (draft 31) and supports the authorisation code and implicit grants. I’ve also started work on some unit tests to ensure that the code is bulletproof. There is a small bit more work to do to support the client resource owner password credentials grant and the client credentials grant. Once I’ve completed this work and finish the unit tests (I’m currently at about 60% code coverage) I’ll make a formal announcement on this blog, to the OAuth working group mailing list and to the PHP community.

I’ve been working with Paul Stainthorp in the library to capture examples of poor user experience when accessing electronic resources in the library. I’ve made videos showing how difficult some resources are to access, and I’ve also been going through recent NSS and level 1 and 2 surveys to find examples of students highlighting problems they’ve had accessing the resources.

OAuth 2.0 and the road to hell

Eran Hammer who, until a few days ago, was the main editor of OAuth 2.0 has written this very damning blog post about the protocol

http://hueniverse.com/2012/07/oauth-2-0-and-the-road-to-hell/

I’ve been following the working group for about a year now and there has been an awful lot of bickering and mindless discussion.

As I’ve stated already, I’ve spent the last few weeks working on a brand new library that implements the current spec (draft 30) and I do feel myself agreeing with some of his points:

  • Yes the specification mess, I’ve highlighted and sticky noted the printed document to death and the flow between different sections is really bad
  • Some new features such as refresh tokens are overly complicated and don’t benefit the protocol, and it means clients now have to maintain access token state added to complexity
  • Just reading the spec you can tell how much it has been influenced enterprise – many features are very open ended so that you bolt OAuth onto something else (or something else onto OAuth)
  • Bearer tokens over SSL/TLS by themselves are bad and I think signatures need to come back
    • Basically if I steal someone’s access token I can use that wily nilly, however with signatures the entire request is signed with the client’s secret key so unless the secret key is leaked you can’t just use access tokens by themselves

I disagree however that the protocol (in it’s current state) is complicated, they’ve done great work at making it a 3 (or 4) legged protocol to just 2 legs and everyone agrees that bit of the protocol has been done well.

In terms of his suggestion that alternatives to OAuth that are outside the reach of the IETF might crop up I’m quite interested in this and if something crops up I’ll stick my nose in, and if one doesn’t then I’d be interested in having a go in writing one myself as an output of Linkey.

In terms of the extension documents (which include the assertions (SAML) extension, i don’t know enough about SAML to make any sort of informed opinion about this, however I’ve also still yet to see any public implementation of it.

Quick update

I’ve not blogged in a little while so I thought I’d give an update on what is happening with Linkey.

First I’ve been working with Paul in the library to capture (videos and screenshots) examples of poor user experience accessing various resources including electronic journals and databases, printers, and users’ library accounts.

I’ve also been trawling through last year’s NSS results to find examples of where students particularly struggled to access IT and library resources. I will then produce visualisations of these.

It is important to capture these examples so that hopefully by the end of the project we can show how we’ve improved the situation.

This week we launched our Ezproxy service [note: requires authentication] which has already improved accessing a number of resources including articles from the American Chemical Society, EBSCO Publishing and Oxford University Press.

I’ve also been working away at the new PHP OAuth library. Originally this was just going to be code for implementing an authentication server and a resource server, but now it is also going to include client code too thanks to Phil Sturgeon offering to integrate his existing code. As a result we’re going to have a lean and mean library that can help anyone work with any aspect of OAuth 2.