Oh Data

Since the recent PDC09 I have been obsessing over OData, and I need to write this post just to get it out of my head.  Microsoft has made it obvious that they are taking this protocol very seriously by integrating it into SharePoint, Visual Studio, RIA Services, and PowerPivot, and I expect to see it in the next version of Office and in the Dynamics products.  I think it is a great direction to be headed, but I also have concerns.

Let me start by quoting the most important paragraph from the document that explains how OData extends AtomPub.

AtomPub, as specified in [RFC5023], in combination with the extensions defined in this document, is appropriate for use in Web services which need a uniform, flexible, general purpose interface for exposing create retrieve update delete (CRUD) operations on a data model to clients. It is less suited to Web services that are primarily method-oriented or in which data operations are constrained to certain prescribed patterns.

Let me paraphrase that.  If all your service is going to do for your client is “CRUD” on generic data, then OData is appropriate.  As long as everyone keeps this in mind going forward, we should not run into too much trouble.  However, there is a problem with this statement.  REST is not really appropriate for doing CRUD.  Harry Pierson sums it up best here.  What is worse is that I am seeing some of the people behind the OData spec equating OData with ODBC.

The problem is that ODBC allows clients to initiate transactions across multiple requests.  REST does not allow this, as it would violate the stateless constraint.  REST does not need this because it is intended to address a completely different layer of the application architecture than ODBC.  REST provides a way to deliver domain services.  For example, if you maintain weather data, REST provides you an easy way to expose “Today’s Weather”, “Last Week’s weather for Detroit”, or “Average Rainfall in Orlando for the month of June”.  ODBC is aimed at the layer that exposes the data points for a specific place at a specific date and time.

ODBC exposes dumb data, REST exposes intelligently presented information.

In an ODBC application it is the client that does something intelligent with the data before presenting it to the user.  In a REST application, usually the client simply makes the “intelligent information” pretty.  REST and ODBC are not comparable.

So is OData useful?  Absolutely.  It is useful to people who want to manipulate generic information, for example SharePoint lists, or data to feed into PowerPivot or Excel.  If you need to expose a generic data store to a client that will do graphing, statistical analysis, or some kind of visualization, like rendering Mars Rover data, then it could be very useful.

However, if you want to provide a service that delivers intelligent information that is specific to a particular domain, then I believe, and apparently the authors of the spec believe, that OData is not appropriate.

Beyond my fear of developers attempting to use OData for unintended purposes, there are a few other things that I think should be fixed in the OData spec.

The Atom Entry content element should not use application/xml as the media type.  The content contains XML that is specifically related to the Entity Data Model and should be identified as such.  A media type such as application/EDM-Instance+xml may be sufficient.  What would be even better is if that content element contained a link to the CSDL file that defines the EntityType, which is currently accessed by constructing a URI of the form [Service]/$metadata.  Oh yeah, and maybe a precise media type on the metadata would be good too!
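To make the media type complaint concrete, here is a minimal sketch of what a client sees when it inspects an entry. The entry below is hypothetical and trimmed down (the example.org URI and the Name property are assumptions for illustration), but it follows the general shape of an OData Atom entry. The declared type says only "some XML"; the only hint that the payload is Entity Data Model data is the namespace of the child element.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry, trimmed to the content element for illustration.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <id>http://example.org/service/Products(1)</id>
  <content type="application/xml">
    <m:properties>
      <d:Name>Widget</d:Name>
    </m:properties>
  </content>
</entry>"""


def describe_content(entry_xml):
    """Return (declared media type, qualified tag of the first payload element)."""
    content = ET.fromstring(entry_xml).find(ATOM + "content")
    payload = content[0].tag if len(content) else None
    return content.get("type"), payload


media_type, payload_tag = describe_content(ENTRY)
print(media_type)   # the generic type a client would have to dispatch on
print(payload_tag)  # the namespace-qualified tag that actually identifies the payload
```

A more specific media type on the content element would let a client choose a parser from the type alone instead of sniffing namespaces.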

Client-side URI construction is a really nasty habit to get into.  I think for the most part, MS can get away with the construction of query parameters like $skip, $top, and $orderby, but to actually construct the path segments of a URI is just going to lead to client-server coupling that will hurt in the future.
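A minimal sketch of the distinction, assuming a hypothetical service at example.org: the client appends query options to a collection URI that the server handed out, but never assembles the path segments itself. The function name and the URI are illustrative, not part of any OData client library.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_query_options(collection_uri, top=None, skip=None, orderby=None):
    """Append $top/$skip/$orderby to a URI obtained from the server.

    The path is treated as opaque; only the query string is touched.
    """
    options = {}
    if top is not None:
        options["$top"] = str(top)
    if skip is not None:
        options["$skip"] = str(skip)
    if orderby is not None:
        options["$orderby"] = orderby

    parts = urlsplit(collection_uri)
    # safe="$," keeps "$" and "," literal rather than percent-encoding them.
    extra = urlencode(options, safe="$,")
    query = "&".join(q for q in (parts.query, extra) if q)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Hypothetical collection URI, as if taken from a service document.
products = "http://example.org/service/Products"
page = with_query_options(products, top=10, skip=20, orderby="Name")
print(page)
```

The coupling problem starts when the client also hard-codes "Products" or builds paths to individual entities and properties, because the server can then never change its URI layout without breaking clients.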

I haven’t read the entire OData spec in detail, but it is interesting to see the complications that are introduced because they have not strictly followed the hypermedia constraint.  For example, it has become necessary to create custom HTTP headers to manage versioning of the “protocol” due to the use of client-constructed URIs.  If those URIs were delivered as links with rel attributes, then the versioning would be limited to the media type of the content.  Yeah, I realize that you can’t create links for every combination of query parameter and sub-resource, but hey, I’m not the one saying that creating a REST interface for generic data was a good idea in the first place ;-).
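As a rough illustration of the link-driven alternative, the sketch below looks up a URI by its rel value in an Atom feed instead of constructing it. The feed, its URIs, and the page parameter are all hypothetical; the point is only that the client knows the relation name, not the URI structure.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed; the server decides what the "next" URI looks like.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Products</title>
  <link rel="self" href="http://example.org/service/Products"/>
  <link rel="next" href="http://example.org/service/Products?page=2"/>
</feed>"""


def find_link(feed_xml, rel):
    """Return the href of the first feed-level link with the given rel, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


next_uri = find_link(FEED, "next")
print(next_uri)
```

With this approach the server can change how paging is expressed in the URI without the client noticing, because the only contract is the meaning of rel="next".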

It is fascinating how difficult it is to beat the RPC mentality out of people.  Even though the OData spec is built on top of AtomPub, the authors have gone to great lengths to document the OData protocol in RPC terms and then map each RPC call to an HTTP request.  When you find yourself creating documentation titles like “RetrieveEntitySet Request”, “RetrieveEntity Request”, “RetrieveComplexType Request”, and “RetrievePrimitiveProperty Request”, and there is actually some valuable information that distinguishes one of those requests from another, then you are violating the uniform interface.  The idea is that the documentation can read “Use HTTP GET to retrieve a representation of the resource from the URI”.  Look at how simple the AtomPub spec is in comparison.

I think it is great that Microsoft have recognized the value of a RESTful protocol like AtomPub and have taken the steps to incorporate this type of interface into many of their products.  I understand what they are trying to do with their URL construction techniques, and I know exactly why they have introduced a MERGE verb and created a batch request mechanism, because I too have been down this path before.  However, while there are some areas where they are justified in straying from the REST constraints, there are others where they definitely are not, and the protocol is suffering for it.
