FOR XML PATH and CQRS

In a recent Twitter exchange with Colin Jack I claimed that SELECT {…} FROM […] FOR XML PATH() is pretty much all you need for providing the Reporting side of CQRS.  Ok, so maybe I overstated it a tad, but it is a valuable technique.  He suggested I blog about it, so I figured even if people don’t use it for CQRS, some of the neat things you can do with FOR XML PATH may be interesting to a few people.

Command Query Responsibility Segregation (CQRS) is an interesting architecture that is suitable for highly scalable applications with complex business logic.  Instead of me doing a horrible job of trying to explain it, here are links to people who know what they are talking about.

Greg Young – http://codebetter.com/blogs/gregyoung/archive/2009/07/15/unshackle-your-domain.aspx

Udi Dahan – http://www.udidahan.com/2009/12/09/clarified-cqrs/

The reporting side of CQRS requires a simple mechanism for getting read-only data from a server down to the presentation tier with minimal effort.  I am suggesting the use of XML as a wire format and I am going to make the assumption that we have chosen a relational database as the store for the reporting database. More specifically, we require Microsoft SQL Server because, from what I can find, MS SQL Server is the only database engine that supports this level of flexibility when generating XML from relational tables.  

Ideally we should be able to create any desired structure of XML based on data in our relational database.  We need the flexibility to generate data as attributes or elements, create nested structures, use namespaces or not, at a minimum.

I’m going to start by showing some of the basics of the FOR XML PATH syntax and build on it until we are creating relatively complex XML documents in a single SQL query.

This is pretty much the simplest query that you can create:

SELECT Code AS 'Code',
       Name AS 'Name'
FROM tblCustomer
FOR XML PATH('Customer')

It produces XML that looks like this:

<Customer>
    <Code>CUSTOMER-ONE</Code>
    <Name>Hypro Networks Inc.</Name>
</Customer>
<Customer>
    <Code>CUSTOMER-A</Code>
    <Name>Customer A</Name>
</Customer>

The only problem with this is that it is not a valid XML document as there is more than one root element.  To create a proper XML document, you need to do:

SELECT Code AS 'Code',
        Name AS 'Name'
FROM tblCustomer
FOR XML PATH('Customer'), ROOT('Customers')

which then gives you,

<Customers>
  <Customer>
    <Code>CUSTOMER-ONE</Code>
    <Name>Hypro Networks Inc.</Name>
  </Customer>
  <Customer>
    <Code>CUSTOMER-A</Code>
    <Name>Customer A</Name>
  </Customer>
</Customers>

 

These examples showed how to create data as XML elements, but sometimes it is desirable to use attributes.  Attributes take quite a bit less space and are suitable for most content that is small and does not contain significant whitespace.

SELECT Code AS '@Code',
        Name AS '@Name'
FROM tblCustomer
FOR XML PATH('Customer'), ROOT('Customers')

By adding an @ symbol to the front of the column name, the data will be output as attributes:

<Customers>
  <Customer Code="CUSTOMER-ONE" Name="Hypro Networks Inc." />
  <Customer Code="CUSTOMER-A" Name="Customer A" />
</Customers>

One of the things that I like about using XML is that it is easy to group related pieces of data into other elements.  The following query demonstrates how to do this:

SELECT Code AS '@Code',
        Name AS '@Name',
        Tel1 AS 'Contact/@Tel',
        Fax1 AS 'Contact/@Fax1'
FROM tblCustomer
FOR XML PATH('Customer'), ROOT('Customers')

Adding the slash into the column name directs SQL Server to create a new element to contain other nodes.

<Customers>
  <Customer Code="CUSTOMER-ONE" Name="Hypro Networks Inc.">
    <Contact Tel="1-993-345-4146" Fax1="1-993-345-4140" />
  </Customer>
  <Customer Code="CUSTOMER-A" Name="Customer A">
    <Contact Tel="872-494-2000" Fax1="872-494-3008" />
  </Customer>
</Customers>

Just one small warning: columns must be adjacent in the SQL query to end up under the same child element, and all attributes of the parent element must be specified in the SQL query before the columns that will go in a child element.
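
To make the naming rules concrete, here is a rough sketch in Python (not T-SQL) that mimics how the column aliases above map onto XML.  It is only a simplified model of the behaviour described in this post, not SQL Server's actual algorithm, and the function name row_to_element is made up for the illustration.  It shows why the columns for one child element have to sit next to each other.

```python
import xml.etree.ElementTree as ET


def row_to_element(row_name, columns):
    """Build one row element from (alias, value) pairs.

    Aliases follow the conventions shown above: '@x' is an attribute,
    'x' is a child element and 'a/@x' or 'a/x' nests under a child 'a'.
    This is a simplified model, not SQL Server's real implementation.
    """
    row = ET.Element(row_name)
    stack = [row]   # elements currently "open", starting with the row itself
    path = []       # names of the open child elements below the row
    for alias, value in columns:
        *parents, leaf = alias.split("/")
        # Keep only the part of the open path shared with this alias.
        common = 0
        while (common < len(path) and common < len(parents)
               and path[common] == parents[common]):
            common += 1
        del stack[common + 1:]
        del path[common:]
        # Open any child elements this alias still needs.
        for name in parents[common:]:
            stack.append(ET.SubElement(stack[-1], name))
            path.append(name)
        if leaf.startswith("@"):
            stack[-1].set(leaf[1:], str(value))
        else:
            ET.SubElement(stack[-1], leaf).text = str(value)
    return row


# Contact columns next to each other: they share one Contact element.
adjacent = row_to_element("Customer", [
    ("@Code", "CUSTOMER-A"),
    ("Contact/@Tel", "872-494-2000"),
    ("Contact/@Fax1", "872-494-3008"),
])

# Contact columns separated by another column: the model opens Contact twice.
split_up = row_to_element("Customer", [
    ("Contact/@Tel", "872-494-2000"),
    ("Name", "Customer A"),
    ("Contact/@Fax1", "872-494-3008"),
])

print(len(adjacent.findall("Contact")))
print(len(split_up.findall("Contact")))
```

In this model the first row ends up with a single Contact element carrying both attributes, while the second ends up with two separate Contact elements.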

So far we have only considered data from a single table.  Using a simple SQL join it is easy to pull in data from other tables, but we can also create parent/child documents.

SELECT  Code AS '@Code',
        Invoice_Date AS '@Date',
        (SELECT ii.qty AS 'Quantity',
                LEFT(ii.description,20) AS 'Description'
         FROM tblinvoitem ii WHERE ii.invoice_id = iv.id FOR XML PATH('Item'),type)
FROM tblInvoice iv
FOR XML PATH('Invoice'), ROOT('Invoices')

 

<Invoices>
  <Invoice Code="21201     " Date="2005-05-26T00:00:00">
    <Item>
      <Quantity>1.0000</Quantity>
      <Description>To supply material a</Description>
    </Item>
  </Invoice>
  <Invoice Code="21200     " Date="2005-05-18T00:00:00">
    <Item>
      <Quantity>25.0000</Quantity>
      <Description>S.S. Control Box Cov</Description>
    </Item>
    <Item>
      <Quantity>0.0000</Quantity>
      <Description>S.S. Main Control Bo</Description>
    </Item>
  </Invoice>
</Invoices>

The extra “type” directive on the end of the subquery tells SQL Server that we want to embed the contents of the subquery as XML rather than as an escaped string.
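
The difference between embedding XML and embedding an escaped string is easy to see outside the database too.  The following is a small Python sketch of the general idea, using the standard ElementTree module; it is an analogy, not a reproduction of what SQL Server does internally.

```python
import xml.etree.ElementTree as ET

# A child fragment, standing in for the result of the inner query.
item = ET.Element("Item")
ET.SubElement(item, "Quantity").text = "1.0000"

# Treating the fragment as a plain string: the markup gets escaped.
as_text = ET.Element("Invoice")
as_text.text = ET.tostring(item, encoding="unicode")

# Treating the fragment as XML: it becomes a real child element.
as_xml = ET.Element("Invoice")
as_xml.append(item)

escaped = ET.tostring(as_text, encoding="unicode")
nested = ET.tostring(as_xml, encoding="unicode")
print(escaped)
print(nested)
```

The first result contains &lt;Item&gt; as text, which is no use to a client expecting a nested document; the second contains a proper Item element.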

One limitation that you will probably run into once you create this type of document is that there is no obvious way to insert attributes and elements into the root node.  This can be overcome with a small trick:

SELECT '1.0' AS '@Version',
        'List of Invoices' AS 'Summary',
( SELECT  Code AS '@Code',
        Invoice_Date AS '@Date',
        (SELECT ii.qty AS 'Quantity',
                LEFT(ii.description,20) AS 'Description'
         FROM tblinvoitem ii WHERE ii.invoice_id = iv.id FOR XML PATH('Item'),type)
FROM tblInvoice iv
FOR XML PATH('Invoice'),type)
FOR XML PATH('Invoices')

Using an outer query you can specify the attributes and elements that you want to appear in the root.  Note that there is no alias for the sub query column. 

<Invoices Version="1.0">
  <Summary>List of Invoices</Summary>
  <Invoice Code="21201     " Date="2005-05-26T00:00:00">
    <Item>
      <Quantity>1.0000</Quantity>
      <Description>To supply material a</Description>
    </Item>
  </Invoice>
  <Invoice Code="21200     " Date="2005-05-18T00:00:00">
    <Item>
      <Quantity>25.0000</Quantity>
      <Description>S.S. Control Box Cov</Description>
    </Item>
    <Item>
      <Quantity>0.0000</Quantity>
      <Description>S.S. Main Control Bo</Description>
    </Item>
  </Invoice>
</Invoices>

And of course, no discussion on XML is complete without discussing namespaces.  Here is how you can add namespaces to the XML result:

WITH XMLNAMESPACES(DEFAULT 'http://example.org/Invoices')
SELECT '1.0' AS '@Version',
        'List of Invoices' AS 'Summary',
( SELECT  Code AS '@Code',
        Invoice_Date AS '@Date',
        (SELECT ii.qty AS 'Quantity',
                LEFT(ii.description,20) AS 'Description'
         FROM tblinvoitem ii WHERE ii.invoice_id = iv.id FOR XML PATH('Item'),type)
FROM tblInvoice iv
FOR XML PATH('Invoice'),type)
FOR XML PATH('Invoices')

I’m not going to show you the output of this, because it is ugly!  Actually, the annoying part is that it puts the namespace on the root and then on each row of the sub query. 

Hopefully these examples have shown some of the capability of FOR XML PATH but I find that sometimes technologies work well while you are working on arbitrary examples, but then you hit the real world and they seem to fall short.  So, I decided to try and create an Atom feed using FOR XML PATH.  Here is what I came up with.

WITH XMLNAMESPACES(DEFAULT 'http://www.w3.org/2005/Atom')
SELECT 'Customers' AS 'title',
        'List of customers' AS 'subtitle',
        'http://tavis.net/Customers' AS 'id',
        REPLACE(CONVERT(varchar,(SELECT MAX(COALESCE(ModifiedDate,CreatedDate)) FROM tblCustomer), 121),' ','T') + 'Z' AS 'updated',
    (SELECT cu.Code AS 'title',
            REPLACE(CONVERT(varchar,COALESCE(ModifiedDate,CreatedDate) , 121),' ','T') + 'Z' AS 'updated',
            'Darrel Miller' AS 'author/name',
            'http://tavis.net/Customer/' + LTRIM(STR(cu.ID)) AS 'id',
            (SELECT *
                FROM (SELECT 'alternate' AS '@rel',
                    'http://tavis.net/Customer/1.html' AS '@href'
              UNION
              SELECT 'edit' AS '@rel',
                    'http://tavis.net/Customer/1.html' AS '@href') l
             FOR XML PATH('link'),type ),
            cu.ModifiedDate AS 'updated',
            cu.Name AS 'summary'
        FROM tblCustomer cu
        FOR XML PATH('entry'),type  )
FOR XML PATH('feed')

 

Ok, so it is not pretty, but with a few T-SQL functions it could be cleaned up quite a bit.  This query produces an Atom feed that validates at validator.w3.org/feed and it does it in 8ms on my cheapo server with 20 entries.

The trickiest part of the above query was creating the two link elements.  When you try and create two child elements with the same name, by default SQL Server will attempt to merge the two elements.  By using the UNION and a sub query I was able to create the two separate elements.
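
The REPLACE(CONVERT(...)) expressions in the query are only there to turn a datetime into the timestamp format that Atom expects.  As a rough illustration of the same string manipulation, here is a Python sketch.  The function name atom_timestamp is made up, and like the query it simply assumes the stored values are already in UTC.

```python
from datetime import datetime


def atom_timestamp(value: datetime) -> str:
    """Format a datetime the way the query does (assumes the value is UTC)."""
    # Same layout as CONVERT style 121: 'yyyy-mm-dd hh:mi:ss.mmm'
    style_121 = value.strftime("%Y-%m-%d %H:%M:%S.") + f"{value.microsecond // 1000:03d}"
    # Swap the space for 'T' and append 'Z', as the REPLACE(...) + 'Z' does.
    return style_121.replace(" ", "T") + "Z"


print(atom_timestamp(datetime(2005, 5, 26, 0, 0, 0)))  # 2005-05-26T00:00:00.000Z
```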

So you really can create real world XML documents using FOR XML PATH. I believe this is a useful tool to have under your belt when you are trying to quickly get data out of a database down to a client tier.  It would also be really nice if other database vendors picked up on this feature and implemented it in their engines.  One area that I didn’t cover but that is also very useful is that you can create T-SQL functions that return XML and then call them recursively.  This allows you to build hierarchical XML documents.  My tests so far have also shown that building trees this way is very quick.

I’m hoping to do a follow up article that shows how you can use the XML from FOR XML PATH as input to XSLT to do all sorts of other interesting things like create audit triggers, data import scripts, create HTML pages, XAML  and JSON documents. 

Woe is me, the WOA unmanifesto

This started as a comment on the blog post here, but it got too long.

I have two questions for Dion and a few comments.
1) Have you read (not just skimmed a few times) Roy’s dissertation on REST?
2) Have you written both a REST service and a REST client that is in production today?

With all due respect, without actually creating REST applications, I don’t think anyone is qualified to attempt to define principles relating to REST and based on your Twitter bio:

Internationally recognized business strategist, enterprise architect, keynote speaker, author, blogger, and consultant on Web 2.0, SOA, and next-gen business

you sound like a talker, not a walker.  Forgive me if I am wrong.

Here are my comments on the suggested principles, I’ve tried to be as constructive as I can muster:

Principle #1: I think it is a significant oversimplification to refer to a resource  as data. 

Principle #2: Can you clarify what you mean by "favor granularity and depth in linkage"?

Principle #3: I am pretty sure no-one has ever said that URIs should be self descriptive.  In REST the request message should be self-descriptive, the URI being just one part of the message.   The statement “URI should indicate what data format is being used and indicate nested elements with URL segmentation” is way too strong, maybe if you replace “should” with “can”.

Principle #4:  This idea that all legacy databases should be exposed as data-oriented REST APIs is crazy-talk.  REST is an architectural style for building distributed APPLICATIONS, i.e. exposing functionality over the web, not just data.  Your mention of cloud computing APIs is a perfect example.  The Sun Cloud API is one of the best REST examples out there at the moment, but it exposes functionality for manipulating those cloud applications: functions like turning the instances on and off.  It’s far more than just exposing data.

Principle #5:  You seem to be describing benefits rather than a principle.

#6:  Ok

#7: I don’t get this concept at all.  Does this mean we should all be using text/plain if we want the widest audience?

#8: Maybe I don’t want my resources to be crawled by a search engine.  Even if I do, conforming to the uniform interface and delivering standard format representations should be sufficient to meet any search engine requirements.

#9:  “higher realized value”?  Are you sure this isn’t another one of those spoof manifestos?

#10: I think you are referring to the idea that a resource should only have one canonical URI.  I see this topic debated regularly and I do not yet see any consensus.

Ok, I give up for the moment, I’m a bit too depressed to continue.

HttpContent instead of streams

I think the HttpContent class is my favourite part of this library.  This class acts as a container for the content that you received or are about to send.

Handling returned content

When you make an HTTP request with this library, the body of the response is wrapped inside an HttpContent object. So, when you do:

var client = new HttpClient();
var content = client.Get("http://www.google.com").Content;

what you get back is an HttpContent object.  How you convert that into something useful depends on what type of data the content object contains.  The HttpContent.ContentType property will tell you the Internet media type of the data you received.

If you get back text/plain then you can access it like this,

var mytext = content.ReadAsString();

if you get application/octet-stream and you know what to do with the bytes, you can simply do

var mybytearray = content.ReadAsByteArray();

However, these examples are pretty primitive.  If you reference the DLL Microsoft.Http.Extensions you will find a variety of richer data extraction methods that have been implemented as extension methods on the HttpContent class.  So you can then do:

var xmlElement = content.ReadAsXElement();
var xmlReader = content.ReadAsXmlReader();

to access stuff that comes to you as application/xml.
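
If the idea of a content container with a family of “ReadAs” methods is new to you, this Python sketch may help.  It is only an analogy: the Content class and its method names are invented for the illustration and are not the Microsoft.Http API.

```python
import xml.etree.ElementTree as ET


class Content:
    """Hypothetical stand-in for a content container: raw bytes plus a media type."""

    def __init__(self, data: bytes, content_type: str):
        self.data = data
        self.content_type = content_type

    def read_as_bytes(self) -> bytes:
        # The rawest view: hand back the bytes untouched.
        return self.data

    def read_as_string(self, encoding: str = "utf-8") -> str:
        # Decode the bytes as text.
        return self.data.decode(encoding)

    def read_as_xml(self) -> ET.Element:
        # Parse the bytes as an XML document and return its root element.
        return ET.fromstring(self.data)


content = Content(b"<Customer Code='CUSTOMER-A' />", "application/xml")

# The caller picks the conversion based on the media type it received.
if content.content_type == "application/xml":
    print(content.read_as_xml().get("Code"))
else:
    print(content.read_as_string())
```

The point is that the transport hands you one object, and the decision about how to interpret the bytes stays with the caller.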

One of the more sophisticated methods is ReadAsSyndicationFeed.  This method leverages the RSS/Atom wrappers that are provided by WCF’s System.ServiceModel.Web DLL.

var client = new HttpClient();
var response = client.Get("http://www.stackoverflow.com/feeds");
var feed = response.Content.ReadAsSyndicationFeed();
foreach (SyndicationItem item in feed.Items) {
    Console.WriteLine(item.Title.Text);
}

This example pulls an Atom feed from the front page of Stackoverflow.com.  This really shows the beauty of a standardized data format like Atom.  Those few lines of code above will work in so many places across the web, and now with Microsoft pushing the new OData standard which is based on the Atom Publishing Protocol, many more of the Microsoft products will expose data in this way. e.g. Sharepoint 2010, Azure.

For those of you who feel the burning desire to pull custom objects across the wire, you don’t have to feel ignored as these following methods will take care of all the deserialization work and return you a nice static type.

var customer = content.ReadAsXmlSerializable<Customer>();
// or
var customer = content.ReadAsDataContract<Customer>();

and finally if you want to pull objects across the wire, but are offended by angle brackets, you can use curly braces too.

var customer = content.ReadAsJsonDataContract<Customer>();

The use of extension methods here is really quite elegant because it makes it really easy to add your own “ReadAs” methods and use them in a completely consistent way.  The fact that many of the provided extension methods are factored out into their own DLL means that you do not need to deploy that library, and you can use just your own extension methods.  This has the additional benefit of allowing you to use this library without taking a dependency on WCF if you don’t want to.

 

Sending Content

To send content you can use either PUT or POST and the way you prepare the content is identical.  The basic process looks like this:

var client = new HttpClient();
var content = HttpContent.Create(<whatever you want to send>);
client.Post(content); // or client.Put(content);

The only part that changes is how you create the HttpContent object.

Suppose you want to send some simple text

var content = HttpContent.Create("Here is some content", "text/plain");

or maybe just an array of bytes.

var content = HttpContent.Create(new byte[] { 1, 2, 3 }, "application/octet-stream");

How about sending a file?

var c = HttpContent.Create(new FileInfo("myfile.zip"), "application/octet-stream");

If you are dealing with an existing API that expects a POST from an HTML form then you will need to post using the content-type application/x-www-form-urlencoded.  That is as easy as,

var c = HttpContent.Create(new HttpUrlEncodedForm() { { "a", "1" }, { "b", "2" } });
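
For anyone who has not looked at what application/x-www-form-urlencoded looks like on the wire, here is a short Python sketch using the standard library.  It shows the general encoding only; I am not claiming it is byte-for-byte what HttpUrlEncodedForm produces.

```python
from urllib.parse import urlencode

# The same two fields as the form above.
body = urlencode({"a": "1", "b": "2"})
print(body)  # a=1&b=2

# Spaces in a value are encoded, here as '+'.
status = urlencode({"status": "Test tweet using Microsoft.Http.HttpClient"})
print(status)  # status=Test+tweet+using+Microsoft.Http.HttpClient
```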

and for a more concrete example, Twitter always makes a good test subject.  Here is how you do a status update,

var client = new HttpClient();
client.DefaultHeaders.Authorization = Credential.CreateBasic("username","password");
var form = new HttpUrlEncodedForm();
form.Add("status","Test tweet using Microsoft.Http.HttpClient");
var content = HttpContent.Create(form);
var resp = client.Post("http://www.twitter.com/statuses/update.xml", content);

 

It is pretty easy to continue this pattern and create your own factory methods for converting your favourite data type into an HttpContent object.  There are quite a few other nice features of the content object relating to how it manages the underlying stream, when it reads from the stream, whether it buffers the stream and other such nitty gritty, but I’ll save those details for another day.  To keep track of the other articles I am attempting to write on the subject, you can see the summary post here.

HttpClient – The basics

Before I go into any details I thought it would be valuable to give some basic examples of how to use the HttpClient.

Retrieve some HTTP content from a URL

var client = new HttpClient();
var response = client.Get("http://example.org");

Post some content to a URL

var client = new HttpClient();
var content = HttpContent.Create("This is some string content");
var response = client.Post("http://example.com", content);

Authenticated get

var client = new HttpClient();
client.DefaultHeaders.Authorization = Credential.CreateBasic("user", "password");
var response = client.Get("http://example.org");

and if you need to provide some kind of custom authentication scheme you can simply create a credential like this:

client.DefaultHeaders.Authorization = new Credential("My secure authentication header");
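
For reference, the value that the HTTP Basic scheme puts in the Authorization header is just the user name and password joined by a colon and base64 encoded.  Here is a Python sketch of that; the function name basic_authorization is made up, and I have not checked how Credential.CreateBasic is implemented, so treat this as a description of the scheme rather than of the library.

```python
import base64


def basic_authorization(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value from a user name and password."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


print(basic_authorization("user", "password"))  # Basic dXNlcjpwYXNzd29yZA==
```

Base64 is an encoding, not encryption, so Basic credentials are only as private as the connection they travel over.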
 

Interestingly, the methods Get and Post are not actually members of the HttpClient class.  They are extension methods that simply call one of the Send methods on the HttpClient class.  If it is more appropriate you can also call the Send methods directly. As you will see, there are plenty of overloads to suit your needs:

 
public HttpResponseMessage Send(HttpMethod method)
public HttpResponseMessage Send(HttpMethod method, Uri uri)
public HttpResponseMessage Send(HttpMethod method, Uri uri, RequestHeaders headers)
public HttpResponseMessage Send(HttpMethod method, Uri uri, HttpContent content)
public HttpResponseMessage Send(HttpMethod method, string uri)
public HttpResponseMessage Send(HttpMethod method, string uri, RequestHeaders headers)
public HttpResponseMessage Send(HttpMethod method, string uri, HttpContent content)
public HttpResponseMessage Send(HttpMethod method, string uri, RequestHeaders headers, HttpContent content)
public HttpResponseMessage Send(HttpMethod method, Uri uri, RequestHeaders headers, HttpContent content)
 

I like this approach because it provides a host of easy to call helper methods that are all routed through one of these Send methods. 

Another major advantage of the HttpClient library over the HttpWebRequest class is related to the RequestHeaders class.  All of the HTTP header values have been wrapped with some kind of helper class that significantly eases the process of setting header values.

Here are a few random examples of setting header information that I pulled from the unit tests:

var req = new RequestHeaders();
req.Accept.Add(StringWithOptionalQuality.Parse("audio/*; q=0.2"));

req.Allow.Add("GET");
req.Allow.Add("POST"); 

var cc = new CacheControl();
cc.MaxAge = TimeSpan.FromSeconds(1);
cc.MaxStale = true;
cc.MaxStaleLimit = TimeSpan.FromSeconds(2);
cc.MinFresh = TimeSpan.FromSeconds(3);
cc.MustRevalidate = true;
cc.NoCache = true;

req.CacheControl = cc;

In addition to these headers that have strongly typed classes there is also the nifty Parse() function that is on both the request and response header collections.

var h = RequestHeaders.Parse("Pragma: LinkBW=2147483647, AccelBW=1048576, AccelDuration=5000"); 

var resp = ResponseHeaders.Parse(
                @"Set-Cookie: mkt1=norm=US; domain=.live.com; path=/
                    Set-Cookie: AFORM=NOFORM; expires=Mon, 20-Jul-2015 23:59:59 GMT; path=/".Trim());

var h = RequestHeaders.Parse(
               @"Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
           Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
           Accept-Language: da, en-gb;q=0.8, en;q=0.7");

var h = ResponseHeaders.Parse(@"WWW-Authenticate: Digest realm=""testrealm@host.com"",
                qop=""auth,auth-int"",
                nonce=""dcd98b7102dd2f0e8b11d0f600bfb0c093"",
                opaque=""5ccc069c403ebaf9f0171e9517f40e41""");
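
To give a sense of what a parser for these headers has to deal with, here is a deliberately naive Python sketch that handles only the simplest case: a comma-separated list with optional q values, like the Accept-Encoding line above.  The function name parse_quality_list is made up and has nothing to do with the library's own Parse() methods.  It would fall over on quoted strings that contain commas, such as the WWW-Authenticate example.

```python
def parse_quality_list(header_value):
    """Split 'gzip;q=1.0, identity; q=0.5' into (token, quality) pairs.

    Naive sketch: no support for quoted strings or other parameters.
    """
    items = []
    for part in header_value.split(","):
        token, *params = [piece.strip() for piece in part.split(";")]
        quality = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                quality = float(value)
        items.append((token, quality))
    return items


encodings = parse_quality_list("gzip;q=1.0, identity; q=0.5, *;q=0")
print(encodings)  # [('gzip', 1.0), ('identity', 0.5), ('*', 0.0)]
```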

Who wants to write and test the code necessary to parse this stuff?  Not me, that’s for sure. Hopefully this has provided a taste of what this library is capable of, and over the next few weeks I plan on writing a number of other posts that cover other interesting areas of the Microsoft.Http namespace.  To see where I am up to, see the summary page here.

Why the Microsoft.Http library is awesome.

In various forums I am finding myself raving about this new library.  I decided that I should write some blog posts on the subject for a few reasons: 1) to convince myself that there is actually some substance to my raving; 2) to provide a place to point people to in order to substantiate my infatuation; and 3) to maybe encourage a few more people to use it before it gets relegated to the dustbin of Microsoft’s forgotten projects.

Here are the posts I am planning.  I will create the links as the posts are completed.

You can get the library here and the official support forum is here.  From what I have been told, this library will not make it into the .Net 4 release but it is still being actively developed.

Oh Data

Since the recent PDC09 I have been obsessing over OData and I need to write this post just to get it out of my head.  Microsoft has made it obvious that they are taking this protocol very seriously by integrating it into Sharepoint, Visual Studio, RIA Services, PowerPivot, and I expect to see it in the next version of Office and in the Dynamics products.  I think it is a great direction to be headed but I also have concerns.

Let me start by quoting the most important paragraph from the document that explains how OData extends AtomPub.

AtomPub, as specified in [RFC5023], in combination with the extensions defined in this document, is appropriate for use in Web services which need a uniform, flexible, general purpose interface for exposing create retrieve update delete (CRUD) operations on a data model to clients. It is less suited to Web services that are primarily method-oriented or in which data operations are constrained to certain prescribed patterns.

Let me paraphrase that.  If all your service is going to do for your client is “CRUD” on generic data then OData is appropriate.  As long as everyone keeps this in mind going forward we should not run into too much trouble.  However, there is a problem with this statement.  REST is not really appropriate for doing CRUD.  Harry Pierson sums it up best here.  What is worse is that I am seeing some of the people behind the OData spec equating OData with ODBC.

The problem is that ODBC allows clients to initiate transactions across multiple requests. REST does not allow this as it would violate the stateless constraint.  REST does not need this because it is intended to address a completely different layer of the application architecture than ODBC.  REST provides a way to deliver Domain services.  For example, if you maintain weather data, REST provides you an easy way to expose “Today’s Weather”, “Last Week’s weather for Detroit”, “Average Rainfall in Orlando for the month of June”.  ODBC is aimed at the layer that exposes the data points for a specific place at a specific date and time.

ODBC exposes dumb data, REST exposes intelligently presented information.

In an ODBC application it is the client that does something intelligent with the data before presenting it to the user.  In a REST application, usually the client simply makes the “intelligent information” pretty.  REST and ODBC are not comparable.

So is OData useful?  Absolutely it is useful to people who want to manipulate generic information, like for example Sharepoint lists, or data to feed into PowerPivot or Excel.  If you have a need to expose a generic data store to a client that will do graphing, statistical analysis, or some kind of visualization like rendering Mars Rover data then it could be very useful. 

However, if you want to provide a service that delivers intelligent information that is specific to a particular domain then I believe and apparently the authors of the spec believe that OData is not appropriate.

Beyond my fear of developers attempting to use OData for unintended purposes there are a few other things that I think should be fixed in the OData spec.

The Atom Entry content element should not use application/xml as the media type.  The content contains XML that is specifically related to the Entity Data Model and should be identified as such.  A media type such as application/EDM-Instance+xml may be sufficient.  What would be even better is if that content element contained a link to the CSDL file that defines the EntityType and that is currently accessed by constructing a URI with [Service]/$metadata.  Oh yeah, and maybe a precise media type on the metadata would be good too!

Client-side URI construction is a really nasty habit to get into.  I think for the most part, MS can get away with the construction of query parameters like $skip, $top, and $orderby, but to actually construct the path segments of a URI is just going to lead to client-server coupling that will hurt in the future.

I haven’t read the entire OData spec in detail but it is interesting to see the complications that are introduced because they have not strictly followed the hypermedia constraint.  For example, it has become necessary to create custom HTTP headers to manage versioning of the “protocol” due to the use of client constructed URIs.  If those URIs were delivered as Links with rel attributes then the versioning would be limited to the media type of the content.  Yeah, I realize that you can’t create links for every combination of query parameter/ sub resource, but hey, I’m not the one saying that creating a REST interface for generic data was a good idea in the first place ;-) .
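
To illustrate the difference, here is a small Python sketch of a client that finds the URI it needs by looking for a link with a given rel value, instead of assembling the path itself.  The entry document, its example.org URIs and the function name link_for are all made up for the illustration; none of it is taken from the OData spec.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical Atom entry as a server might deliver it.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>CUSTOMER-ONE</title>
  <link rel="alternate" href="http://example.org/Customer/1.html" />
  <link rel="edit" href="http://example.org/Customer/1" />
</entry>"""


def link_for(entry_xml, rel):
    """Return the href of the first link with the given rel, or None if absent."""
    entry = ET.fromstring(entry_xml)
    for link in entry.iter(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# The client only knows the rel name; the server stays free to change the URI.
edit_uri = link_for(ENTRY, "edit")
print(edit_uri)
```

A client written this way keeps working if the server reorganizes its URI space, because the only thing it depends on is the meaning of the rel value.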

It is fascinating how difficult it is to beat the RPC mentality out of people.  Even though the OData spec is built on top of AtomPub, the authors have gone to great lengths to document the OData protocol in RPC terms and then map the RPC call to an HTTP request.  When you find yourself creating documentation titles like “RetrieveEntitySet Request”, “RetrieveEntity Request”, “RetrieveComplexType Request”, “RetrievePrimitiveProperty Request” and there is actually some valuable information that distinguishes one of those requests from the other then you are violating the uniform interface.  The idea is that the documentation can read “Use HTTP GET to retrieve a representation of the resource from the URI”. Look at how simple the AtomPub spec is in comparison.

I think it is great that Microsoft have recognized the value of a RESTful protocol like AtomPub and they have taken the steps to incorporate this type of interface into many of their products.  I understand what they are trying to do with their URL construction techniques, I know exactly why they have introduced a MERGE verb and have created a batch request mechanism, because I too have been down this path before.  However, while there are some areas where they are justified in straying from the REST constraints, there are others where they are definitely not, and the protocol is suffering from it.

It is not that we don’t like the convenience of Windows Installer

I just had a fight with an installation of Sharepoint and it dawned on me what I don’t like about Windows Installer.  It is a black box.  You click on the file and press next, next, next and magic happens.  Personally, I would like to know where files are being put, which registry hives are being touched, whether services are being installed, whether accounts are being created.  Did something go in the GAC?  Was a COM DLL registered?

It should be easy to add some kind of UI that summarizes what parts of the system were touched by the install.  It would make me feel much more comfortable about installing stuff.

WCF REST Starter Kit Preview 2

I have been looking through the latest release of the WCF REST Starter Kit and there are some interesting things there, especially the new HttpClient class in the Microsoft.Http namespace.  I will not talk about the server side of things because I really don’t have anything nice to say and until I have some constructive criticism to give I’m going to keep my mouth shut.

However, the new client side library is designed to make it easier to talk to HTTP endpoints and it seems to be quite nice.  It is 3500 lines of code, so there is a fair bit of stuff in there, layered on top of HttpWebRequest and I was curious to see what it could do.

I started by looking at the enclosed sample, unhelpfully named “WpfSample” and I got a little sidetracked from my exploration.  This sample is actually a very neat little http request builder tool.  I know this is a cool tool to have, because I have an almost identical one that we built to help us test REST services.  Testing with real web browsers is really annoying because they do all sorts of magic stuff behind the scenes, they are very picky about what content-types they will render and they keep trying to find a favicon! 

Anyway, I was curious to see how this tool made use of the HttpClient class but I was rather dismayed to see all of the code in the “code behind” of the Window1.xaml file.  This makes it really difficult to distinguish between what code is relevant to the web request and what is just UI gunk.  To add salt to the wound, it is a WPF app that uses no data binding and no Commands, just events manipulating the contents of UI elements.  It was a nice looking, handy WPF app that demonstrated this new spiffy HttpClient library with the architecture of a college student’s VB4 project.

Ok, I know it is just a sample.  However, everything that Microsoft puts out is viewed by people who are learning.  If Microsoft create samples with crappy architecture, why are we surprised when developers create applications with crappy architectures?

So despite the ton of work that I was supposed to be doing today, I refactored it.  I created an MVC type of structure with databinding and routed commands.  If you get the WCF REST Starter Kit and extract the zip file that gets put in the “C:\Program Files (x86)\Microsoft WCF REST” folder you will find a solution named Product.sln.  Just add my project which is in the zip file below and you will have, what I hope is a much easier to read version of the HttpClient sample.

http://www.tavis.ca/files/httpClientGui.zip

Bellware on Alt.Net

I just listened to the Alt.Net podcast where Scott Bellware talks about the state of Alt.Net.  Bellware has recently become a major thorn in the side of the Alt.Net community because he is claiming that they have become a social club that is no longer moving towards its original goals.

My understanding is that the original purpose of Alt.Net was to bring “alternatives” to the .net community and to spread the word to the developer masses.  Instead of just swallowing the latest toolset that Microsoft delivered, they want people to think about what is the best tool for the job and look outside of the normal community for solutions.

Scott’s fear is that the ALT.Net community is just replacing the Microsoft problem with another one.  It sort of reminds me of George Orwell’s Animal Farm.  A rebellion whose leaders become just as corrupt as the original ones they replaced.

The problem is that you can’t raise this alarm without treading on a whole lot of toes.  Bellware appears to have a knack for stepping on toes really hard.

My fear is that I am starting to smell what he is describing.  I am starting to get the sense that if you are not doing it the “ALT.Net” way then you are not one of the cool kids.  To the point where there are attempts to squash dissension in the ranks.  At the Alt.Net conference in Seattle there was a session called “Is Persistence Ignorance necessary?”  There were a fair few people who showed up and it has been a pretty hot topic over the past 12 months.  On the Kyte.tv video stream of the session, Chad Myers makes the following comments:

“Point of order: It is easy to do.”

“And has been solved”

“About 3-4 years ago, actually”

“(sigh) He’s re-learning (out loud) all the problem N/Hibernate folks solved years ago”

“I have PI and it works great. Why are we even talking about this again?”

I have two problems with this.  The first is that this appears to be a classic bully technique to eliminate any further discussion of the topic.  That does not seem in line with the Alt.Net philosophy as I understand it.

The second problem relates back to what Scott is trying to convey on the podcast.  Alt.Net has to be open to the fact that 99% of developers don’t know why persistence ignorance can be beneficial.  To say, oh well it violates separation of concerns is like saying “thou shalt not kill”, because it says so in the Bible.  A session like this is exactly what Alt.Net has to do.  Reinforce the basics, over and over and over.  If members are going to say, we already know all about that, we don’t want to talk about it,  then you have immediately alienated all the “new recruits”.  You have an elitist club.

ALT.Net needs to focus the majority of its efforts on teaching people who are not in the Alt.Net community rather than being focused on self-learning.  If they continue to develop their own “state of the art” within the community they are just going to increase the divide between the haves and the have-nots.

From what I can tell, Scott has reason to be upset.

During the podcast Scott repeats incessantly his call for people to go out and do something to make a difference.  So, here are a couple of items that I am going to do:

  • I’m going to spend time on StackOverflow.com answering questions.  Stackoverflow is a magnet for junior devs looking for answers.  The Alt.net crowd should be all over that site, sharing their knowledge. 
  • I’m going to continue hiring interns from our local colleges and universities and mentoring them in our workplace.  We need to get the message out when developers are still learning. 

What are you going to do?

The mystery of the trailing slash and the relative url

I had heard conflicting rumours about the significance of the trailing slash, so I decided to go googling.  If you explore the first few hits you will find all sorts of discussions about cool urls, the impact on SEO, the performance benefits of avoiding server redirects, amongst other stuff.  However, I found nothing that seemed to have a critical impact on the functionality of a web application.

My interest in the trailing slash had been provoked by my attempts to use relative urls in some documents that I was returning from a REST server.  Sometimes the relative urls would work, sometimes they wouldn’t.

The performance benefit discussion that I had found related to the fact that web servers will automatically redirect to a url with the trailing slash if the last segment points to a folder instead of a file.

e.g. A request to http://example.org/folder would actually be redirected to http://example.org/folder/, which then serves default.htm (or whatever file is set as the default file)

As is probably obvious to anyone who has done any amount of web development, all relative urls processed by a web server are relative to the folder in which the containing file is located.  Mechanically, this means the last segment of the url is stripped off and the relative url is added.  A detailed explanation of how a relative url is resolved can be found here.

The key point that I want to draw attention to is that the behaviour is pretty natural when you are pulling static files from folders on a web server.  It gets less obvious when you start using dynamically generated resources with the style of urls that are common in the REST world.

E.g.  Say you retrieve a resource such as http://example.org/customer/101 and it contains an element <link href="address"/>.  Common sense dictates that you are trying to get to http://example.org/customer/101/address, but that is not how it would be resolved.  The url would actually be resolved as http://example.org/customer/address
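To make the difference concrete, here is a minimal sketch using Python's standard urljoin function, which follows the usual relative reference resolution rules.  Python is my choice purely for illustration; the urls are the ones from the example above, and the variable names are made up.

```python
from urllib.parse import urljoin

# Base url WITHOUT a trailing slash: the last segment ("101") is stripped
# off before the relative url is added.
without_slash = urljoin("http://example.org/customer/101", "address")

# Base url WITH a trailing slash: the last segment is empty, so "101"
# survives and the relative url lands underneath it.
with_slash = urljoin("http://example.org/customer/101/", "address")

print(without_slash)  # http://example.org/customer/address
print(with_slash)     # http://example.org/customer/101/address
```

The only difference between the two calls is the trailing slash on the base url, which is exactly why the same relative link sometimes works and sometimes doesn't.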

The problem is that the server no longer has a file/folder distinction, so the routing infrastructure doesn’t know whether it should append a slash to your url or not.

From what I’ve seen of Microsoft’s new URL routing infrastructure that is used by ASP.NET MVC and the WCF Rest toolkits, the url gets passed through with or without the slash depending on what the user typed.  This is a problem if you want to use relative urls in your representations.

I decided that I would include xml:base in documents so that at least I could be explicit about where my relative urls were relative to.  This way I don’t care if the source url ends with a slash or not.  I make sure that all my xml:base urls end in a slash and so far this strategy is working for me.
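As a rough sketch of how a client might honour this, the Python snippet below reads xml:base from the root element and resolves each link against it.  The customer root element, the resolve_links helper and the sample document are all assumptions of mine for illustration, and the sketch only looks at xml:base on the root element, not on nested elements.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the fixed XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical representation returned by the server.
document = """<customer xml:base="http://example.org/customer/101/">
  <link href="address"/>
</customer>"""


def resolve_links(xml_text, request_url):
    """Resolve every <link href> against xml:base, falling back to the request url."""
    root = ET.fromstring(xml_text)
    # xml:base may itself be relative, so resolve it against the request url.
    # When the attribute is missing, the request url is used unchanged.
    base = urljoin(request_url, root.get(XML_BASE, ""))
    return [urljoin(base, link.get("href")) for link in root.iter("link")]


# The result is the same whether or not the request url ended in a slash.
print(resolve_links(document, "http://example.org/customer/101"))
print(resolve_links(document, "http://example.org/customer/101/"))
```

Because the xml:base value ends in a slash, both calls resolve the link to http://example.org/customer/101/address regardless of what the user typed.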