Recipe for a Brittle Data Service

You can teach people a lot by writing about best practices.

But sometimes, you can also teach by illustrating your own boneheaded moves that tripped you up. In other words, don’t do what I did.

Case in point is a recent issue I ran into when attempting to demo some HTML5 and JS code at last week’s CMAP main meeting. I had built a simple HTML5 site for consuming event data and the twitter stream for Community Megaphone. The twitter feed was working fine, but the event data, which was being populated dynamically from the Community Megaphone WCF Data Service via a call to jQuery‘s .ajax function, wasn’t working at all.

Unfortunately, I discovered this only a few minutes before the start of the meeting. The site had worked fine the last time I demoed it, so I hadn’t felt a strong need to test it again. Oops!

While I was trying to troubleshoot before the meeting got started, I got as far as discovering that the data service was throwing a 500 error, but wasn’t able to dig in further until I had a bit more time the following day. Those of you who have worked with WCF Data Services are probably nodding your head in recognition. One of the choices that the data services team made to enhance security is to provide very little information when things go wrong. That’s great for security because one of the main ways that the bad guys figure out how to exploit systems is by attempting to make them fail, and gleaning information from error messages returned. Unfortunately, what’s great for security is lousy for troubleshooting, particularly if you don’t have immediate access to the source for the failing service.

The good news is that while WCF Data Services is tight-lipped by default, with a few simple tweaks, you can get it to open up about what’s causing the problem. Once I was back at my home office, sitting in front of the code for the service, here’s what I did to troubleshoot, per this post that’s been out for a while:

Opened up the code-behind for the Data Service, and added the ServiceBehavior metadata attribute to the DataService class, with the IncludeExceptionDetailInFaults property set to True (note that you can also configure this via the
configuration element in your behaviors in web.config):
```
<System.ServiceModel.ServiceBehavior(IncludeExceptionDetailInFaults:=True)>
```

As part of the InitializeService method, set the service to use verbose errors, which will provide additional detail on the error:
```
config.UseVerboseErrors = True
```

Once you’ve made those changes (hopefully against a local or staging instance of your app, as you generally want to avoid sending detailed error messages in a production app), you should be able to see the full exception information, including the stack trace, and any inner exceptions. In my case, once I’d enabled these handy features, I saw the following when making a request to the service operation that was failing:

xml version="1.0" encoding="utf-8" standalone="yes"?>
<error xmlns="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <code>code>
  <message xml:lang="en-US">An error occurred while processing this request.message>
  <innererror>
    <message>An error occurred while reading from the store provider's data reader. 
See the inner exception for details.message>
    <type>System.Data.EntityCommandExecutionExceptiontype>
    <stacktrace>   at System.Data.Common.Internal.Materialization.Shaper`1.StoreRead()
   at System.Data.Common.Internal.Materialization.Shaper`1.SimpleEnumerator.MoveNext()
[...]
   at System.Data.Services.DataService`1.HandleRequest()stacktrace>
    <internalexception>
      <message>Invalid length parameter passed to the SUBSTRING function.message>
      <type>System.Data.SqlClient.SqlExceptiontype>
      <stacktrace>
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
[...]
   at System.Data.SqlClient.SqlDataReader.Read()
   at System.Data.Common.Internal.Materialization.Shaper`1.StoreRead()stacktrace>
    internalexception>
  innererror>
error>

I’ve omitted some of the stack trace info to keep it a bit more brief, but the key bit is the type of the outer exception (EntityCommandExecutionException) and the message of the inner exception (“Invalid length parameter passed to the SUBSTRING function.”). With this info in hand, I was able to track the issue down to the SQL Server Stored Procedure I wrote to handle queries for events within a given distance of either a latitude/longitude pair, or a zip code (which the data service converts to a lat/long on the fly).

Turns out that the issue was poor logic within the stored procedure. Because Community Megaphone supports both in-person and online events, I needed a way to return only those events which were within the given distance (calculated by a user-defined function), which required parsing the latitude and longitude of each event, which are stored as a single column in the db (whether that was wise is a discussion for another day). In the original sproc, I did this by checking whether the value of the State column was either blank or ‘NA’ (as in not applicable). This worked fine for a very long time, but apparently subsequent to writing the sproc, I must have made a modification to some other code that allowed events to be added with a value other than blank or ‘NA’, and as a result, there was an online event which appeared to the sproc to be an in-person event. And when it tried to parse the non-existent lat/long value, it threw the exception above.

Fixing the data error was easy enough once I knew the cause, but that would only fix the problem for that one event. Fixing the sproc was pretty simple, too. Since the data that I’m parsing is the latlong column, then I just needed to make sure that there was a value in that column before trying to parse it. In .NET code, you’d simply call String.IsNullOrEmpty(stringval), but T-SQL doesn’t have a native IsNullOrEmpty function, so what to do?

A quick Bing search, and the answer was found in this blog post, which shows how to add your own IsNullOrEmpty user-defined function:

CREATE FUNCTION dbo.IsNullOrEmpty(@text NVARCHAR(4000))
RETURNS BIT
AS
BEGIN
    IF ISNULL(@text, '') = ''
    BEGIN
        RETURN 1
    END
    
    RETURN 0
END

GO

If @text is an empty string, it will match ”, and return 1 (true), if it’s null, ISNULL will substitute ” for its value, and again, it will return 1. For any other value, the function will return 0 (false). With the UDF in place, my logic for parsing the latlong is more robust:

CASE
   WHEN dbo.IsNullOrEmpty(E.latlong) = 0 THEN LEFT(E.latlong, CHARINDEX(',', E.latlong) - 1)
   ELSE '0'
END,
CASE
   WHEN dbo.IsNullOrEmpty(E.latlong) = 0 THEN RIGHT(E.latlong, LEN(E.latlong) - CHARINDEX('-', E.latlong) + 1)
   ELSE '0'
END

The above T-SQL, which is part of the query that returns the matching events, will only attempt to parse the latitude and longitude values if they exist. Otherwise, it will return ‘0’ for the values.

A couple of things to note:

The error message I was receiving didn’t mention either LEFT or RIGHT, but rather SUBSTRING. That made it a little trickier to figure out what was going on, but thankfully, the sproc was only 32 lines long, so it wasn’t hard to suss out where the problem was. This is an excellent argument, IMO, for keeping your sprocs small and tight, and moving any logic that isn’t specific to the sproc itself into user-defined functions or other appropriate locations. As with methods in your classes, the fewer responsibilities a sproc has, the easier it’ll be to troubleshoot when it breaks.
One thing I did not take advantage of in my stored procedure code (since remedied) is SQL Server’s support for TRY/CATCH blocks. By handling potential exceptions closest to their source, I can make my stored procedure (and hence my data service) more robust, and make it easier to determine what’s going on. I simply wrap the potentially problematic code with a BEGIN TRY / END TRY construct, and follow it up with a BEGIN CATCH / END CATCH. Inside the catch block, I can return my own custom error message, log the error, or handle it however I choose. In the case of this specific sproc, an exception will result in the service not returning any data, but that’s OK, as it will at least return, rather than hanging the ajax call.

Summary

So to wrap up, here are a few tips and points to remember when working with stored procedures and data services:

Keep in mind that by default, WCF Data Services does not return detailed exception information. This is a good thing, until you need to troubleshoot an exception…then you’ll want to remember config.UseVerboseErrors and IncludeExceptionDetailInFaults.
WCF Data Services exceptions will not be caught by ASP.NET’s Custom Error feature even if it’s enabled. As such, if you’re counting on Custom Errors to catch any problems you have in your code that don’t have specific exception handling written, you’ll be disappointed here. Make sure that your stored procedures (if used) and your data services code both have adequate exception handling in place, so that your service doesn’t fail completely when an exception arises.
Avoid inferring the existence of one column or value based on the value of another column…this was the root error on my part that led to my service crashing in the first place. While more robust exception handling would’ve mitigated the impact of the bug, testing for the value of the column I was actually trying to parse would’ve avoided the whole problem.
And a general tip for fellow speakers…ALWAYS check your demos BEFORE you get to the venue. Finding out about this problem even just a few hours earlier would likely have allowed me to fix it, rather than hack around it. I’ve been doing this long enough that I should know better, but clearly I needed a pointed reminder for why you don’t just assume that your demos that worked last week still work this week.

Hope you found this useful…if you did, please share this post with your friends!