How Not to Future-Proof an Application

I was recently troubleshooting a reported application crash for a client along with a fellow engineer from the client (let’s call him Bob).  A message box along the lines of “An Error Happened, Good Luck” showed up when a user clicked a certain button. Google searches for problems with this application typically turn up one post from 2003 that’s totally unhelpful.


http://xkcd.com/979/

This application is a behemoth comprised of some VB6, .NET and who knows what else running against SQL Server 2000.  Bob and I have some experience with this application, so we started a SQL Profiler trace to look at what was happening when the user clicked “The Button”.  As it turns out, when clicked, it was executing a SQL statement that looked a lot like this:

SELECT * FROM SomeDataTable WHERE SurrogateKey =  ORDER BY SomeDataField

So if you know anything about SQL (spend the next few hours here if you don’t), this statement is wrong. Knowing the application, it looked suspiciously like there was code somewhere like this:

command = "SELECT * FROM SomeDataTable WHERE SurrogateKey = " + SurrogateKeyParam + " ORDER BY SomeDataField"

SurrogateKeyParam was obviously empty for some reason. So Bob started running some queries to look at the data in SomeDataTable, which looked like this:

SurrogateKey CreateDate DataField1 DataField2 DataField3 etc…
10000009 12/6/2012  
10000010 12/6/2012  
10000011 12/6/2012  
10000012 12/6/2012  

I halfway jokingly said to him, “Surely that surrogate key isn’t too big for some reason”. I think we both laughed in a scary “That might really bit it” sort of way, and then proceeded to run the numbers in our heads to see if 10,000,000 crossed some computer science wizard boundary somewhere. It was much too big to have crossed Int16 or UInt16 recently, and well below Int32’s maximum value. So we went back to the data from the day when this issue was first reported. It looked something like this:

SurrogateKey CreateDate DataField1 DataField2 DataField3 etc…
9999998 12/5/2012  
9999999 12/5/2012  
10000000 12/5/2012  
10000001 12/5/2012  

Based on the title of the post, you may already know where this is going. The issue was first reported on 12/5, and the values went from 7 to 8 digits on that day. We were both getting a “That can’t be a coincidence” feeling at this point, and Bob, in his East Coast accent, said something to the effect of “That can’t really be it”. We have both known there has to be some unplanned obsolescence in this application somewhere, and we were both hoping this wasn’t it.

So off to the “Test” environment we went. “Test” was well below reaching 8 digits for this value, and clicking “The Button” worked without a problem. Since the surrogate key values are allocated from a sequence table, we artificially set the value up to 9,999,999 and then clicked “The Button”. BAM! The “An Error Happened, Good Luck” message appeared. Confirmation of our worst fear, somewhere in the vast expanse of the GUI code mess that is this application, there was a variable or statement that was arbitrarily limiting the value of SurrogateKey to a length of 7 characters for who-knows-what reason. However, it wasn’t just trimming the value, somehow it was ending up NULL or empty.

Since we don’t have the source, we can only guess this is the result of incorrect data types and/or maximum lengths on an ADO Recordset, or worse (I guess) an = LEFT(SurrogateKey,7) is somewhere in the code. Who knows what it is really.

So after doing a quick search to see if anyone else was hiring so we could both quit before the helpdesk called back to check on our progress, we realized the data in this table is only used temporarily and then deleted.  Nothing else outside the process that “The Button” is used for references the SurrogateKey column.  Our fix was as easy as writing the following query:

UPDATE SequenceTable SET SurrogateKey=1 WHERE TableName=’SomeDataTable’

We also made the necessary sacrifices to the volcano in the event this breaks something, and may the Lords of Kobol help us when (not if), we see this error on a table that needs referential integrity. In the past, tinkering in the database manually has caused numerous problems, but we had no other resolution.

So next time you are thinking to yourself, “I’ll just shove these Int32 values into 7 character strings”, don’t. Plan for your application to be used 5 years past the “Surely they will upgrade by now” or “surely they will never get to X” date. This one is going on 6 years past.