SQL User Group Wrap Up

At Tuesday's Cincinnati SQL User Group meeting, Maria Stone of Strategic Data Systems (SDS) gave a presentation titled Advanced Transact-SQL Programming in SQL Server 2005.  As you might guess from the title, it was a detailed, low-level presentation.  She covered five topics that she described as most useful in day-to-day work: Try-Catch, recursive queries, temporal data, ranking and pivots.

Structured error handling, or Try-Catch has been sorely needed in T-SQL for quite some time and is a capability we software developers just assume will be in the language we're using.  Prior to SQL 2005, to ensure that stored procedures ran successfully, you needed to check after every statement for failure.  Beginning with 2005, you can just use the Try-Catch statement.

She also discussed using the XACT_STATE function to determine if there is a current transaction and whether it can be committed. Using XACT_STATE you can more specifically control what happens when a failure occurs.

Maria next discussed recursive queries or Common Table Expressions.  She gave several examples but the most illuminating was writing a query to determine the number of rivets necessary to build a car.  The sample table she used had three columns: part, subpart, and quantity.  Using a Common Table Expression, she wrote one query to recursively join the table to itself and total the number of each item necessary to construct the car.

She then addressed potential logic and performance issues with temporal data, or date-time columns.  She identified a common mistake that developers make, which is converting a date-time to a string in a where clause.  The primary issue with this is performance because it will likely prevent SQL from using an index.  Rather than converting date-time value to strings, she recommends using the binary comparisons provided by the BETWEEN, less than and greater than operators.  She also talked about the common logic error of assuming a date-time value stores only a date and not a time.  Every date-time value in SQL has an associated time, whether you specify one or not.   So if you're trying to select all of the rows for a particular date, you need to do something like:

SELECT * FROM foo WHERE SomeDate BETWEEN '2/13/2008' AND '2/14/2008'

to select all rows with SomeDate values falling on 2/13.

Maria next discussed the ranking functions: RANK, DENSE_RANK, NTILE, and ROW_NUMBER.  RANK and DENSE_RANK are similar and assign a rank to a row.  Both allow ties.  The difference between them is that DENSE_RANK guarantees consecutive ranks (without gaps), and RANK does not.  NTILE can be used to group rows in percentiles.  And ROW_NUMBER does what you would think, which is to number rows consecutively (which means there are no ties.)

The last item Maria covered was the PIVOT and UNPIVOT commands, both of which are used to translate columns to rows and rows to columns.  The PIVOT command converts normalized data to denormalized data and UNPIVOT converts denormalized data to normalized data.  PIVOT does what has ben made (in)famous by Excel Pivot Tables, in a more static way.  You can use UNPIVOT to normalize data from a table that has months for columns and return one row for each month.

The presentation and supporting sample SQL files should soon be available on the Cincinnati SQL User Group site.  They also announced that  Craig Utley will be speaking at the Business Intelligence (BI) SIG on March 14.

0 comments: (+add yours?)