Monday, March 26, 2012

Generating a Comma Separated List with an Implicit Recursion

As Dabblers, we love our data structured, ordered and indexed, but occasionally this runs up against the requirements of the data's consumers.  Developers who access our Microsoft SQL Servers will insist on working with comma separated lists, chopping up and concatenating data in their own, substandard, non-relational way.  In order to avoid dark mutterings, and to persuade them to continue to make us coffee, it is often useful to indulge them.

Returning nice structured recordsets as ugly comma (or tab, or bar) separated lists can seem onerous, and seems like a good candidate for a cursor.  As usual, cursors are the last things you need.  Instead, you can use a neat little trick that creates, in effect, a recursive structure that iterates through a recordset, by referencing a field within that recordset as a variable.

Here's some code:

Firstly, let's create some data


CREATE TABLE CSLTest (ID int, vval varchar(20))
GO
INSERT INTO CSLTest VALUES(1, 'abc'),(2, 'def'), (3, 'ghi')
GO

And here's the (rough and ready) code to get the comma separated list:


DECLARE @ CSLString  varchar(5000)
SELECT @ CSLString  = isnull(@ CSLString  + ', ','') + vval
FROM
(SELECT vval from CSLTest) A
SELECT @CSLString 

Let's walk through this, Dabblers.  There are two key elements here, both of them SELECT statements.  The inner select retrieves the main recordset that we will be working with. Crucially, however, it wraps it in parentheses, and gives it an alias.  Any use of Microsoft SQL Server will be familiar with aliasing, but some may not realise that it allows us to treat the aliased data as a single lump, even if it contains multiple rows.

The top level select is an example of us doing exactly this.  By referencing vval in the aliased data, we iterate through each record, and append it to the existing string (the isnull() is required to cope with the very first iteration, so that we don't add a comma at the beginning).

With a little bit of tweaking, we can put this functionality inside a User Defined Function.  This allows us to return the CSL from a function. Of course, most dabblers will be familiar with the Table Value functions available in newer versions of Microsoft SQL Server, but most developers have yet to see the light...


CREATE FUNCTION funReturnCSL
(@MinID int, @MaxID int)
RETURNS varchar(5000)
AS    BEGIN
      DECLARE @CSLString varchar(5000)
      SELECT @CSLString = ISNULL(@CSLString + ', ','') + vval
      FROM
      (SELECT vval FROM CSLTest WHERE ID BETWEEN @MinID AND @MaxID) A
      RETURN @CSLString
      END
GO

SELECT dbo.funReturnCSL(2,3)

Tuesday, March 13, 2012

The good RESEED

Identity fields in Microsoft SQL Server are a great tool for bringing order to chaos, providing a nice, predictable key to data that doesn't necessarily lend itself to a natural keys.  The problem is that, like taxis at a rank, you always have to take the next one that comes along.  This can be a problem for SQL Dabblers who want to bump the identity field up to a particular value.  Maybe you want all records created after a certain date to start at a particular nice round value, for instance.  Or you want to reserve particular values for testing or reference data.

Through the endless power of DBCC, you can do this.  The syntax is

DBCC CHECKIDENT ([yourtablename], reseed, [new value you want to start at])

(incidentally, Dabblers who are not familiar with DBCC should take some time to look into it.  It's exceptionally powerful, and rather dangerous - essentially the equivalent of opening SQL Server's casing, and hitting the innards with a spanner.)

So, that's all you need to know, right?

STOP!

As I mentioned, DBCC is powerful and dangerous, and the reseed operation is no exception.  Let's dabble in some code, and see what happens.


create table identitytest (ID int identity, vvalue int)
go
insert into identitytest (vvalue) VALUES (101),(102),(103)
select * from identitytest

The results here are what we expect.  The identity field starts at 1, and increments each time by 1, which is the default.


DBCC CHECKIDENT (identitytest, reseed, 10)
insert into identitytest (vvalue) VALUES (104) 

we've reset the identity value to 10, so the inserted record should have an ID of 10, right?

D'OH
select * from identitytest


11, eh?  So the thing to take away here is that after a reseed, the next inserted record gets an identity value of the reseed value + 1.  That's why we get 11 rather than 10.

BUT

This is only true for tables that have, or have had, records in them.  For a new 'virgin' table, the reseed behaves differently


create table identitytest2 (ID int identity, vvalue int)
DBCC CHECKIDENT (identitytest2, reseed, 10)
insert into identitytest2 (vvalue) VALUES (100)
select * from identitytest2



So, that only works with a brand new table, right?

NO!

Yeah, we're not done yet, dabblers.  Using TRUNCATE on a table causes SQL Server to view it as a 'new' table for the purposes of reseeding identity values.


truncate table identitytest2
DBCC CHECKIDENT (identitytest2, reseed, 20)
insert into identitytest2 (vvalue) VALUES (200)
select * from identitytest2

And you can see that we get 20, rather than 21.

So, to summarize,

DBCC CHECKIDENT ([tablename], reseed, x) means that the next row inserted will have an identity value of x+1 EXCEPT if the table has never had any rows in it EXCEPT if you truncate it first.  Have fun dabblers.

BUT HANG ON...

If you're following along on Microsoft SQL Server 2000 (no shame in that) then you're probably scratching your head right now (actually, you're probably calling me rude names, but I like to think well of my Dabblers).  This is because all the above is true only on later versions of SQL Server.  On 2000, reseeding will always give the next record the identity value of whatever you reseed as.

I've no idea what happens for those still using versions before 2000.  You guys really are weirdos :-D.

Saturday, February 25, 2012

Any equivalent Sunday...

Welcome, Dabblers, and let me begin with a small gift for you.  Our time is measured in a series of overlapping cycles - 60 minutes, 24 hours, 7 days, months that vary from 28 to 31 days in a year of 365 give or take.  Thus, handing dates can be a royal pain.  Apparently we have the ancient Egyptians to blame for this mess, but as they are no longer around to sue, we just have to fight our way through.

I recently had a requirement to generate a report comparing a set of figures to the nearest equivalent day on the previous year.  For example, the business wanted to compare figures for a two Thursdays, so pick a Thursday this year (say 23-02-2012) and find the nearest Thursday to the same date on the previous year (i.e. 24-02-2012).  This function returns you the date of that 'nearest equivalent day'.

create FUNCTION [dbo].[funGetSameDayLastYear] 

     (
     @inputdate date, -- the date which you wish to compare

     @yearoffset int = -1 -- the offset in years, e.g. -1 is last year, 3 is 3 years ahead etc.
     )
RETURNS smalldatetime

AS

BEGIN

     declare @inputDateDayOfWeek int
     declare @equivalentDateOnOffsetYear date
     declare @equivalentWeekdayOnOffsetYear int
     declare @dayOffset int
     declare @dateAdjustment int
     declare @adjustedDay smalldatetime

     SET @inputDateDayOfWeek =  datepart(dw,@inputdate)
     SET @equivalentDateOnOffsetYear = DATEADD(YY,@yearoffset,@inputdate)

     SET @equivalentWeekdayOnOffsetYear = DATEPART(dw,@equivalentDateOnOffsetYear)
     SET @dayOffset = @inputDateDayOfWeek - @equivalentWeekdayOnOffsetYear

     -- there is probably formula for this
     -- but it escapes me

     SELECT @dateAdjustment =
           CASE
                 WHEN @dayOffset = 0 THEN 0
                 WHEN @dayOffset IN (1,-6) THEN 1
                 WHEN @dayOffset IN (2,-5) THEN 2
                 WHEN @dayOffset IN (3,-4) THEN 3
                 WHEN @dayOffset IN (4,-3) THEN -3
                 WHEN @dayOffset IN (5,-2) THEN -2
                 WHEN @dayOffset IN (6,-1) THEN -1
           END

     SET @adjustedDay = dateadd(day, @dateAdjustment,@equivalentDateOnOffsetYear )

     RETURN(@adjustedDay)

END