Monday, March 26, 2012

Generating a Comma Separated List with an Implicit Recursion

As Dabblers, we love our data structured, ordered and indexed, but occasionally this runs up against the requirements of the data's consumers.  Developers who access our Microsoft SQL Servers will insist on working with comma separated lists, chopping up and concatenating data in their own, substandard, non-relational way.  In order to avoid dark mutterings, and to persuade them to continue to make us coffee, it is often useful to indulge them.

Returning nice structured recordsets as ugly comma (or tab, or bar) separated lists can seem onerous, and seems like a good candidate for a cursor.  As usual, cursors are the last things you need.  Instead, you can use a neat little trick that creates, in effect, a recursive structure that iterates through a recordset, by referencing a field within that recordset as a variable.

Here's some code:

Firstly, let's create some data


CREATE TABLE CSLTest (ID int, vval varchar(20))
GO
INSERT INTO CSLTest VALUES(1, 'abc'),(2, 'def'), (3, 'ghi')
GO

And here's the (rough and ready) code to get the comma separated list:


DECLARE @ CSLString  varchar(5000)
SELECT @ CSLString  = isnull(@ CSLString  + ', ','') + vval
FROM
(SELECT vval from CSLTest) A
SELECT @CSLString 

Let's walk through this, Dabblers.  There are two key elements here, both of them SELECT statements.  The inner select retrieves the main recordset that we will be working with. Crucially, however, it wraps it in parentheses, and gives it an alias.  Any use of Microsoft SQL Server will be familiar with aliasing, but some may not realise that it allows us to treat the aliased data as a single lump, even if it contains multiple rows.

The top level select is an example of us doing exactly this.  By referencing vval in the aliased data, we iterate through each record, and append it to the existing string (the isnull() is required to cope with the very first iteration, so that we don't add a comma at the beginning).

With a little bit of tweaking, we can put this functionality inside a User Defined Function.  This allows us to return the CSL from a function. Of course, most dabblers will be familiar with the Table Value functions available in newer versions of Microsoft SQL Server, but most developers have yet to see the light...


CREATE FUNCTION funReturnCSL
(@MinID int, @MaxID int)
RETURNS varchar(5000)
AS    BEGIN
      DECLARE @CSLString varchar(5000)
      SELECT @CSLString = ISNULL(@CSLString + ', ','') + vval
      FROM
      (SELECT vval FROM CSLTest WHERE ID BETWEEN @MinID AND @MaxID) A
      RETURN @CSLString
      END
GO

SELECT dbo.funReturnCSL(2,3)

Tuesday, March 13, 2012

The good RESEED

Identity fields in Microsoft SQL Server are a great tool for bringing order to chaos, providing a nice, predictable key to data that doesn't necessarily lend itself to a natural keys.  The problem is that, like taxis at a rank, you always have to take the next one that comes along.  This can be a problem for SQL Dabblers who want to bump the identity field up to a particular value.  Maybe you want all records created after a certain date to start at a particular nice round value, for instance.  Or you want to reserve particular values for testing or reference data.

Through the endless power of DBCC, you can do this.  The syntax is

DBCC CHECKIDENT ([yourtablename], reseed, [new value you want to start at])

(incidentally, Dabblers who are not familiar with DBCC should take some time to look into it.  It's exceptionally powerful, and rather dangerous - essentially the equivalent of opening SQL Server's casing, and hitting the innards with a spanner.)

So, that's all you need to know, right?

STOP!

As I mentioned, DBCC is powerful and dangerous, and the reseed operation is no exception.  Let's dabble in some code, and see what happens.


create table identitytest (ID int identity, vvalue int)
go
insert into identitytest (vvalue) VALUES (101),(102),(103)
select * from identitytest

The results here are what we expect.  The identity field starts at 1, and increments each time by 1, which is the default.


DBCC CHECKIDENT (identitytest, reseed, 10)
insert into identitytest (vvalue) VALUES (104) 

we've reset the identity value to 10, so the inserted record should have an ID of 10, right?

D'OH
select * from identitytest


11, eh?  So the thing to take away here is that after a reseed, the next inserted record gets an identity value of the reseed value + 1.  That's why we get 11 rather than 10.

BUT

This is only true for tables that have, or have had, records in them.  For a new 'virgin' table, the reseed behaves differently


create table identitytest2 (ID int identity, vvalue int)
DBCC CHECKIDENT (identitytest2, reseed, 10)
insert into identitytest2 (vvalue) VALUES (100)
select * from identitytest2



So, that only works with a brand new table, right?

NO!

Yeah, we're not done yet, dabblers.  Using TRUNCATE on a table causes SQL Server to view it as a 'new' table for the purposes of reseeding identity values.


truncate table identitytest2
DBCC CHECKIDENT (identitytest2, reseed, 20)
insert into identitytest2 (vvalue) VALUES (200)
select * from identitytest2

And you can see that we get 20, rather than 21.

So, to summarize,

DBCC CHECKIDENT ([tablename], reseed, x) means that the next row inserted will have an identity value of x+1 EXCEPT if the table has never had any rows in it EXCEPT if you truncate it first.  Have fun dabblers.

BUT HANG ON...

If you're following along on Microsoft SQL Server 2000 (no shame in that) then you're probably scratching your head right now (actually, you're probably calling me rude names, but I like to think well of my Dabblers).  This is because all the above is true only on later versions of SQL Server.  On 2000, reseeding will always give the next record the identity value of whatever you reseed as.

I've no idea what happens for those still using versions before 2000.  You guys really are weirdos :-D.