Author: DB Dave

  • Identifying unique indexes that aren’t defined as unique

    Confused? So am I, but after changing the title of this post about a dozen times this is what we’re left with… 🙂

    I was in the process of examining some indexes on our test Denali server (see previous posts) when I decided to check how many of our non-unique, non-clustered indexes are actually made up of a unique column (or a unique combination of columns in the case of a compound index). In other words, which non-clustered indexes could I convert to UNIQUE non-clustered indexes? I whipped up a script to do this and thought I’d share.

    If you’re wondering why I might want to do this: I’m interested to see what impact it has on index storage, as well as on read performance. I’m by no means saying you should go and recreate all of your NC indexes as unique – remember that this script will return all indexes where the *current* data in the table is distinct. There are no guarantees that future inserts & updates won’t cause the data to become non-unique (and break your application).
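
    To illustrate that last point with a hypothetical table and index name: once an index is recreated as UNIQUE, any future insert (or update) that duplicates the key simply fails:

    -- Hypothetical example: if IX_Orders_RefNo is recreated as UNIQUE,
    -- a duplicate value is rejected instead of being stored:
    INSERT INTO dbo.Orders (RefNo) VALUES ('ABC123');
    -- Msg 2601, Level 14, State 1
    -- Cannot insert duplicate key row in object 'dbo.Orders' with unique index 'IX_Orders_RefNo'.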

    Here’s the script. It’s pretty straightforward, and simply returns a list of tables/indexes at the moment, but it could easily be changed to output the actual CREATE / DROP_EXISTING commands instead (there’s a sketch of that after the script).

    SET NOCOUNT ON;
    DECLARE @i INT = 1,
    		@rc INT,
    		@cmd NVARCHAR(2000),
    		@tablename NVARCHAR(300),	-- wide enough for a quoted schema.table name
    		@indexname NVARCHAR(130),	-- wide enough for a quoted index name
    		@collist NVARCHAR(1000),
    		@rowcount INT,
    		@rowcount_d INT;
    
    DECLARE @tt_ncidxs TABLE (rowid INT IDENTITY, [object_id] INT, [object_name] NVARCHAR(300), [index_id] INT, index_name NVARCHAR(130), [index_columns] NVARCHAR(1000), is_unique BIT);
    
    WITH IndexColumns AS
    (
    	SELECT	DISTINCT
    			i.[object_id],
    			QUOTENAME(OBJECT_SCHEMA_NAME(i.[object_id])) + '.' + QUOTENAME(OBJECT_NAME(i.[object_id])) AS [object_name],	-- schema-qualified so the dynamic SQL below works for non-dbo tables
    			i.[index_id],
    			QUOTENAME(i.[name]) AS [index_name],
    			STUFF(
    					(
    					SELECT	', ' + QUOTENAME(COL_NAME(t2.[object_id], t2.column_id)) AS 'data()'
    					FROM	sys.index_columns AS t2
    					WHERE	c.[object_id] = t2.[object_id]
    					AND		c.index_id = t2.index_id
    					AND		t2.is_included_column = 0	-- only key columns count towards uniqueness
    					ORDER	BY t2.key_ordinal			-- keep columns in key order
    					FOR		XML PATH ('')
    					), 1, 2, ''
    				) AS 'index_columns'
    	FROM	sys.indexes AS i
    	JOIN	sys.index_columns AS c ON c.[index_id] = i.[index_id] AND c.[object_id] = i.[object_id]
    	JOIN	sys.objects AS so ON i.[object_id] = so.[object_id]
    	WHERE	i.[type] = 2			-- nonclustered indexes only
    	AND		i.is_unique = 0
    	AND		i.is_unique_constraint = 0
    	AND		so.is_ms_shipped = 0
    	AND		so.[type] = 'U'
    )
    INSERT	@tt_ncidxs ([object_id], [object_name], [index_id], [index_name], [index_columns])
    SELECT	[object_id],
    		[object_name],
    		[index_id],
    		[index_name],
    		[index_columns]
    FROM	IndexColumns;
    
    SELECT @rc = @@ROWCOUNT;	-- capture the row count immediately after the INSERT
    --SELECT * FROM @tt_ncidxs
    
    WHILE @i <= @rc
    BEGIN
    	SELECT @cmd = NULL, @rowcount = NULL, @rowcount_d = NULL, @tablename = NULL, @indexname = NULL, @collist = NULL;
    
    	SELECT	@tablename = [object_name],
    			@indexname = index_name,
    			@collist = [index_columns]
    	FROM	@tt_ncidxs
    	WHERE	rowid = @i;
    
    	RAISERROR('Table %s column list: %s', 10, 1, @tablename, @collist) WITH NOWAIT;
    
    	SET @cmd = 'SELECT @x_rowcount = COUNT(*) FROM ' + @tablename + ' WITH (NOLOCK);'
    	EXEC sp_executesql @cmd, N'@x_rowcount INT OUTPUT', @x_rowcount = @rowcount OUTPUT;
    
    	SET @cmd = 'SELECT @x_rowcount_d = COUNT(*) FROM (SELECT DISTINCT ' + @collist + ' FROM ' + @tablename + ' WITH (NOLOCK) ) x;'
    	EXEC sp_executesql @cmd, N'@x_rowcount_d INT OUTPUT', @x_rowcount_d = @rowcount_d OUTPUT;
    
    	RAISERROR('Index %s on table %s selectivity = %i/%i', 10, 1, @indexname, @tablename, @rowcount_d, @rowcount) WITH NOWAIT;
    
    	IF (ISNULL(@rowcount, 1) = ISNULL(@rowcount_d, 2)) -- i.e. column values are (currently) distinct/unique
    	UPDATE @tt_ncidxs
    	SET is_unique = 1
    	WHERE rowid = @i;
    
    	SET @i += 1;
    END
    
    -- Get a list of non-unique nonclustered indexes whose index columns are currently distinct.
    -- These *may* be candidates for conversion to UNIQUE nonclustered indexes instead.
    -- NB: PLEASE MAKE 100% SURE THAT THE COLUMNS ARE ALWAYS GOING TO BE UNIQUE BEFORE CREATING A UNIQUE INDEX
    SELECT	*
    FROM	@tt_ncidxs
    WHERE	is_unique = 1
    ORDER	BY [object_name], [index_id]
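
    As an example of the “output the DDL” idea mentioned above, the final SELECT could be swapped for something like this (a rough sketch – note it ignores column sort order and any INCLUDE columns, so treat the generated commands as a starting point, not finished scripts):

    SELECT	'CREATE UNIQUE NONCLUSTERED INDEX ' + index_name
    		+ ' ON ' + [object_name] + ' (' + [index_columns] + ')'
    		+ ' WITH (DROP_EXISTING = ON);' AS [create_command]
    FROM	@tt_ncidxs
    WHERE	is_unique = 1
    ORDER	BY [object_name], [index_id];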

    Cheers
    DB Dave

  • SQL Breakfast

    This morning we (i.e. my team and I) tried something we’ve spoken about for some time, but hadn’t got around to doing until now: a “Tech Breakfast”.

    So I stopped at a bakery on the way to work to pick up some yummy baked goods, and all 3 of us piled into a meeting room at 07:30 with our individual beverage of choice in hand, and watched Kimberly Tripp’s “Indexing Internals” MCM presentation and demo.

    It went down pretty well, so I think we’ll make it a regular thing (maybe every 2nd Friday morning).  If you work in a team, and feel like mixing things up a little, give something like this a try.  At the very least it exposes people to things they might not have otherwise learnt themselves, and also sparks discussion in the team about the material being viewed, and how it may be relevant in their own environments or projects.

    All in all, not a bad way to kick off a Friday. 🙂

    Cheers,
    DB Dave

  • Our SQL Server “Denali” experiences – Part 2

    Where has the week gone? I finally got round to continuing my SQL Denali testing today – with very promising results in fact!

    To give you an idea of the environment (which I brushed over in the previous post) and a little more context, we have a number of SQL servers which sit between our web servers, and our main core databases. These act as “read-only” search, browse and caching servers for the Trade Me website. They each hold the same data, have the same configuration, and run on the same hardware, which makes it relatively easy to add new servers (i.e. scale horizontally) when necessary.

    Add to the mix some clever networking kit, and a few in-house developed bits & pieces, and we can finely control the amount (and type) of load on each individual server. Very handy for testing new hardware and software, service packs, etc.

    So, just after lunch time this afternoon, I very gingerly put 1% of our search load on the Denali test box. To cut a long story short, over the course of the next few hours I increased the load until we were running 50% of all search load on the single Denali test server. Trust me when I say that this is pretty damn awesome. 🙂

    The Denali test server is exactly the same, hardware-wise, as the existing SQL 2005 servers. We’ve tried on a few occasions to upgrade to SQL 2008, but a pretty well documented full-text related performance bottleneck meant we could never get more than 5-10% of search load onto a single box. The issue was the full-text merge process causing excessive locking on the underlying indexed table – which in our environment is constantly changing, as well as being constantly queried – so we’ve stuck with 2005 thus far.

    Although everything went very well, there were a few weird things I ran into while getting the server ready. Firstly, I needed to shrink a tempdb data file that had blown out, but the shrink-file operation never completed (and actually timed out once or twice after around 5 minutes, which I’ve never come across before). Because I couldn’t shrink the data file I thought I’d just restart the SQL service, which should recreate tempdb at its original (i.e. smaller) size, but this wasn’t the case: the initial size was correctly set, but after a restart of the SQL service the tempdb file size was unchanged. I’ll need to investigate this further next week, so will post any new findings.
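
    For reference, this is the sort of shrink that kept timing out (the file name and target size here are made up):

    USE tempdb;
    -- Try to shrink the primary tempdb data file back down to ~1GB
    DBCC SHRINKFILE (tempdev, 1024);	-- target size in MB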

    That’s all for this week’s testing.
    Cheers
    DB Dave

  • Snapshot isolation and tempdb growth

    Over the weekend we were alerted to a server running low on disk space. Not particularly exciting in itself, but the fact that it was happening on a relatively newly built SQL 2008 database server with a small database (~2.5GB) was interesting.

    This particular database belongs to one of our newer and smaller websites, and has READ_COMMITTED_SNAPSHOT enabled (we did this to prevent excessive blocking, which we ran into and couldn’t easily control any other way, since this site is built using an ORM which generates the SQL commands on the fly).
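
    In case you haven’t used it, the setting is enabled per database, something like this (database name is hypothetical):

    -- Needs exclusive access to the database to take effect, hence the ROLLBACK IMMEDIATE
    ALTER DATABASE [OurWebsiteDb] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;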

    Anyway, getting back to the point: disk space was low because the tempdb data file was using around 15GB of space. I assumed that someone was explicitly doing something to cause the blowout (index rebuilds, a gnarly query, etc.) but I couldn’t find any culprits straight away.
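
    A quick way of confirming that it’s the version store (rather than, say, sort or hash spills) eating tempdb is the file space usage DMV – a rough sketch:

    -- How much of tempdb is reserved by the version store vs. still free
    SELECT	SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb,
    		SUM(unallocated_extent_page_count) * 8 / 1024 AS free_mb
    FROM	sys.dm_db_file_space_usage;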

    I then checked the sys.dm_tran_session_transactions DMV (after reading through one of Raj’s posts), which led me to one particularly long-running session (over 14 days, in fact!). There were no open transactions that I could see, and the session itself had been suspended for practically all of those 14 days, so the reason I hadn’t seen it earlier was that most of my queries were inner-joining to sys.dm_tran_session_transactions, which didn’t have a corresponding row for this session. The session did hold a shared schema lock on 2 tables though, which I’m guessing is why the version store wasn’t flushing.
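
    Something along these lines (a rough sketch, not the exact query I used) will surface long-lived sessions whether or not they have a matching row in sys.dm_tran_session_transactions:

    -- Long-lived sessions; transaction_id is NULL when there's no open transaction
    SELECT	s.session_id,
    		s.login_time,
    		s.status,
    		s.last_request_end_time,
    		t.transaction_id
    FROM	sys.dm_exec_sessions AS s
    LEFT	JOIN sys.dm_tran_session_transactions AS t ON t.session_id = s.session_id
    WHERE	s.is_user_process = 1
    ORDER	BY s.login_time;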

    I had thought that the version store was only maintained while there was an open transaction though – so either there was an open transaction and I simply didn’t see it (I ran “DBCC OPENTRAN” with no results) or it was enough that the session held schema locks. Anyone have any ideas on this?

    The last command executed by the SPID was a relatively common & trivial one from the website, so it looks like it was an orphaned connection from the application – although how that happened, we’re not sure. I killed the SPID and was then able to shrink tempdb back down to its usual size.

    I haven’t had much exposure to snapshot isolation, but at least now know to check for this if we have problems with tempdb growth in future. 🙂

    Cheers
    DB Dave

  • Our SQL Server “Denali” experiences – Part 1

    I got around to installing SQL Server 2011 (Denali) on a production server last week, and fortunately it’s been pretty slick so far. And in case you’re wondering why it’s on a production server instead of a dev or test box, this particular SQL server is one of many that we use as a “read-only” layer between the webs and our core databases, so it’s actually pretty easy for us to add a new production server and siphon a small amount of load onto it. We can then slowly ramp up the amount of production load to see how it performs. If it turns to custard, we can quickly remove load from it as well.

    To start with I had to subscribe the database to a few of our core SQL 2005 transactional replication publications, which was (fortunately) totally uneventful.

    We use full-text quite extensively, and going from SQL 2005 to Denali meant that I needed to create a new filegroup for the full-text index to use. Strictly speaking I didn’t have to create a new physical file for that filegroup, but we put our full-text catalogues on RAM drives, so I needed one in order to control where the index ends up. Once that was done, setting up full-text itself wasn’t very different from SQL 2005 (other than needing to specify a filegroup destination rather than a directory path).
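
    Roughly what that looks like (all the names and the RAM-drive path below are made up):

    -- A dedicated filegroup and file for the full-text index
    ALTER DATABASE [OurDb] ADD FILEGROUP [FullTextFG];
    ALTER DATABASE [OurDb] ADD FILE (NAME = N'OurDb_ft', FILENAME = N'R:\FT\OurDb_ft.ndf', SIZE = 512MB) TO FILEGROUP [FullTextFG];
    
    CREATE FULLTEXT CATALOG [OurCatalog];
    CREATE FULLTEXT INDEX ON dbo.Listings ([Title] LANGUAGE 1033)
    	KEY INDEX [PK_Listings]
    	ON ([OurCatalog], FILEGROUP [FullTextFG]);	-- filegroup instead of a directory path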

    Once replication latency had come down, and the full-text catalogue was fully populated (which didn’t take long at all), I needed to synchronise the database objects with another production server to make sure there were no differences such as missing tables and indexes, different versions of stored procedures, etc. This was the biggest hurdle so far, since our current comparison tool (Red Gate SQL Compare) doesn’t yet support Denali. I tried a few alternative products, and eventually found a tool called “SQL Delta” which worked! It’s only an evaluation copy though, so for the next 14 days I’ll be okay… after that I’ll need to figure something else out. 🙂

    That’s as far as I’ve managed to get so far, mostly because actual work keeps getting in the way of the cool stuff (like testing new hardware, new versions of SQL, etc.), so I’ll follow up next week with another post on how the first round of testing goes.

    Now go on and read part 2.

    Cheers
    DB Dave

  • Where am I?

    Well, right now I’m sitting in a hospital room waiting for my wife to come out of surgery – nothing too serious. In the meantime it feels like I’ve got a rather nice little office, in fact: a comfy(ish) chair and desk in a private ward with good free wi-fi access, and coffee just down the hall.
    But I’m not here to write about my current physical surroundings, so, moving swiftly along…

    Many of us regularly connect to more than one SQL Server instance, even if it’s just your dev/test instances and production.  This means we need to be sure which server we’re connecting to (although chances are it doesn’t often cross your mind).

    A couple of folks here in the database team have, in the past, come across some strange behaviour in Management Studio, where the bottom right-hand side of the query window displays one server/instance name, but running a select @@servername returns a completely different server name!

    We’ve also seen a case where scripting from the GUI (I think we were scripting out a shrink-file command) created the script as usual, but the resulting query window was connected to a different server from the one we were scripting against.

    We never nailed down exactly what caused the issues, but I’ve never seen it myself with the latest patched version of SQL 2008 tools, so hopefully it was a rare bug that has since been fixed.

    Nevertheless, the thought of unintentionally truncating or dropping some production tables doesn’t fill me with warm fuzzy feelings, so I’m a little paranoid now.  If I’m running something potentially risky I will often check the server name, just in case.  I’ve set up “select @@servername” as a keyboard shortcut in Management Studio’s keyboard settings, so I just hit CTRL-0, and the server name pops up. Easy. 🙂
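
    The shortcut itself just runs a trivial query – extended slightly here to show the current database and login too:

    -- Set up as a query shortcut in SSMS (Tools > Options > Environment > Keyboard)
    SELECT	@@SERVERNAME AS [server_name],
    		DB_NAME() AS [database_name],
    		SUSER_SNAME() AS [login_name];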

    Right, I’m off to find this mythical free coffee machine…

    Cheers
    DB Dave

    Edit: I’ve just found a blog post from Jason Jarrett from a few years ago, who seems to have come across a similar (if not the same) problem with SQL 2008 SSMS.  There’s also a link to a Connect item where Microsoft have commented (the old “it’ll be fixed in a future version” gem).