T-SQL Tuesday #102 – Giving Back

This month’s T-SQL Tuesday has been chosen by Riley Major and invites people to give back to the community.

To give you a little background on me: I began working at a startup a couple of years ago. It’s a software house with a product that’s based on SQL Server at the back end. They’d never had a proper development DBA and had reached the point where they couldn’t put off hiring one for much longer. Enter the sandman.

One thing I’ve done here is train people on the depths of SQL Server and give them the tools to write faster and more reliable code. I realised this year that I’ve spent a lot of time with people one-on-one, which has been very effective, but the downside is that it’s very time-consuming on my part and it can be difficult for people to find time with me for training.

I decided to commit to running training sessions for one lunchtime a month for at least a full year (bolded to hold me to this). I’ll then make them available to anybody who thinks they’d benefit from the training. We’re now part of a much larger company (the startup was bought out) and have colleagues across the globe, so making this training available to them was essential. The first session had a full room and the second (tomorrow) is looking just as busy, so that’s good news. I’ve also made sure the recordings are available on our document management system so that remote employees, or people who just couldn’t make it on the day, can watch them at their convenience.

I’ve got an idea of some future topics, but I’m also making sure to take note of feedback and designing future sessions around what people think will be interesting.

I’ve also spoken at a couple of small events, and I’m planning to use the ideas from the internal talks to submit sessions to local events (SQL Saturdays, user groups etc.). I’ll update here when I find out whether my submissions have been successful.

I also need to blog more, but that’s easier said than done….

T-SQL Tuesday #101 – Building your Emergency Toolkit

This month’s T-SQL Tuesday is being hosted by Jens Vestergaard. The idea is to speak about essential SQL Server tools and how you use them in your environment.

This is a subject that fascinates me. Over time I’ve built a set of tools that makes analysis of servers much easier and quicker. These are mostly community tools, or tools inspired by others’ blog posts or Stack Overflow answers. I’m looking forward to checking out everybody else’s posts on this subject.

I won’t re-hash what I’ve said previously, but here’s the presentation I gave about this subject on GroupBy.org.

And here’s a link to the page about it.

T-SQL Tuesday #98 – My Technical Challenge – Clustered Index Conversion

This month’s T-SQL Tuesday subject has been selected by Arun Sirpal and is titled Your Technical Challenges Conquered. For my post I’m going to talk about the challenge of changing a Non-Clustered Primary Key into a Clustered one. This sounds relatively simple but gets more complicated as you delve into it.

First of all, what are the reasons for doing this? A table without a clustered index is called a heap. Heap tables do have their place, but they’re not appropriate for most workloads in a relational database: they’re great if you need to insert a lot of data into a table quickly, but reading from a heap is notoriously slow in most instances. Heap pages are stored wherever SQL Server finds space, which can be scattered around your disk, whereas a clustered index keeps the data ordered and stored together. You can monitor how many forwarded record fetches you’re getting from heap tables to see whether this particular issue is a problem for you.
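If you want a quick starting point for that check, something like this sketch against sys.dm_db_index_physical_stats will surface heaps with forwarded records (run it out of hours if you can; the DMF itself is IO-heavy):

```sql
-- Forwarded records only exist on heaps (index_id = 0), and the
-- forwarded_record_count column is only populated in SAMPLED or DETAILED mode.
SELECT OBJECT_SCHEMA_NAME(ps.object_id) + '.' + OBJECT_NAME(ps.object_id) AS Table_Name
    ,ps.forwarded_record_count
    ,ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ps
WHERE ps.index_id = 0 -- heaps only
    AND ps.forwarded_record_count > 0
ORDER BY ps.forwarded_record_count DESC
```

A high forwarded record count relative to page count is a good hint that reads from that heap are paying for extra lookups.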

Let’s assume that it is an issue and we want to add a clustered index to our table. A quick fix would be to add a clustered index alongside the other indexes we already have on the table. That approach is a little messy, though, and isn’t the nicest way to fix this particular issue. There are exceptions, but generally if you have a primary key on a table then it will also be your clustering key (use cases may vary).

Unfortunately there isn’t a simple way to alter a table in order to turn a non-clustered PK into a clustered one, because it’s such a fundamental change to how the data is stored on disk. In order to make this change you need to drop your existing primary key and then recreate it as a clustered primary key. Sounds simple, right? Well, it can be, but it’s not quite that simple.
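Ignoring the complications for a moment, the core of the change looks something like this (PKTest_Main is the sample table from the scripts; the constraint and column names here are illustrative):

```sql
-- Drop the existing non-clustered primary key...
ALTER TABLE dbo.PKTest_Main DROP CONSTRAINT PK_PKTest_Main;

-- ...and recreate it as a clustered primary key.
-- This rebuilds the table on disk, so expect it to take a while on large tables.
ALTER TABLE dbo.PKTest_Main ADD CONSTRAINT PK_PKTest_Main PRIMARY KEY CLUSTERED (ID);
```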

One thing about primary keys is that they can be referenced by foreign keys from other tables, and you can’t drop a PK while there are FKs using it. Great: we’ll have to script all of these out, drop them, then recreate them after we’ve altered our table. We’re also going to assume that people have been able to alter our tables outside whatever source control we have, so we’ll have to do this dynamically.
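The referencing foreign keys are all visible in the catalog views, so a starting point for generating the DROP statements dynamically might look like this (a sketch only; recreating them afterwards also needs the column lists from sys.foreign_key_columns):

```sql
-- Generate a DROP statement for every foreign key referencing our table.
SELECT 'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id))
     + '.' + QUOTENAME(OBJECT_NAME(fk.parent_object_id))
     + ' DROP CONSTRAINT ' + QUOTENAME(fk.name) AS Drop_Script
FROM sys.foreign_keys fk
WHERE fk.referenced_object_id = OBJECT_ID('dbo.PKTest_Main');
```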

Another thing we wanted to check for was whether somebody had created a clustered index on the table that wasn’t on the primary key. In this use case I needed the clustered index to be on the primary key, so we store the details of the other clustered index and drop that as well.
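Detecting that situation is straightforward from sys.indexes; a minimal sketch:

```sql
-- Is there a clustered index on the table that isn't the primary key?
SELECT i.name AS Clustered_Index_Name
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.PKTest_Main')
    AND i.type = 1           -- 1 = clustered
    AND i.is_primary_key = 0 -- ...but not backing the PK
```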

You can get the sample scripts from my GitHub repo.

Script One – We’re just going to make some sample tables so that we can understand how this process works. Do not create these tables in a database that you care about (hands off production!).

Script Two – This shows the current status of our keys and how the tables relate to each other. Run this once you’ve run script one so you can see how it looks. Keep it around and we’ll run it after script three too. You should see something like this;

Take a look at the Foreign keys based on our main table and the clustered/nonclustered indexes. Check script one for more details on these.

Script Three – This is our script that’s going to actually do the work. Have a read through this script to get an idea of what it does. You’ll notice that I’ve commented out the execution statements. This is so that you’ll see what it’s planning to do in your messages tab in SSMS before it actually does anything. If you run it, you should see something like this;

As you can see, it wants to drop the two foreign keys first, then drop the existing clustered index on our PKTest_Main table. It will then recreate the primary key as a clustered one and, once that’s done, reapply the foreign keys. In this instance it also won’t reapply our old clustered index, as it’s no longer necessary.

If you copy these messages into a new window and run them (or un-comment the exec statements in script three) it’s going to actually make the alterations above. Once this is done, the results from script two should look something like this;

You’ll notice that our primary key is now clustered and the unnecessary previous clustered index is no longer present. Lovely.

In production (and in change control) you’d probably want this as a stored procedure with parameters such as @TableName and (probably) @Execute = 1 (defaulting to 0) so that you don’t fire the proc by accident.
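A skeleton of that wrapper might look something like this (the procedure name and body are hypothetical; the parameter pattern is the point):

```sql
-- Hypothetical wrapper with a "print only" default, so nothing runs by accident.
CREATE PROCEDURE dbo.ConvertPKToClustered
    @TableName sysname
   ,@Execute bit = 0
AS
BEGIN
    DECLARE @sql nvarchar(max);

    -- ... build the FK drop/recreate and PK conversion statements for @TableName here ...

    IF @Execute = 1
        EXEC sys.sp_executesql @sql; -- actually make the change
    ELSE
        PRINT @sql;                  -- review the generated script first
END
```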

Let me know your thoughts on this, and whether you’d make any changes to any of the scripts.


Visualising the Marvel Cinematic Universe in T-SQL

This post is inspired by a blog post I saw here which showed an interesting way to visualise data. The example wasn’t valid syntax for SQL Server, so I thought I’d make a version that does work in T-SQL.

Let’s make some sample data. We’re going to use a list of all title characters (heroes with their own independent movies) in the MCU and which films they’ve played a role in.

CREATE TABLE #MCUData (Hero_Name nvarchar(40), Movie_Name nvarchar(40), Release_Year int)
INSERT INTO #MCUData (Hero_Name, Movie_Name, Release_Year) VALUES
 ('Iron Man', 'Iron Man', 2008)
,('The Hulk', 'The Incredible Hulk',2008)
,('Iron Man', 'Iron Man 2', 2010)
,('Thor', 'Thor', 2011)
,('Captain America', 'Captain America: The First Avenger', 2011)
,('Iron Man','Marvels The Avengers',2012)
,('The Hulk','Marvels The Avengers',2012)
,('Thor','Marvels The Avengers',2012)
,('Captain America','Marvels The Avengers',2012)
,('Iron Man', 'Iron Man 3', 2013)
,('Thor', 'Thor: The Dark World',2013)
,('Captain America', 'Captain America: The Winter Soldier',2014)
,('Star Lord', 'Guardians of the Galaxy',2014)
,('Iron Man','Avengers: Age of Ultron',2015)
,('The Hulk','Avengers: Age of Ultron',2015)
,('Thor','Avengers: Age of Ultron',2015)
,('Captain America','Avengers: Age of Ultron',2015)
,('Ant Man','Ant Man',2015)
,('Captain America','Captain America: Civil War',2016)
,('Iron Man','Captain America: Civil War',2016)
,('Thor','Captain America: Civil War',2016)
,('The Hulk','Captain America: Civil War',2016)
,('Ant Man','Captain America: Civil War',2016)
,('Spider-Man','Captain America: Civil War',2016)
,('Doctor Strange','Doctor Strange',2016)
,('Star Lord','Guardians of the Galaxy Vol. 2',2017)
,('Spider-Man','Spider-Man: Homecoming',2017)
,('Thor','Thor: Ragnarok',2017)
,('The Hulk','Thor: Ragnarok',2017)

Now, I wonder which character has appeared in the most movies. Let’s work that out. The nifty thing here is that we’re going to see the results in SSMS as a bar chart built from rows; try this;

SELECT mcu.Hero_Name
    ,COUNT(*) Movie_Count
    ,REPLICATE('|',(COUNT(*)/CONVERT(decimal,m.Movies))*100) Graph
FROM #MCUData mcu
CROSS JOIN (SELECT COUNT(DISTINCT Movie_Name) Movies FROM #MCUData) m
GROUP BY mcu.Hero_Name, m.Movies
ORDER BY mcu.Hero_Name

Take a look at the results pane: you can see that Iron Man and Thor are the most recurring Avengers, appearing in 35.3% of the movies each.

Please feel free to point out all of the reasons the list of characters & movies is wrong in the comments section below. I doubt I’d be able to stop you 🙂

Disk Speed Check

I regularly need to check the speed of disks on servers I jump onto. I found an excellent resource from a Microsoft blogger named David Pless;


It uses a number of counters from sys.dm_io_virtual_file_stats, most of which are cumulative totals in milliseconds and bytes, so they need to be aggregated for readability.
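As a rough idea of the shape of that aggregation, here’s a simplified sketch (David’s script does considerably more):

```sql
-- Average read/write latency per database file.
-- The stall and byte counters are cumulative since the last SQL Server restart.
SELECT DB_NAME(vfs.database_id) AS Database_Name
    ,mf.physical_name
    ,CASE WHEN vfs.num_of_reads = 0 THEN 0
          ELSE vfs.io_stall_read_ms / vfs.num_of_reads END AS Avg_Read_Latency_ms
    ,CASE WHEN vfs.num_of_writes = 0 THEN 0
          ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS Avg_Write_Latency_ms
    ,vfs.num_of_bytes_read / 1048576 AS MB_Read
    ,vfs.num_of_bytes_written / 1048576 AS MB_Written
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf
    ON vfs.database_id = mf.database_id
   AND vfs.file_id = mf.file_id
ORDER BY Avg_Read_Latency_ms DESC
```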

The script wasn’t *exactly* what I was after so I modified it to suit my needs. Here’s a sample of the output;

You’re more than welcome to grab a copy of this from my GitHub. Feel free to make a pull request if you think anything can be improved;


Temp Tables and Named Constraints

While doing some testing with NEWSEQUENTIALID() I came across an interesting issue that may come back to bite us in the future. I was making a proof of concept of a basic script using temporary tables, something like this (I’ve taken out the unnecessary bits);

IF OBJECT_ID('tempdb..#NewIDTest') IS NOT NULL DROP TABLE #NewIDTest

CREATE TABLE #NewIDTest (RowNum int, ID uniqueidentifier CONSTRAINT DF_NewIDTest_ID DEFAULT NEWSEQUENTIALID())

The whole test script worked just fine, until I sent it to a colleague to help him understand the concept. When he attempted to run the script he was met with the following error;

Msg 2714, Level 16, State 5, Line 4
There is already an object named ‘DF_NewIDTest_ID’ in the database.
Msg 1750, Level 16, State 0, Line 4
Could not create constraint or index. See previous errors.

It turns out that even on (local) temporary tables, constraint names still need to be unique within tempdb. I ran the first line of the statement above to drop my copy of the table, and the script then worked just fine for him. Something to keep in mind for the future.
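One simple way around it is to leave the constraint unnamed on temp tables, letting SQL Server generate a unique name so that two sessions can create the table at the same time:

```sql
-- No explicit constraint name: SQL Server generates a unique one per table,
-- so concurrent sessions can both create this temp table without colliding.
CREATE TABLE #NewIDTest (RowNum int, ID uniqueidentifier DEFAULT NEWSEQUENTIALID())
```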


The Nightmare of Corruption

What do you do when you come across an error like this?



You look around and check if anybody else looks like they want their day ruined. I’m betting not, so it’s your problem. The first thing is to check that you have DBCC CHECKDB running regularly and set up to email out when errors occur so you can jump on them right away. You do have that running, right? If so, excellent: check which was your last good backup, restore it, re-apply the data added since that backup, and you’re perfect.

If not, this is going to hurt.

First you need to see what you’re working with. Run DBCC CHECKDB on the corrupted database to check the extent of your issue. Be careful: this is a very IO-intensive operation and will have a serious performance impact. If the corruption is only in non-clustered indexes then you can just rebuild those indexes, run CHECKDB again, and make sure your corruption issues are fixed.

In the example here, we’ve got clustered indexes that are corrupt. This is a little more difficult to fix.

Table error: page (1:24441410) allocated to object ID 2025058250, index ID 1
CHECKDB found 2301 allocation errors and 34 consistency errors in database 'CorruptedDatabaseName'.

Now, all is not yet lost; there are a couple of steps to take. You can try running DBCC CHECKDB with REPAIR_REBUILD (check out Books Online for the details). This is going to take longer than a normal CHECKDB but is definitely worth it. It will show how many errors were fixed; if this does it, then great, close this webpage and go back to your real job.
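For reference, the repair options need the database in single-user mode, so the run looks something like this (database name as in the example output):

```sql
-- Repairs require single-user mode.
ALTER DATABASE CorruptedDatabaseName SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- REPAIR_REBUILD only performs repairs with no risk of data loss.
DBCC CHECKDB ('CorruptedDatabaseName', REPAIR_REBUILD);

ALTER DATABASE CorruptedDatabaseName SET MULTI_USER;
```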

In this example, it didn’t work. Now, the next option is going to result in data loss. You want to make sure you have a full backup before you run this, and you want to run it inside a transaction before committing so that you can roll back if you need to. DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS is going to try to do its job and fix the corruption in your database. Again, this is going to hammer your IO, but at this point you’re probably just focused on fixing the corruption.
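Putting that together, a cautious run might look like this (note the explicit transaction so the repairs can be inspected before you commit):

```sql
ALTER DATABASE CorruptedDatabaseName SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

BEGIN TRANSACTION;

-- May deallocate corrupt pages entirely - hence the data loss warning.
DBCC CHECKDB ('CorruptedDatabaseName', REPAIR_ALLOW_DATA_LOSS);

-- Inspect the output, then run ONE of the following before going multi-user:
-- COMMIT TRANSACTION;   -- keep the repairs
-- ROLLBACK TRANSACTION; -- undo them

ALTER DATABASE CorruptedDatabaseName SET MULTI_USER;
```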

CHECKDB found 2301 allocation errors and 1882 consistency errors in database 'CorruptedDatabaseName'.
CHECKDB fixed 2301 allocation errors and 0 consistency errors in database 'CorruptedDatabaseName'.

As you can see, the consistency errors weren’t fixed by this. That means we’re going to have to take really drastic action. Fortunately, in this scenario the database only has 10 tables. Ultimately the way this was fixed was by creating an empty copy of the corrupt database, transferring the data over to the new database and deleting the original. Yes, data loss was a sad fact of this, but there wasn’t another way.

Conclusion: corruption fixed

FizzBuzz – My Approach

I’m sure you’re aware of the classic FizzBuzz interview question. The interesting part has always been finding a simple way of populating the 100 rows of data; I’ve seen approaches with temp tables, cursors and loops. I don’t like these types of approach as they cause unnecessary overhead.

Here’s my take on the problem;

SELECT CASE WHEN a % 5 = 0 AND a % 3 = 0 THEN 'FizzBuzz' 
         WHEN a % 3 = 0 THEN 'Fizz' 
         WHEN a % 5 = 0 THEN 'Buzz' 
         ELSE CONVERT(varchar,a) END FizzBuzz
FROM (SELECT TOP (100) ROW_NUMBER() OVER (ORDER BY object_id) a
      FROM sys.all_objects) a

Connect Item – Last Executed Time in SSMS

Something that has been bugging me for a while is that you can’t easily tell when the query in an SSMS window was last run. I’ve opened a Connect item to see if we can get this addressed. I’d really appreciate it if you could take the time to vote for it if you agree with the suggestion.


Thanks in advance

Stack Dumps and Corruption

Something I hadn’t had to deal with in a while appeared yesterday on a client site. It’s one of those things that we as DBAs hope not to see: the dreaded stack dump. I’ll talk you through how I dealt with it.

Problem: Customer reported their hard drive was filling up on their SQL Server extremely fast and they were quickly running out of space.

Triage: A quick check later and there were 50GB of stack dump files in their SQL Server Log folder; now that’s something we need to look at. If your folder looks something like this then you’ve probably got a problem;

Yep, definitely going to need to check that...

Checked the SQL Server error log and it was enormous with all of these stack dump errors. OK, let’s do this.

Treatment: The first place to look was those stack dump files to see what was going on. When a stack dump occurs, SQL Server creates three separate files: a .txt, a .log and a .mdmp file. If you take a look in the .txt file you’ll see a whole lot of data that’s only really useful to Microsoft support, but there’s some useful information in there if you take the time to dig through it. Take a look at the Input Buffer output; this contains the first 255 bytes of the query that was being dumped. Recognise it? In my scenario I had dealt with this query before so I could quickly find it, but you may have to search through your code repository to find the query that’s the victim of the dump.

Exhibit A – The Culprit

Now, this is a query that runs regularly to keep some summary tables updated. Fortunately I was able to switch it off while investigating the issue. If you can’t switch it off, you’re going to have to clear out some of those dump files if you start running out of space while you investigate.

Now the bleeding is stopped, it’s time to work out what’s causing the dump. Take a look at the major and minor exception codes; a quick Google shows that problem code 5242 is likely a corruption issue. An out-of-hours run of DBCC CHECKDB (it’s high impact, so you don’t want to do this during production hours) showed that one particular index had corruption. Damn. Let’s try to fix that now, shall we? CHECKDB gives us the object_id and index_id of the culprit, so let’s use this information to find out which particular index is the issue;

USE [YourDatabaseName]

SELECT
     o.name Table_Name
    ,i.name Index_Name
FROM sys.objects o
JOIN sys.indexes i
     ON o.object_id = i.object_id
WHERE o.object_id = 'YourObjectID'
     AND i.index_id = 'YourIndexID'

It’s a Non-Clustered Index: If this is a non-clustered index then you’re golden. Navigate to the index in Management Studio and generate the drop and create script;
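The generated script boils down to something like this (the index, table and column names here are placeholders):

```sql
-- Drop the corrupt non-clustered index...
DROP INDEX IX_YourIndex ON dbo.YourTable;

-- ...and rebuild it from scratch using the intact underlying table data.
CREATE NONCLUSTERED INDEX IX_YourIndex ON dbo.YourTable (YourColumn);
```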

Bear in mind that this will affect the performance of your live system and may cause, or be the victim of, blocking. Once you’ve got a maintenance window and your users are aware, go ahead and run the script; it’s going to drop the index and create a fresh one using the corrected data. Once this has run, run DBCC CHECKDB again to ensure that the corruption issue is fixed. If it is, great: you can clear down those stack dumps to free up hard drive space, tidy up after yourself, and go get a cuppa and a biscuit and pat yourself on the back for a job well done.

It’s a Clustered Index: This is a whole different beast. A clustered index is the table; it’s not just an index on the table, it’s how the table is stored on disk. Ultimately you’re going to have to restore your database from your last good backup (please restore to a different location; don’t instantly overwrite your production database) and retrieve your data that way. It’s not ideal, but it’s one of the few ways of recovering from this kind of corruption.

Plus, remember, you always have the option of raising a ticket with Microsoft Support if you can’t fix it yourself. That’s what all of that other junk in the dump files is for.