T-SQL 98 – My Technical Challenge – Clustered Index Conversion

This month’s T-SQL Tuesday subject has been selected by Arun Sirpal and is titled Your Technical Challenges Conquered. For my post I’m going to talk about the challenge of changing a Non-Clustered Primary Key into a Clustered one. This sounds relatively simple but gets more complicated as you delve into it.

First of all, what are the reasons for doing this? A table without a Clustered Index is called a HEAP. Now, heap tables do have their place but they’re not appropriate for most workloads in a relational database, they’re great if you need to insert a lot of data into a table quickly but reading from a heap is notoriously slow in most instances. Heap tables are stored on disk wherever SQL Server finds space, this can be spread around your disk. Clustered indexes are stored nicely ordered on disk and all of the data is stored together. You can monitor how many forward fetches you’re having from heap tables to see whether this particular issue is a problem for you.

Let’s assume that it is an issue and we want to add a clustered index to our table. A quick fix to this would be to just add a clustered index additionally to any other indexes we have on this table. This approach is a little messy though and isn’t the nicest way to fix this particular issue. There are exceptions, but generally if you have a Primary Key on a table then this will be your clustering key also (use cases may vary).

Unfortunately there isn’t a simple way to alter a table in order to turn a non clustered PK into a clustered one, this is because it’s such a fundamental change to how the data is stored on disk. In order to make this change you need to drop your existing primary key and then recreate it as a clustered primary key. Sounds simple right? Well, it can be, but it’s not as simple as that.

One thing about Primary Keys is that they can be used by Foreign Keys from other tables. You can’t drop a PK if there are FKs currently using it. Great, we’ll have to script all of these out, drop them, then recreate them after we’ve altered our table. We’re also going to assume that people have had the ability to alter our tables from whatever source control we have so we’ll have to do this dynamically.

One thing we wanted to check for was whether somebody had created a clustered index on the table but it was not on the primary key. In this use case I needed the clustered index to be on the primary key so we store the details of the other clustered index and drop that also.

You can get the sample scripts from my GitHub repo.

Script One – We’re just going to make some sample tables so that we can understand how this process works. Do not create these tables in a database that you care about (hands off production!).

Script Two – This shows the current status of our keys and how the tables relate to each other. Run this once you’ve run script one so you can see how it looks. Keep it around and we’ll run it after script three too. You should see something like this;

Take a look at the Foreign keys based on our main table and the clustered/nonclustered indexes. Check script one for more details on these.

Script Three – This is our script that’s going to actually do the work. Have a read through this script to get an idea of what it does. You’ll notice that I’ve commented out the execution statements. This is so that you’ll see what it’s planning to do in your messages tab in SSMS before it actually does anything. If you run it, you should see something like this;

As you can see, what it wants to do is to drop the two foreign keys first, it’s then going to drop the existing clustered index on our PKTest_Main table. It’s then going to recreate the primary key as a clustered one, once it’s done this it’s going to reapply the foreign keys. In this instance it’s also not going to reapply our old clustered index as it’s not necessary.

If you copy these messages into a new window and run them (or un-comment the exec statements in script three) it’s going to actually make the alterations above. Once this is done, the results from script two should look something like this;

You’ll notice that our primary key is now clustered and the unnecessary previous clustered index is no longer present. Lovely.

In production (and in change control) you’d probably want this as a stored procedure with the pass through parameters as @TableName = ” and (probably) @Execute = 1 (default 0) so that you don’t fire this proc by accident.

Let me know your thoughts on this, if you’d make any changes to any of the scripts.

 

2 thoughts to “T-SQL 98 – My Technical Challenge – Clustered Index Conversion”

  1. Hi! I had this exact same problem but it was really bad- over 100 tables were heaps! Nice scripts btw. The only thing I would change is you should re-build the non-clustered indexes after adding or changing the clustered index. Well done!

    1. Thanks, yeah they took a little while to actually process on the number of tables we had to run this on. I did consider a rebuild but to the best of my knowledge they all get rebuilt as part of the new clustered index being created. Also, we have a index/statistics maintenance job that runs over night to clean up stuff like that which should catch any stragglers.

Leave a Reply