SQL Server Non-Unique Clustered Indexes Are Evil!

Non-Unique Clustered Indexes Are Evil And We Have The Proof.

You won’t believe it when you see it.

Face it – I can’t write click-bait…

I was reading Grant Fritchey’s (Blog|@gfritchey) comments in the SQL Server Central forum thread “Advice on Creating Unique Clustered Index“, and was reminded of something I saw in Kalen Delaney’s Precon at SQLBits 11 (Nottingham).

Here’s how to prove the evil-ness of nonunique clustered indexes.

Preparing for the evil

Quick! Grab a SQL Server instance, and create a new table. Don’t worry, it’s tiny. At least, to start off…

CREATE TABLE ClusteredIndexCrash ( i TINYINT NOT NULL ) ;
CREATE CLUSTERED INDEX ClusteredIndexCrashIdx ON dbo.ClusteredIndexCrash( i ) ;

The theory behind clustered indexes is that they are (usually) unique – after all, they define the logical layout of your table on disk. And if you have multiple records with the same clustering index key, then which order would they be in? If you don’t define the CI as unique, then SQL Server will add (behind the scenes) a so-called “Uniqueifier” (or maybe “uniquifier”) to fix that. Grant’s first post in the thread referenced above gives some information about how to see this Uniqu[e]ifier in the table structure itself.

Add data

INSERT INTO dbo.ClusteredIndexCrash
SELECT TOP 1000000
    syscolumns c1 ,
    syscolumns c2 ,
    syscolumns c3 ;

1 million rows – that’s not going to be enough. Let’s add a few more, in batches.

INSERT INTO dbo.ClusteredIndexCrash
SELECT TOP 1000000
    syscolumns c1 ,
    syscolumns c2 ,
    syscolumns c3 ;
GO 2146

That is going to take a while. Fire up another window so you can run a different query to keep an eye on how far it has got.

    OBJECT_NAME(i.object_id) ,
    p.rows / 1000000 ,
    p.rows ,
    POWER(CONVERT(BIGINT, 2), 31) / 1000000
    sys.partitions p
    sys.indexes i ON i.index_id = p.index_id
                      AND i.object_id = p.object_id
    i.object_id = OBJECT_ID('dbo.ClusteredIndexCrash')
    AND i.index_id IN ( 0, 1 ) ;

Got to love the confusion caused by columns being named with function names…

When that query above has finished, assuming SSMS hasn’t crashed from the effort of displaying “1000000 rows affected” (or whatever) 2k+ times…

Generating The Evil

Try running the insert one more time.

INSERT INTO dbo.ClusteredIndexCrash
SELECT TOP 1000000
    syscolumns c1 ,
    syscolumns c2 ,
    syscolumns c3 ;

And now (finally) we get the error of the beast:

Msg 666, Level 16, State 2, Line 21
The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID <foo>Dropping and re-creating the index may resolve this; otherwise, use another clustering key.

What have we learned?

Apart from a table containing 2 billion tinyints takes up a lot of time and disk space?

Internally, the implementation of the uniquifier appears to be equivalent to an IDENTITY(1,1), just like your average DB designer would use to create an internal ID. However, because the range starts at 1, you lose half the range of numbers available to fit into this, limiting your uniquifier to 2.14-something billion rows. Again, just like an IDENTITY field, if you’ve deleted records from the middle of the table, you don’t automatically get to reuse those identities. Hence the suggestion that dropping and recreating the index may resolve this – by effectively removing the gaps and shoving everything back up again.

Rebuilding the index probably won’t have much effect, is just delaying the inevitable, and you should be rethinking your clustering strategy for this table.

Oh, and don’t do this on a production server. I mean, what were you thinking? Really? Intentionally generating massive tables with big errors in? Tsk.

This entry was posted in SQLServerPedia Syndication and tagged , , . Bookmark the permalink.

2 Responses to SQL Server Non-Unique Clustered Indexes Are Evil!

  1. denisgobo says:

    I decided to mess around with this… A truncate will reseed it as well (same as identity) oing deletes will not reuse the uniqifiers (also same as identity)
    See here http://sqlservercode.blogspot.com/2017/06/having-fun-with-maxed-out-uniqifiers-on.html

  2. Pingback: Make Those Clustered Indexes Unique – Curated SQL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.