Page Compression, Unique Identifiers and Speed

Question

Post reply

Page Compression, Unique Identifiers and Speed

dbakevlar

SSC-Addicted

Points: 488
More actions
August 6, 2025 at 5:00 am

#4631280

Comments posted to this topic are about the item Page Compression, Unique Identifiers and Speed

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

ako58 SSCarpal Tunnel Points: 4180 More actions · Answer 1

If you have a huge number of inserts and use Unique Identifiers, using NEWID() is better than IDENTITY

Thomas Franz Hall of Fame Points: 3957 More actions · Answer 2

The question is incorrectly formulated.

An INSERT on the end of a table (as with a clustered identity column) would usually just add new pages, while inserting into a table with a clustered unique idenetifier will often cause page splits and may reduce the fill factor to 50 percent in the worst case (every single page was splitted once) filling up slowly after that.

But when you have a *huge* number of paralell inserts the page allocation map can become a hotspot on an identity-index where it has to assigne hundreds or thousand new pages per second for different threads. In this case the uniqueidentifier index may become faster, since it could simply fill up the splitted pages and has only to assign new pages when the next page split occurs (and even in this case the load may be splitted across multiple pages of the page allocation map)

Sadly the second part of the question was about page compression and this can't be answered without knowing the data and data distribution.

When all old data where e.g. in City_1 and all new inserted in City_2, the IDENTITY index would compress better, since all City_1 are on the old and City_2 on the new pages. On the other hand when the city is randomly distributed, it wouldn't matter or could be random and even smaller (with a bit of luck).

Theoretical you could calculate in the datatype of the PK column, an INT (which is usually used for an IDENTITY ) would use 4 bytes, while an UNID needs 16 bytes that are unique and can't be compressed, so every row is at least 12 bytes longer / larger than with an index on a INT column.

God is real, unless declared integer.