Large Object Data can be confusing for many DBAs. How to load it, retrieve it, and optimize queries on your server. New author and MVP Simon Sabin brings us a look at whether or not you should store this data in a row or on separate pages.
I have just received the latest copy of SQLMagazine and read with interest the article on "Varbinary(max) tames the BLOB" it discusses in passing the fact that the new max data types hold the data in the row (same page) if possible and only if the data is too large, move it off the page. This is the opposite to the older LOB data types, TEXT, NTEXT and IMAGE which only hold the data in row if you set an option. (Kalen Delaney goes into more depth in another article)
I have a word of caution around the inclusion of any of the max data types in a
table with other columns (except the PK). The reason for this is that because
the max data types are stored "in row", this means that if you have
to read other data that requires the row being accessed i.e. accessing a column
that is not in an index you will be reading this large column of data as well.
Imagine a person table
create table Person ( PersonId int identity(1,1) Primary Key, Forename varchar(50), Surname varchar(50), DateOfBirth datetime, Avatar varbinary(max)) Create Index IX_Person_DateOfBirth On Person(DateOfBirth)
Let's assume the average size of the Avatar is 4000 bytes (20x20 logo). If you then run the following query
Select Forename, Surname From Person Where DateOfBirth Between '1 Aug 2006' and '1 Sep 2006'
To get the Forename and Surname fields a bookmark lookup that has to go to the main row of the table. Unfortunately this is bloated with the Avatars and so you only get 1 row per page. So if the query returned 80 rows you will probably have had to read at least 80 pages of data.
If the Person table was set to have "large value types out of row" on then each page would contain about 80 rows because only a pointer to the Avatar is stored in the row (it would be > 80 if the forename and surname weren't always 50 characters). So your query would then read 2 or 3 pages (its > 1 due to the index pages being read)
There are other was around this by adding the extra columns onto your indexes by using the new "INCLUDE" clause, but this would have to be done on all indexes on your table.
My preference is to store images and other LOB data on tables with columns
that have a common selection policy, the LOB column (s) will always be selected
when the other columns are selected i.e. if in the above situation the Avatar
was always returned when any of the surname, forename or dateOfbirth
columns where selected then having the Text In Row would be a benefit.
So bottom line is to plan carefully whether to include LOB
data in row, taking note of the other columns on the table.
You can also read about the new .write
method available for the max data types that greatly improve update performance
Simon is a database architect for Totaljobs Group based in the UK. He specialises in performance SQL Server systems and recently on search technologies.
To keep up on Simons thoughts, read his blog
You can also read about Simon in a recent article by Steve Jones