TechEd - Day 3

, 2008-06-12

I left my laptop at the Red Gate booth, and then managed to wander up to a Data Compression talk from Sunil A. from Microsoft. It was a great presentation, and despite some extensive reading and researching earlier this year, I learned quite a bit about how it works. There are definitely changes from earlier documentation and Sunil had some great slides, so I'll try to write more over time.

A few quick things:

  • Row compression is moving to a variable storage format. This is a low level of compression and it provides some benefits, with low overhead. Nulls and 0s use almost no space, with them being encoded and using less than a byte if there's present.
  • Indexes non-leaf pages are only row compressed. You can set leaf pages to be row or page compressed.
  • Page compression has two parts: prefix and dictionary.
  • Prefix compression actually doesn't use the shortest matching value, but the longest. In the example Sunil gave, if you have three values (Smith, Smith, Smithson), the value stored in the header is (Smithson) and the tokens actually determine how many characters to replace. So the tokens here would be (5, 5, 8) meaning use the first 5 characters (Smith) for the first two values.
  • Prefix compression is only for a column. The values and tokens are not shared across columns.
  • Dictionary compression matches entire values, so only those entire matching values on the page are replaced. Using an example of three more values (Steve, Steve, Steven), only the 2 Steves would be replaced with a dictionary entry.
  • The compression techniques are byte agnostic, so 0x65656565 could be an integer value or character (5 As) and the compression doesn't care. It looks at byte patterns, not values.
  • Compression space savings make sense if the table or index is a relatively large percentage of the database. If you have a small part of the database, or not used that much, the savings might not really pay off.
  • You compress object by object, so you make the decisions for each table, AND EACH index. It's not all indexes that you consider, but each one separately.

More to come later.

I also was on another panel today, talking about the new features of SQL Server. The configuration server, the new central server that can store common registrations for all DBAs also can help deploy PBM. I didn't know that.

The resource governer is cool as well and I saw a few interesting tidbits on that. Like the CPU changes are dynamic, meaning on the fly, and that you might want to keep a small Admin pool setup that gets at least 5% and no more than 20%.

I'm tired. It's been a long week. 





Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.


1,567 reads

Networking - Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...


1,530 reads

Speaking at Community Events - More Thoughts

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.


360 reads