SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
Search:  
 
 

SQL Man of Mystery

Add to Technorati Favorites Add to Google
September 2009 - Posts

SQL Server MVP Deep Dives

By Wesley Brown in SQL Man of Mystery 09-29-2009 11:00 AM | Categories:
Rating: (not yet rated) Rate this |  Discuss | 1,334 Reads | 836 Reads in Last 30 Days |1 comment(s)

I’ve been reading through this book and it really does have something for everyone at any level.

I think it is truly awesome that this many people have stepped up and donated not only their time but their knowledge to help http://www.WarChild.org!

I’ll have a more in-depth review later this week when I’ve read it cover to cover.

From Paul Nielson’s blog:

http://sqlblog.com/blogs/paul_nielsen/archive/2009/09/29/53-mvps-warchild-org-and-sqlservermvpdeepdives-com.aspx

GO GET IT! :)

-wes


Fundamentals of Storage Systems - The Basics of Spinning Disks

By Wesley Brown in SQL Man of Mystery 09-17-2009 10:40 PM | Categories:
Rating: |  Discuss | 3,358 Reads | 1025 Reads in Last 30 Days |15 comment(s)

 

Your servers are only as fast as the slowest part, hard drives.To feed other parts of the system we have to add lots of drives to get the desired IO single server can consume.

The basics of how hard drives work has been fundamentally static since the 70’s only refinements in technique and the core technologies have improved. You have a shaft or “spindle” attached to a motor. Disks or "platters” are attached to the spindle. The motor spins the spindle and the platters. Read/write heads controlled by actuator motors move across the surface with very precise motion and access the information stored on the platters. Generally, there is one read/write head per platter surface that is useable.

Simple.

This configuration has worked so well for the last 45 years that every claim to date that X new technology would unseat it just hasn’t happened. That’s not to say it won’t happen, just that hard drives have been “good enough” for the bulk of our storage needs for a very, very long time.

Since this is the core of our permanent storage in our database world it is important to have a basic understanding of them.

File:SixHardDriveFormFactors.jpg

Description
Six hard disk drives with cases opened showing platters and heads; 8, 5.25, 3.5, 2.5, 1.8, and 1 inch disk diameters are represented.
Date
1 March 2008(2008-03-01)
Author
Paul R. Potts

http://commons.wikimedia.org/wiki/File:SixHardDriveFormFactors.jpg

I love this picture. Smaller and faster yet still the same.

To give you an idea of what you are up against lets compare the growth rate of your hard drive VS. your CPU.

Our 1981 machine has a the veteran Seagate ST-412 and a Intel 8088.
Our new computer has a Seagate Cheetah 15k.6 ST3146356SS and a Core i7 965 from Intel of course.

Time Circa 1981 Today Improvement
Capacity 10MB 1470MB 147x (209715x for 2TB drive)
Seek speed 85ms 3.4ms 20x (6x for 2TB drive)
IO/Sec 11.4 303 26x
Mbit/Sec 5 (0.625 MB/Sec) 1000 (125 MB/Sec) 200x
CPU 4.77Mhz(.33 MIPS) 3200Mhz(18322 MIPS) 5521x

At first glance we can say WOW what an improvement! Right up until you see how far the processors have come.Everyone is familiar with Moore’s law (Often quoted, rarely understood) loosely applied says CPU transistor counts double roughly every 18 to 24 months.Up until recently, hard drive capacity has been growing almost at the same rate doubling in size around every 18 months (Kryder’s Law).

Hard disks haven’t come close to keeping up with that pace, performance wise. Again, the problem isn’t size is speed.

 

The Makeup of A Modern Hard Drive

You cannot change the laws of physics” – Scotty

As I stated in the previous section hard drives have remained relatively unchanged since the IBM Winchester drive. Lets take a closer look at the physical structure.

Head, Sectors and Cylinders

So, we have a spindle one or more platters and one or more read/write head, all of that spinning and jittering about at a pretty good clip. So, just how does the computer know where your data is? The platter is broken up into a map of sorts.

Simplistic view:

image

The platter is broken up into concentric rings and pie slices that allow the drive controller to find the region where the data is.

image

The heads all move in unison and present a view through the platters that make up a cylinder. I won’t go into great detail on how we have advanced sector and track layouts and the advent of Logical Block Addressing there are plenty of articles on the web that get into those nuts and bolts. What I’m after is to show you physically what has to happen to read the data from the disk and why that is the limiting factor.

With the disk spinning at 15,000 RPM the sectors are flying by pretty quickly so the head has to be positioned above the sector and then read or write to it as the platter moves underneath it. The spinning disk, moving the heads and waiting for the data to be read all add up to latency.

Rotational latency is how long it takes the sector we are after to move under the head to be read or written to. Average rotational latency is expressed as half the time it takes for the platter to make one revolution.

For our 15k hard drive that number is 2 milliseconds, 60 seconds divided by 15000 RPM divided by 2.

Seek Time is how quickly the disk head can be positioned over a sector to start reading data.

There are to kinds of seek we are interested in, average random seek time and sequential or track-to-track seek times.

In our top of the line Seagate Cheetah our random read seek time is 3.4ms that is the time it takes to get from any one sector to any other sector, usually half the distance from the inner track to the outer most track. Random write seek time is 3.9ms. It is longer due to the process of actually effecting the sector its at before moving on to the next random sector.

sequential is much much faster. If the head only has to move to the next track it can usually do so in under a millisecond.

All this adds up to an average access time. basically, you take the rotational latency plus the average random seek time and any command processing time overhead I usually throw in an additional millisecond. Our Cheetah has a random access time of 6.4ms. Sometimes it may be much faster sometimes it may be much slower but this is a good number to work with as far as planning our storage needs.

The flip side of operations per second is throughput usually expressed in megabytes a second.

This is a direct correlation to the amount of data that can be squeezed into a sector. As drive densities go up so does the average megabytes per second. There is something you should know, the inner tracks are slower on throughput but higher on IO’s and the outer tracks are higher on through put and lower on IO’s. This is just a function of the diameter of the platter getting larger the farther out you go.

It isn’t unusual to see sequential throughput average of around 110 MB/sec and that is only getting better.

Random throughput is not so rosy a picture. I haven’t seen any drive manufactures advertise these numbers from my own testing it can be as little as 15MB/sec up to 40MB/sec. You should test your system to get more accurate numbers.

What It All Means to Us

This boils down to how many I/O operations a single disk can give us. In SQL Server land random IO is king and generally one of the biggest bottlenecks on our data files.For log files, things are a little better. Since logs are written to sequentially you can effectively double the available I/O’s a drive can provide since you have cut our the random access and are much closer to the sequential or track-to-track access.

To calculate the maximum number of random operations we use 1000ms / (seek time[ms] + latency[ms]+overhead[ms])= input/output operations per second.

or

1000/(3.4+2+1) = 155 IOps

Sequential reads get much better since seek times go down from 3.4 to around 0.2.

1000(.7+2+1) = 270 IOps

Almost twice as much! Now you know why we keep our database log files separate from each other and from the data. The amount of disks needed to get the performance is about half. We do the same thing for writes and they will be a little less.

Hard drives suffer from what is known as the “hockey stick” effect the closer they get to 100% utilization the performance falls off dramatically.

image

Since running a disk at 100% capacity for IO’s introduces the maximum possible latency. The knee of the curve is around 80% we back that off a little more to 75% and that gives us the number of IO’s we have available per hard drive in the storage system in general. This reduces Queuing and keeps latency low, at the cost of maximum number of IO’s.

Now our available read IO’s is down to about 117 IOps for random access and 216 IOps for sequential.

This number will get better as seek times get better and the command overhead gets better. But remember it will never ever be better than the 2.0ms for the rotational latency.

Physics can be a real bummer sometimes.

Along with physical spindle speed there have been large improvements with how the drive handles incoming and outgoing request. Through IO Prioritization and advanced command queuing algorithms (Native Command Queuing on SAS/SATA) access times and latencies are kept predictable and as fast as possible.

See you next time…

Now that you are armed with the fundamentals of how a hard drive works we can move on to disk controllers and RAID.

Stay tuned!


The Fundamentals of Storage Systems - Introduction

By Wesley Brown in SQL Man of Mystery 09-14-2009 10:49 PM | Categories:
Rating: (not yet rated) Rate this |  Discuss | 1,736 Reads | 826 Reads in Last 30 Days |5 comment(s)

At least once a year I give a large talk on disk subsystems, IO and SQL Server. It’s a ground up from the nuts and bolts of how a hard drive works through SAN’s and Solid State Disks.

The reasons I give this presentation so often is it is one of the most requested topics and one of the most misunderstood.

The problem often lies in the fact the DBA may not know that much about different storage systems but they do know that it is very important do their jobs.

With the rise of SAN, iSCSI and other storage solutions DBA’s have less and less control over the disk system that their SQL Server relies on. It’s my goal to give them, or you, the tools they need to effectively present their needs to the storage teams hopefully without a major amount of fuss and arguments. If you know how and why it works they way it works you can make logical requests in the language that your storage folks understand.

To that end, the presentation is meant to lay the foundation that can then be built upon and expand your knowledge off all things I/O.

This article series will be slightly expanded over what my presentation normally covers, since I’m only restricted by your willingness to read what I write.
It will still be a condensed version of storage systems but I’ll put up as many reference links as I can.

This series will cover:

The basics of spinning disks.
Disk Controllers.
RAID.
Storage Area Networks
Network Attached Storage/iSCSI
Solid State Disks.
SQL Server and The File System.
Understanding Mean Time to Failure and Other Failure Metrics.
Tools and Techniques To Monitor SQL Server and I/O.

Some topics may be a single post some may span several I won’t know for sure until I get done writing them. As request come in I may try to post on specific questions, or at a minimum point you in the right direction.

Stay Tuned….

-Wes


Meeting Sept 15th at the Microsoft Technology Center - Auditing Considerations: Native SQL 2008 and Third Party Auditing Technologies.15

By Wesley Brown in SQL Man of Mystery 09-12-2009 1:45 PM | Categories:
Rating: (not yet rated) Rate this |  Discuss | 1,639 Reads | 839 Reads in Last 30 Days |no comments

Hello all!

We will be having our normal meeting at the Microsoft technology center sponsored by:

clip_image001

http://www.cactuss.org/Supporters/tabid/59/Default.aspx

Microsoft http://www.cactuss.org/MeetingInformation/tabid/63/Default.aspx
Stonebridge Plaza, Building One
9606 N. Mopac Expressway, Suite 200
Austin, TX 78759

5:30PM to 6:00PM – Q&A and DBA 101 presentation
6:10PM to 7:00PM – Feature presentation
7:00PM 7:30PM – Q&A follow up and discuss next meeting.

The feature presentation will be given by  Bjorn Hovd, Systems Engineer, Idera over Auditing Considerations: Native SQL 2008 and Third Party Auditing Technologies.  As usual we will have some goodies to give away and food for everyone!

See you there!


What I’ve Read and Recommend to Others – Joe Celko Edition

By Wesley Brown in SQL Man of Mystery 09-12-2009 12:00 AM | Categories: Filed under: , ,
Rating: |  Discuss | 3,728 Reads | 981 Reads in Last 30 Days |13 comment(s)

I’ve known Joe for a number of years and have a lot of respect for his experience and knowledge around relational database design and the SQL language.

Joe is a prolific writer and has been writing about technology since I was in grade school with articles going back into the early 80’s. Not to mention his ten years serving on the ANSI board for SQL standards.

I would put Joe’s books between the purely academic text on relational and set based theory and the more popular books out today that don’t always cover some of the dryer materials that newcomers to SQL may find to hard to digest at the start of their career.

If you have been working with SQL for a while and want to take the next step Joe’s books are generally the way to go.

Joe’s style is humorous at times and completely unflinching at others. When it comes to things that he thinks is the right way to model and develop using the SQL language, and that is against what popular or easy methods call for, he makes his feelings known on the subject.

 

Joe Celko's SQL for Smarties: Advanced SQL Programming Third Edition (The Morgan Kaufmann Series in Data Management Systems)

Now in its third edition this book covers, in detail and depth, what many other texts leave behind. This isn’t a beginners book. This isn’t a text for the dabbler in SQL.

Joe suggest at least a years worth of experience and I would qualify that as a years worth of solid 40 hour a week kind of experience. If you have mastered third normal form and want to take it to the next level this will be one of the text’s I’ll always point you to.

For those who are squeamish about a little math, get over it. I hear people say relational database work isn’t math and they are flat wrong. Just because you may not understand the math doesn’t mean it doesn’t govern every aspect of your relational world.

If you want to take a real peak behind the curtain of relational theory and cover some advanced data modeling this is as good as it gets.

 

Joe Celko's SQL Puzzles and Answers, Second Edition, Second Edition (The Morgan Kaufmann Series in Data Management Systems)

I liked this book because it allows you to see how many different ways there are to solve the same problem using SQL. Some of them are very interesting in the approach. It will show you how to think in other ways through the eyes of others. Any time you can get a look into how others solve issues you only build up your own problem solving skills. Plus, for a SQL geek like me I like taking the Pepsi challenge and see how I stack up.

 

Joe Celko's Trees and Hierarchies in SQL for Smarties, (The Morgan Kaufmann Series in Data Management Systems)

This is one of my favorite books that Joe has ever written. It covers one of the more complicated modeling issues you will come across, and come across regularly. It explains clearly trees and hierarchies, how they are the same and how they are different. Now that SQL Server 2008 has a hierarchy function don’t think it does away with this book! Joe covers lots of different methods and what works best in different situations.

 

Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL (The Morgan Kaufmann Series in Data Management Systems)

Joe takes one of the common issues new, and some intermediate, users of SQL have a hard time overcoming, sets. I would say most of us that have a background in SQL probably came from a traditional programming back ground and that means iterative thinking. Loops, lots and lots of loops. Having come from that kind of background I still consider the day I finally got functional over procedural was the day I really started down the road of being a SQL developer. 

This book can be a heard read if you are locked into procedural mindset, but if you are trying to break out and really get past the same old mistakes we all make this will help you along your path.

 

Joe Celko's SQL Programming Style (The Morgan Kaufmann Series in Data Management Systems)

Here we are again, Joe tackles a subject that gets glossed over in many other texts if it gets touched on at all. One of the keys of this book is helping you develop a consistent and predictable way to get the data in and out of your system. Even though this book is on style part of that is understanding set based architecture.  Even if you understand the math behind normalization that doesn’t mean you understand the data that goes into it. Even if you don’t agree with everything in this book it will help you focus on the data, consistency and improve you as a SQL developer.

 

Joe Celko's Analytics and OLAP in SQL (The Morgan Kaufmann Series in Data Management Systems)

The concept of this particular work is to help transition the online high transaction, high volume database developer over to the analytical side of data aggregation and warehousing. Some folks think because they understand databases in an OLTP environment they can move easily in to OLAP. I’m here to testify that isn’t as easy as it sounds. Joe covers the concepts and some of the newer SQL syntax available in the ANSI-99 standard. Not all of it is available in SQL Server but it is a solid introduction to data warehousing and how to put your general SQL skills to use in the OLAP world.
This book is based on materials in SQL for Smarties. It expands on some of the topics covered there. 

 

I hope you enjoy these books as much as I have, over the years I have relied on Joe and others like him to build my own knowledge base and skills. I also have tried to share that knowledge like Joe has for so many years to others who want it.


Product Review: SQL Pretty Printer

By Wesley Brown in SQL Man of Mystery 09-05-2009 10:29 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 1,879 Reads | 869 Reads in Last 30 Days |3 comment(s)

SQL Pretty Printer for SQL Server Management Studio $39.95 single user $99.95 site license

having coding standards is a must for solid quality code. There are lots of articles on the subject like this one on simple talk.

Having standards is one thing, getting everyone on board is another, that’s where code formatters or beautifiers come in.

I’ve used code beautifiers for may other languages over the years and have written a couple to help enforce coding standards.

One of my pet peeves working with the default tools that ship with SQL Server is the lack of any kind of code formatter.

When you work with more than one developer you will get code that is ether hard to read or formatted to their specific taste.

That by its self isn’t a huge issue but when you are dealing with hundreds or thousands of stored procedures and other code bits written in T-SQL that you now have to dig into in can be a time sink. This is only compounded in an emergency where you are having to quickly look at a piece of code and figure out what exactly is going wrong.

To add insult to injury SQL Server will also help mangle the code for you and good. Heaven forbid you should script out a complicated view and watch about 400 lines of code squeezed into say 10.

There are some options out there to help with this, I had one criterion that had to be met though, it must integrate with SQL Server Management Studio. This one requirement stripped down the number of candidates very quickly.

If you need to format a smallish piece of code there are several online formatters, the authors of SQL Pretty Printer have one at http://www.dpriver.com/pp/sqlformat.htm.

So, with that requirement I found a couple at the time that met the requirements Red Gate SQL Refactor and you guessed it SQL Pretty Printer. SQL Refactor offers a lot more than code formatting and has one feature I really liked that expands wild cards into column listings, but they want $369.00 for it, per user. I also had a problem that after the trial expired I uninstalled and it left grayed out menu items in SSMS that now just annoy me, and I don’t know what to do other than a reinstall of SSMS to fix it.

After looking at SQL Refactor I loaded up SQL Pretty Printer.

It doesn’t offer some of SQL Refactor’s options but it does reformat the code, it also will convert your statements to work with C# and VB.Net. It will also grab sql statements out of those two languages and put them back as standard T-SQL.

After using it for a couple of weeks I really grew to like it, it was quick and easy CTRL-K-CTRL-H and *POOF* code was formatted, well most of the time we will get to that.

It has several customizations that control the formatting its pretty straight forward through the options dialog.

SQLPrettyPrint1

With the preview window its pretty quick to get the formatting the way you like it.

The things I like about SQL Pretty Printer:

Simple install and direct integration into SSMS/VS. This is just a huge time saver for me, there is a stand alone client that has more features but using SSMS all day not having to flip applications, cut and paste to format is a real win.

Fire and forget. Once you have it setup it just blends into the system a keystroke and everything is formatted, or just the text you select.

Inexpensive, Since the site licenses is low enough getting it installed on every developers machine isn’t a barrier giving everyone the ability to use the same code formatting standards.

The issues I have:

Formatting fails to render. Sometimes a block of code I have selected doesn’t format, instead it disappears. Yeah, it sounds bad but the CTRL-Z undo works so I haven’t lost anything. I will be submitting a bug report.

Unparseable code doesn’t format.If it isn’t a valid SQL statement it won’t attempt to format it. Not a huge deal, but I like to format as I go if possible.

Needs more formatting options. It’s hard to please everyone but you can start with me :). I would like to see more options for code style how things rap and split lines.

 

It’s simplicity and affordability make it a win in my book. If someone comes up with a better formatter at a cheaper price I’d love to see it.

 

Wes