Running a production SQL environment on a virtual server

  • We are running ESX Server on Xeon boxes and had some issues virtualizing SQL Server. Several configuration problems, once resolved, fixed a lot of the slowness, but even with everything configured correctly we still had issues with our high-volume databases. It turns out that ESX Server manages access to the SAN interface by locking it exclusively, so only one VM can use it at a time. When the number of writes went way up, we started having problems with SQL Server "keeping up". We are migrating back to physical hardware except for testing and lab environments.

    My experience with virtualization is that you can use your resources more effectively, but you have to spend time managing them. We are constantly adding drive trays and RAM to the clusters, and it takes time to move things around. The snapshot files do not follow the VMs, so we have to move them manually; that might be an admin issue, but I am not the admin. When you have lots of users, the admin is constantly managing access to the resources and security. Every view has a different set of security attributes, so there is more to manage. We took out a whole bunch of P3 machines and replaced them with 64-bit boxes. Even with free products you can't do a lot with 4GB of RAM when you are running Windows and the VMs have things like SQL Server installed on them. You can get P4 machines with chipsets that support more RAM, but most desktops with inexpensive motherboards do not. With ESX Server you can use hardware virtualization, but I don't think VMware Server supports it. The same issue exists with desktop machines: they need a BIOS, chipset, and processor that support it, and most inexpensive motherboards do not.

  • I'm not well versed in VMs, but it seems to me you lose the ability to control the physical location of important database components like log files, and the ability to separate them from the data files. As any good DBA knows 😉 the slowest part of any database is the arm on the hard drive.... I would be very skittish about running a production SQL Server on VMware.... especially a high-transaction DB....

  • We use our SAN to spread out some of the files, so some limitations may also depend on the other network infrastructure you can leverage.

    Lots of different things come into play, and what looks good on paper still needs to pass some real-life load testing. Starting out with development/test environments on VMs before taking the leap would be wise.

    One thing that seems common - your mileage may vary.

    Greg E

  • rgriffin (1/30/2008)


    I'm not well versed in VMs, but it seems to me you lose the ability to control the physical location of important database components like log files, and the ability to separate them from the data files. As any good DBA knows 😉 the slowest part of any database is the arm on the hard drive.... I would be very skittish about running a production SQL Server on VMware.... especially a high-transaction DB....

    No more so than on a SAN you don't control. You can build out LUNs and the like to segment the files onto separate spindles. However, except in the case of lightweight DB access, we don't use SQL Server on VMs in production, even though the architecture allows for it.
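
    If the hypervisor (or SAN) presents those LUNs to the guest as distinct drives, the file separation itself is plain T-SQL, the same as on physical hardware. A minimal sketch, assuming hypothetical drive letters E: (backed by a data LUN) and F: (backed by a log LUN):

```sql
-- E: and F: are placeholders for drives backed by separate LUNs.
CREATE DATABASE SalesDB
ON PRIMARY
    ( NAME = SalesDB_data,
      FILENAME = 'E:\SQLData\SalesDB.mdf',
      SIZE = 500MB, FILEGROWTH = 100MB )
LOG ON
    ( NAME = SalesDB_log,
      FILENAME = 'F:\SQLLogs\SalesDB.ldf',
      SIZE = 100MB, FILEGROWTH = 50MB );
```

    From inside the VM this looks identical to a physical deployment; whether E: and F: actually land on separate spindles is decided entirely by the LUN layout underneath.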

    K. Brian Kelley
    @kbriankelley

  • Issues can be addressed by adding fibre interfaces for the SAN and targeting specific trays. VMware lets you use physical drives, or you can put your drive files on different physical drives to isolate contention. They would look like multiple physical drives to the VM, and SQL Server could put its files on them; other VMs don't need to know about them. All of these approaches take away a level of resource sharing and give the VM more control over its world. Typically it is faster to let the SAN abstract the storage and let SQL Server believe it is one big drive. The problem we had was not with the bandwidth available on the fibre interface; it was with the availability of the interface itself, because ESX Server was locking it. So we saw plenty of bandwidth available but long queue lengths.
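
    For what it's worth, the "plenty of bandwidth but long queues" pattern is also visible from inside the guest. A sketch using SQL Server 2005's file-stats DMV (requires VIEW SERVER STATE permission): high stall times against modest read/write counts point at waiting on the device rather than saturating it.

```sql
-- Per-file I/O stalls since the instance started.
SELECT  DB_NAME(vfs.database_id)        AS database_name,
        mf.physical_name,
        vfs.num_of_reads, vfs.num_of_writes,
        vfs.io_stall_read_ms, vfs.io_stall_write_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;
```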

  • An additional comment on this one:

    Virtualization is not necessarily used just to "slice up" an underutilized machine; it can also be used to break the dependency on physical hardware, allowing for faster recovery when a physical machine or other infrastructure component fails.

    For example, VMotion (a VMware product) is capable of "moving" VMs, or adding additional servers of a specific type to a cluster of machines, on the fly/on demand.

    Physical Server A blows a CPU? No problem: VMotion can move the affected VM to Server B (or C or D) on the fly, with downtime (if any) measured in seconds, not hours.

    Admittedly, this type of virtualization requires a LOT of underlying infrastructure, but the performance hit of running a virtualized rather than a physical environment can be offset by the flexibility that a virtualized environment provides.

    The performance hit of running virtualized has also been dramatically reduced by the introduction of hypervisors and the like that remove a base OS from the equation. In the "old days" (about a year ago) it wasn't unusual to see the "host" OS consume 20-30% of the available resources; with current technologies that start at "bare metal", the performance hit is often <=10%. Throw in the current generation of blade servers/chassis with multiple dedicated HBAs and PCI busses and woooo hooo! you can do some amazing stuff.

    Joe

    P.S. No commercial relationship with VMware (not a shill); I've just seen their stuff do some amazing things. MS Virtual Server can't do it yet, but just wait...

  • I don't have much exposure to VMware.

    But thinking it through logically, I wouldn't prefer to run production on VMware, on the following grounds:

    1. It shares resources like CPU, memory, disk, etc.

    2. So if you are trying to set up SQL Server, you can't save cost on disk.

    3. If CPU and memory are shared with other applications, then we have to compromise on performance.

    4. The OS has to run SQL Server via VMware instead of running it directly on the OS.

    5. If VMware fails, SQL Server fails, even though the OS is healthy.

    Hope this will help.

    ---------------------------------------------------
    "There are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • free_mascot (1/31/2008)


    I don't have much exposure to VMware.

    But thinking it through logically, I wouldn't prefer to run production on VMware, on the following grounds:

    1. It shares resources like CPU, memory, disk, etc.

    2. So if you are trying to set up SQL Server, you can't save cost on disk.

    3. If CPU and memory are shared with other applications, then we have to compromise on performance.

    4. The OS has to run SQL Server via VMware instead of running it directly on the OS.

    5. If VMware fails, SQL Server fails, even though the OS is healthy.

    Hope this will help.

    I think you need more exposure to ESX Server; there is so much you can do to alleviate these issues.

    1. If incorrectly configured, then yes - but why would you configure it that way? You can set which virtual servers use which resources, and when it's done correctly for a production environment, especially a SQL Server, you wouldn't share resources.

    2. ?

    3. The only compromise, when it is configured correctly, is the underlying Linux platform.

    4. SQL Server doesn't know it's being run on an ESX server, and doesn't care.

    5. Believe me, Windows is 100x more likely to crash than Linux. The ESX platform has never crashed on me in two years of using it in a large environment; the management console does crash, though - that runs as a Windows service 🙂

    Running SQL Server on an ESX server instead of straight on hardware adds an overhead of roughly 5-8% CPU and 5-10% RAM. The problem you have is if virtual servers are incorrectly configured and are sharing CPU cores - ESX will need to assign that resource ad hoc, and it is this switching of resources that becomes noticeable with SQL Server.

    I think speaking to the ESX experts on the VMware forums would give you a better idea of the pros/cons (which there definitely are). But I think the answer is yes, running SQL Server on ESX in a production environment is doable, but.......

  • 3. The only compromise, when it is configured correctly, is the underlying Linux platform.

    ...

    5. Believe me, Windows is 100x more likely to crash than Linux. The ESX platform has never crashed on me in two years of using it in a large environment; the management console does crash, though - that runs as a Windows service 🙂

    Running SQL Server on an ESX server instead of straight on hardware adds an overhead of roughly 5-8% CPU and 5-10% RAM. The problem you have is if virtual servers are incorrectly configured and are sharing CPU cores - ESX will need to assign that resource ad hoc, and it is this switching of resources that becomes noticeable with SQL Server.

    I think speaking to the ESX experts on the VMware forums would give you a better idea of the pros/cons (which there definitely are). But I think the answer is yes, running SQL Server on ESX in a production environment is doable, but.......

    Keep in mind that ESX is an extremely pared-down Red Hat Linux installation, which means there are some things that will affect most Red Hat installations but won't affect ESX, because ESX didn't include that component. As far as the crashing possibility is concerned, we've seen one crash, but it looks to have been hardware related; we also saw a similar crash on a Windows platform within about two weeks, on a server from the same hardware batch. We generally don't worry about our ESX servers because they are extremely stable.

    And as far as SQL Server in production goes, we have seen throughput that has made us say, "No, got to go physical." Some of the I/O issues have already been spoken of. We have deployed small instances on VMs, but with those there's usually not an overriding performance concern.

    And as far as recovery, in our case, rebuilding SQL Servers onto the same physical hardware in a recovery situation tends not to be a big deal. If you have the logins scripted, you can restore the application databases, run the script to regenerate the logins, and then use aliases on the requisite client computers and you're good (cliconfg is your friend). We don't get the advantage of VMotion and the HA that's present in ESX 3.0, but in cases where we have to have HA, we tend to stick with clusters.
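
    A sketch of that recovery sequence, with hypothetical names throughout (AppDB, the backup path, and the logical file names are all placeholders for your own):

```sql
-- Restore the application database onto the replacement hardware.
RESTORE DATABASE AppDB
FROM DISK = N'D:\Backups\AppDB.bak'
WITH MOVE 'AppDB_data' TO N'E:\SQLData\AppDB.mdf',
     MOVE 'AppDB_log'  TO N'F:\SQLLogs\AppDB.ldf',
     RECOVERY;

-- Then run the login script you generated beforehand (e.g. one
-- produced with Microsoft's sp_help_revlogin procedure), which
-- recreates logins with their original SIDs so the restored
-- database users map without being orphaned.
```

    The last step, pointing clients at the new box, is the cliconfg alias: create an alias under the old server name that targets the new machine, and the clients' connection strings don't change.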

    K. Brian Kelley
    @kbriankelley

  • If you have to go to such pains to put SQL Server on VMware, what is the point? Why not just buy physical servers, so you don't have to worry about paying for VMware licenses?

    The only reason I see is if you are a SQL hosting provider and need a cheap solution to host multiple SQL installations.

    We have used ESX for years now, with VMotion and the works, and it works great for things like WebLogic instances, where in the past you would have had to buy a new server to run some new app. And it's better than blades, since you get clustering as well. But I don't see the point in putting Exchange and SQL on it: if you have to go through so much configuration and dedicate hardware, you might as well just buy new hardware.

  • SQL Noob (1/31/2008)


    We have used ESX for years now, with VMotion and the works, and it works great for things like WebLogic instances, where in the past you would have had to buy a new server to run some new app. And it's better than blades, since you get clustering as well. But I don't see the point in putting Exchange and SQL on it: if you have to go through so much configuration and dedicate hardware, you might as well just buy new hardware.

    I have seen folks run Exchange on VMs; my opinion on that is the same. But the main reason is that they can do VMotion in case of hardware issues, and with ESX 3.0, HA will automate that. So the recovery time tends to be better than with a cluster failover, and it is easier on the clients. Then again, PolyServe has similar cluster offerings.

    K. Brian Kelley
    @kbriankelley

  • I think the normal concept of virtual machines, i.e. VMware running on a server with multiple CPUs, lots of memory, and maybe a RAID array, is not terribly suited to highly transactional SQL Server instances, as they will be competing for I/O resources. That's not to say it won't work for other scenarios, and I know first-hand that it does - just so long as the resources required by the SQL Server instance don't outweigh those allocated to the VM.

    That being said, in environments based around blade servers and SAN storage, when scaled properly - i.e. the blade running SQL Server has an appropriate CPU, memory, and disk bandwidth allocation - there is nothing to stop it working.

  • SAP is now certified to run on VMware, even for production environments. If it is good enough for them, it should be good enough for most things!

    http://www.vmware.com/company/news/releases/sap_fullsupport.html

    We have quite a few servers (over 100 SQL and non-SQL) running on VMWare and will be moving many more over the next 12 months.

  • K. Brian Kelley (1/31/2008)


    SQL Noob (1/31/2008)


    We have used ESX for years now, with VMotion and the works, and it works great for things like WebLogic instances, where in the past you would have had to buy a new server to run some new app. And it's better than blades, since you get clustering as well. But I don't see the point in putting Exchange and SQL on it: if you have to go through so much configuration and dedicate hardware, you might as well just buy new hardware.

    I have seen folks run Exchange on VMs; my opinion on that is the same. But the main reason is that they can do VMotion in case of hardware issues, and with ESX 3.0, HA will automate that. So the recovery time tends to be better than with a cluster failover, and it is easier on the clients. Then again, PolyServe has similar cluster offerings.

    How large were the implementations?

  • We can actually install multiple low-bandwidth database servers on a single host for less money using VMware. With the version of Enterprise Edition that we buy, we can install as many instances as we want on the physical machine; with SQL Server Standard, we can install as many instances as we have physical processors (not cores).
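
    With several instances stacked on one host, each named instance is addressed as machine\instance. A quick sanity check from any connection (SERVERPROPERTY is standard T-SQL, available since SQL Server 2000):

```sql
SELECT SERVERPROPERTY('MachineName')  AS machine,
       SERVERPROPERTY('InstanceName') AS instance,  -- NULL on the default instance
       SERVERPROPERTY('Edition')      AS edition;
```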
