Dealing with large tables (Financial transactions)

  • Sure,

    Currently we have a machine where Windows, SQL Server, the system databases and the production database (data file only) all live on the C drive (I know, no need to comment on that), which is a 300GB SSD. The log files are placed on the D drive (2TB), which is a standard spinning disk. SQL Server has 58GB of RAM allocated, with about 2GB left free for the system, and the box has an 8-core CPU. With our current provider we are not able to add another drive or extend any of the existing ones.

  • I would want to look at the 2TB drive, to see

    a) how much space is used by logs

    b) what else is on the drive
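    As a sketch, both questions can be answered from the DMVs; the drive letter D here is just this thread's log drive:

    ```sql
    -- Log space allocated and percent used, per database
    DBCC SQLPERF(LOGSPACE);

    -- Everything SQL Server itself keeps on the D drive, with sizes in MB
    -- (size is in 8KB pages, hence the * 8 / 1024)
    SELECT DB_NAME(database_id) AS database_name,
           name                 AS logical_name,
           physical_name,
           size * 8 / 1024      AS size_mb
    FROM   sys.master_files
    WHERE  physical_name LIKE 'D:%'
    ORDER  BY size_mb DESC;
    ```

    Anything on D that does not show up in sys.master_files is non-SQL content and a candidate for moving off the drive.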

    Then I would want to look at moving the non-log files to an external drive, and swapping the 2TB spinning rust for an SSD. The external drive can either be plugged directly into the server, or be a standard Windows network share if both the DB server and the share host are running W2012 or above.

    This will not solve all your performance problems, but it should be a low-cost mid-life booster until you can get a better server. Putting the logs onto an SSD would improve anything that needs to log, such as inserts and index rebuilds.

    If your database backups are currently on C or D, then try to get them moved to an external drive - you do not want the IO for these taking performance away from your main DBs.

    If you get your logs onto an SSD, then monitor IO rates for both drives. If the IO is not evenly distributed, I would consider moving the tempdb data files to the log drive. Even though this would put tempdb data and log on the same drive, tempdb is recreated at each SQL Server startup, so there are no integrity issues in doing this.

    These ideas are just sticking plaster, and not necessarily what should be done for a new server build. They are aimed at making minimum changes to give the biggest benefit.
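    For reference, relocating the tempdb files is only a metadata change that takes effect at the next service restart. A minimal sketch, assuming the default logical names (tempdev/templog) and a hypothetical target folder on the log drive:

    ```sql
    -- Repoint tempdb's primary data file at the log drive.
    -- Takes effect after the next SQL Server restart, when tempdb is rebuilt.
    ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, FILENAME = 'D:\TempDB\tempdb.mdf');
    ```

    Make sure the target folder exists and the service account can write to it before restarting, or the instance will fail to start.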

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ (1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005).

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • EdVassie (3/17/2016)


    I would want to look at the 2TB drive, to see

    a) how much space is used by logs

    b) what else is on the drive

    Then I would want to look at moving the non-log files to an external drive, and swapping the 2TB spinning rust for an SSD. The external drive can either be plugged directly into the server, or be a standard Windows network share if both the DB server and the share host are running W2012 or above.

    This will not solve all your performance problems, but it should be a low-cost mid-life booster until you can get a better server. Putting the logs onto an SSD would improve anything that needs to log, such as inserts and index rebuilds.

    If your database backups are currently on C or D, then try to get them moved to an external drive - you do not want the IO for these taking performance away from your main DBs.

    If you get your logs onto an SSD, then monitor IO rates for both drives. If the IO is not evenly distributed, I would consider moving the tempdb data files to the log drive. Even though this would put tempdb data and log on the same drive, tempdb is recreated at each SQL Server startup, so there are no integrity issues in doing this.

    These ideas are just sticking plaster, and not necessarily what should be done for a new server build. They are aimed at making minimum changes to give the biggest benefit.

    The log file is fairly small (about 10GB, but on average only 1-3% is in use at a time), as the transaction log is backed up every 5 minutes. Besides the log files, the drive is used for storing backups before copying them to another server. I've already tried to get another drive from our provider, but there is no way they can give us anything better than we currently have. But generally, is it better to have data files on a slower drive and log files on an SSD (I assume it depends)? It might be worth mentioning that there is asynchronous mirroring in place. Additionally, please have a look at the attached top wait stats.
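    For anyone wanting to reproduce figures like the attached ones, a common sketch for pulling the top waits is below; the excluded wait types are a small illustrative subset of the usual benign/idle waits, not an exhaustive list:

    ```sql
    -- Top waits accumulated since the last SQL Server restart
    SELECT TOP (10)
           wait_type,
           wait_time_ms / 1000.0        AS wait_time_s,
           waiting_tasks_count,
           signal_wait_time_ms / 1000.0 AS signal_wait_s
    FROM   sys.dm_os_wait_stats
    WHERE  wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP',
                             'BROKER_TASK_STOP', 'XE_TIMER_EVENT',
                             'REQUEST_FOR_DEADLOCK_SEARCH')
    ORDER  BY wait_time_ms DESC;
    ```

    Note the counters are cumulative since startup, so a snapshot-and-diff over a known interval gives a much clearer picture than raw totals.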

  • DesNorton (3/15/2016)


    Emil B (3/14/2016)


    ... the table has about 90 million rows with 20 GB data space and 26 GB index space ...

    The cleverer peepels here will probably shoot me down on this. BUT, perhaps part of the problem is in the indexing strategy. When the indexes take up more space than the table, it could be that there are overlapping indexes (or lots of includes) that could be merged. With the correct clustered index, most, if not all, includes can be eliminated, thereby reducing the size and number of required indexes.

    I can't shoot you down for that because I agree for the most part. But, depending on how wide the table actually is and the nature of the queries, the Clustered Index may or may not be of much use for performance. Much narrower NCIs may be the rule for the day and some good care should be taken to not have much overlap, as you say.

    I've also found that row lookups aren't always to be avoided. Sometimes a seek on a very narrow index followed by a key lookup blows the doors off everything except an INCLUDEd covering index, but without the obvious duplication of data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

    But generally, is it better to have data files on a slower drive and log files on an SSD (I assume it depends)? It might be worth mentioning that there is asynchronous mirroring in place. Additionally, please have a look at the attached top wait stats.

    1) You have it backwards. It is usually best to have logs on rotating media because tlog activity is mostly sequential, which isn't that bad on rotating disks. HOWEVER, if you are like most of my clients you have your disks so dang carved up and so many files (or file fragments) on them you never, EVER actually get sequential IO.

    2) WRITELOG waits are high. You NEVER want things to wait on completing a transaction, because that can mean that people can't buy your widgets from your website and they will go shop elsewhere. I note these COULD be from batch activities, in which case, as long as you are meeting your processing windows and not getting other bad side effects, no one cares. The mirroring is clearly affected too, to a smaller degree.

    3) I am guessing you have default settings for Cost Threshold for Parallelism and MAXDOP? Likely WAY too much parallelism going on noting the CXPACKET waits.

    4) I highly recommend you consider getting a professional in for a performance review.
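    Checking and changing those two settings is a quick sp_configure exercise; the values below are illustrations only, not recommendations for this workload:

    ```sql
    -- Both settings sit behind 'show advanced options'
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;

    -- Raise the cost threshold above the default of 5 (example value)
    EXEC sp_configure 'cost threshold for parallelism', 25;

    -- Cap parallelism (example value; often set per NUMA topology)
    EXEC sp_configure 'max degree of parallelism', 4;
    RECONFIGURE;
    ```

    Both take effect immediately after RECONFIGURE, with no restart required, which makes them safe to iterate on.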

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (3/17/2016)


    But generally, is it better to have data files on a slower drive and log files on an SSD (I assume it depends)? It might be worth mentioning that there is asynchronous mirroring in place. Additionally, please have a look at the attached top wait stats.

    1) You have it backwards. It is usually best to have logs on rotating media because tlog activity is mostly sequential, which isn't that bad on rotating disks. HOWEVER, if you are like most of my clients you have your disks so dang carved up and so many files (or file fragments) on them you never, EVER actually get sequential IO.

    2) WRITELOG waits are high. You NEVER want things to wait on completing a transaction, because that can mean that people can't buy your widgets from your website and they will go shop elsewhere. I note these COULD be from batch activities, in which case, as long as you are meeting your processing windows and not getting other bad side effects, no one cares. The mirroring is clearly affected too, to a smaller degree.

    3) I am guessing you have default settings for Cost Threshold for Parallelism and MAXDOP? Likely WAY too much parallelism going on noting the CXPACKET waits.

    4) I highly recommend you consider getting a professional in for a performance review.

    I'm aware of all those things:

    1) This wasn't a statement but rather a question, based on what EdVassie said about moving logs to SSD.

    2) We have one rotating drive that is used for a few things besides logging transactions. It is not that I want it that way, but I have to; because of our hardware limitations there won't be sequential IO.

    3) I've set MAXDOP to what is recommended by Microsoft HERE, and I've increased our cost threshold for parallelism to 25 atm.

    A) I actually use the number of physical cores per NUMA node for my MAXDOP number.

    B) Someone needs to convince your company that their physical resources/configuration (virtualized here?) is unacceptable for optimal performance. I will assume RAM is maxed out for your SQL Server version? Is it any reasonable fraction of your active data size?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (3/17/2016)


    A) I actually use the number of physical cores per NUMA node for my MAXDOP number.

    B) Someone needs to convince your company that their physical resources/configuration (virtualized here?) is unacceptable for optimal performance. I will assume RAM is maxed out for your SQL Server version? Is it any reasonable fraction of your active data size?

    They are convinced already; we had some issues, so they believe me, and the purchase of a new server and change of provider is in progress. Our production DB has almost 230GB of data (and growing quickly), and I've set aside about 58GB of RAM for SQL Server only. I need to deal with what we have until we move, which means sorting out the Financial Transactions table to make sure we won't run out of space before the new server arrives.
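    As a sketch for keeping an eye on the remaining headroom until the new hardware arrives (run in the production database; sizes come back in 8KB pages, hence the conversion):

    ```sql
    -- Allocated vs used space per file in the current database
    SELECT name                                                AS logical_name,
           size * 8 / 1024                                     AS allocated_mb,
           FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024          AS used_mb,
           (size - FILEPROPERTY(name, 'SpaceUsed')) * 8 / 1024 AS free_mb
    FROM   sys.database_files;
    ```

    Trending free_mb over time gives an early warning well before the data file actually needs to grow on the constrained drive.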

  • That's really good news for you!! Hope you can keep the train on the tracks and on schedule until the spiffy new hardware is online!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (3/17/2016)


    But generally, is it better to have data files on a slower drive and log files on an SSD (I assume it depends)? It might be worth mentioning that there is asynchronous mirroring in place. Additionally, please have a look at the attached top wait stats.

    1) You have it backwards. It is usually best to have logs on rotating media because tlog activity is mostly sequential, which isn't that bad on rotating disks. HOWEVER, if you are like most of my clients you have your disks so dang carved up and so many files (or file fragments) on them you never, EVER actually get sequential IO.

    2) WRITELOG waits are high. You NEVER want things to wait on completing a transaction, because that can mean that people can't buy your widgets from your website and they will go shop elsewhere. I note these COULD be from batch activities, in which case, as long as you are meeting your processing windows and not getting other bad side effects, no one cares. The mirroring is clearly affected too, to a smaller degree.

    3) I am guessing you have default settings for Cost Threshold for Parallelism and MAXDOP? Likely WAY too much parallelism going on noting the CXPACKET waits.

    4) I highly recommend you consider getting a professional in for a performance review.

    Again, you're out in the field more and so I'll bow to the higher power here, but my inclination is to disagree, especially if WRITELOG waits are high. If it were me, I'd put the transaction logs on SSD at least until I could get the performance of the queries up to snuff. Then I'd start to work on the INSERTs and UPDATEs. Might be able to move the transaction logs off the SSDs after that, but why not enjoy the extra performance?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 15K HDDs doing sequential large-block writes can often push 125MB/sec. A few disks of that and you have some very substantial write capabilities for tlog flushes. Note the HUGE caveat there of SEQUENTIAL (which I referenced in my initial statements too).

    If SSDs are not bottlenecked in any way between RAM and chips-on-disks then heck, just drop everything on them if you have the space. 🙂 But where I am space-constrained I will look to do the opposite of what the poster suggested, unless my testing shows that the HDDs can't do squat for sequential 64K writes.

    Don't forget that the tlog stuff isn't buffer-pool harming either. What happens when you have to hit those rotating disks to serve up actual data a lot because you have way more active size than RAM?

    Something tickles the back of my mind that Paul Randal even posted something in the past about not defaulting to SSDs for tempdb and tlogs.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Jeff Moden (3/14/2016)


    Partitioned views are available in the Standard Edition and I actually prefer them over Partitioned Tables for many reasons.

    This is a good option for the non-Enterprise Edition people. Works great, but relies on primary keys.
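    For anyone unfamiliar with them, a minimal partitioned-view sketch follows (table and column names are hypothetical, chosen to echo this thread's Financial Transactions table). The CHECK constraints on the partitioning column are what let the optimizer skip member tables that cannot match, and including that column in each primary key is what makes the view updatable:

    ```sql
    -- One member table per year; the CHECK constraint is mandatory
    CREATE TABLE dbo.FinTrans2015 (
        TranID   int   NOT NULL,
        TranYear int   NOT NULL CHECK (TranYear = 2015),
        Amount   money NOT NULL,
        CONSTRAINT PK_FinTrans2015 PRIMARY KEY (TranID, TranYear)
    );
    CREATE TABLE dbo.FinTrans2016 (
        TranID   int   NOT NULL,
        TranYear int   NOT NULL CHECK (TranYear = 2016),
        Amount   money NOT NULL,
        CONSTRAINT PK_FinTrans2016 PRIMARY KEY (TranID, TranYear)
    );
    GO
    -- The partitioned view unions the members; queries filtered on
    -- TranYear touch only the relevant table
    CREATE VIEW dbo.FinTrans
    AS
    SELECT TranID, TranYear, Amount FROM dbo.FinTrans2015
    UNION ALL
    SELECT TranID, TranYear, Amount FROM dbo.FinTrans2016;
    ```

    A query such as `SELECT * FROM dbo.FinTrans WHERE TranYear = 2016` then reads only FinTrans2016, and old years can be archived by simply dropping a member table out of the view.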
