Guest Editorial: That ain't a Database, it's a Spreadsheet

  • If your production system is going to have a particular size and throughput,

    I work in an environment that fits the scenario that Phil describes.

    I receive about one million rows per day, although a high-volume day may be 1.5M rows and a very light day might be 250,000 rows. Each raw record has around 350 columns. The data is joined with other tables (ID -> name, location, etc.) and summarized for reporting via a .NET interface. All of this runs automatically through scheduled jobs, triggers, and stored procedures (a simplified sketch of the daily rollup follows this post). The system is up 24x7x365.

    We implemented this about 30 months ago. At the 18-month point we began running into problems: the transaction log filling up, running out of disk space, and flaky performance. For example, a stored procedure that normally ran for 2 minutes would occasionally still be running after 45 minutes. You could kill the job and restart it, and it might either finish in 2 minutes or still take 45. The next day it would run in 2 minutes again.

    It seemed that SQL Server became a different animal when tables exceeded 100 million rows. Write a good query and you got your answer in seconds, but write it the wrong way and 48 hours later it was still thrashing (the second sketch following this post illustrates the contrast).

    I have to agree with Clement:

    My point is very simple: keep the retention period as low as possible on a production environment when dealing with transactional databases.

    Yes, we did lots of other tuning, but ultimately a data retention policy was the key to getting stable performance (the third sketch following this post shows the kind of batched purge job involved).

    I was able to work with the business on data retention once they saw the performance hit and reliability issues when the database exceeded a certain size. They looked at the cost of upgrading hardware (primarily increased RAID capacity) vs. how often they actually used older transactional data. I was able to show:

    6 months of data online: $0 for hardware upgrades

    1 year of data online: $X for hardware upgrades

    2 years of data online: $Z for hardware upgrades

    Since they understood the costs in terms of money and performance, there was solid buy-in from the business for the data retention policy. Because of the retention policy, the database tables are fairly stable in size, and queries and jobs run consistently.

    - Randall Newcomb
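
    To make the shape of that workload concrete, here is a minimal sketch of the kind of daily rollup step described above: raw rows joined to a lookup table and aggregated into a reporting table. All object and column names (dbo.RawFeed, dbo.Location, dbo.DailySummary, LoadDate, Amount) are made up for illustration and are not from the actual system.

        -- Hypothetical daily rollup: join yesterday's raw rows to a lookup table
        -- and aggregate them into a summary table used by the reporting front end.
        DECLARE @DayStart DATETIME, @DayEnd DATETIME;
        SET @DayStart = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0);  -- yesterday 00:00
        SET @DayEnd   = DATEADD(DAY, 1, @DayStart);                        -- today 00:00

        INSERT INTO dbo.DailySummary (SummaryDate, LocationName, RowCnt, TotalAmount)
        SELECT
            @DayStart,
            loc.LocationName,
            COUNT(*),
            SUM(rf.Amount)
        FROM dbo.RawFeed AS rf
        JOIN dbo.Location AS loc
            ON loc.LocationID = rf.LocationID
        WHERE rf.LoadDate >= @DayStart
          AND rf.LoadDate <  @DayEnd
        GROUP BY loc.LocationName;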
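
    The "good query vs. wrong way" contrast is mostly about whether the predicate can use an index at all once the table passes 100 million rows. A minimal sketch on the same hypothetical dbo.RawFeed table, assuming a nonclustered index on LoadDate:

        -- Sargable: a direct range on the indexed column lets the optimizer seek
        -- the LoadDate index and read only yesterday's rows.
        SELECT COUNT(*)
        FROM dbo.RawFeed
        WHERE LoadDate >= DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0)
          AND LoadDate <  DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0);

        -- Non-sargable: the same logical question, but wrapping the column in a
        -- function hides it from the index and forces a scan of the whole table.
        SELECT COUNT(*)
        FROM dbo.RawFeed
        WHERE DATEDIFF(DAY, LoadDate, GETDATE()) = 1;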
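
    Finally, the retention policy itself can be enforced with a scheduled purge job. A minimal sketch under an assumed six-month window, deleting in small batches so each transaction stays short and the log can be backed up between batches (again, names are illustrative only):

        -- Hypothetical purge enforcing a 6-month retention window on dbo.RawFeed.
        -- Small batches keep locking and log growth manageable on a 24x7 system.
        DECLARE @Cutoff DATETIME, @Rows INT;
        SET @Cutoff = DATEADD(MONTH, -6, GETDATE());
        SET @Rows = 1;

        WHILE @Rows > 0
        BEGIN
            DELETE TOP (10000)
            FROM dbo.RawFeed
            WHERE LoadDate < @Cutoff;

            SET @Rows = @@ROWCOUNT;  -- 0 once nothing older than the cutoff remains
        END;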

  • Wayne West (12/19/2008)

    I'm envious, closed until the 5th?!

    Not so good, really. The main reason for closing the office is that the global organization announced a restructuring in early August (just prior to the world economy going pear-shaped), which included the closure in 2009 of the office where I work. Phase 1 is now complete, so a large number of staff left just before Christmas, and the remaining people were given a few days' extra holiday.

    Hence the hope that 2009 is better than 2008!

    Derek

  • Derek,

    Here's hoping things go well in 2009, and that you enjoyed the time off!
