SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SQL Server 2012 VLDB Data Warehouse Design Question


SQL Server 2012 VLDB Data Warehouse Design Question

Author
Message
Anthony Stephens
Anthony Stephens
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 91
Appreciate feedback to this question. I'm working on a design for VLDB that will be in the several TB size range. The load activity will be vary infrequently as the data will be for research purposes only. People will not have to make business critical decision with this data nor will there be any OLTP activity against the data. In fact data queries will be mostly in the form of an export which also will not happen daily. Frequency of the export will most likely be along the lines of bi-monthly or quarterly.

That said, I'm having a hard time to justify the use of partitioning and even file-groups seem like a stretch in term of producing any real performance benefits. The Cost/Benefit (C/B) breakdown doesn't appear to offer much of a compiling advantage. Yes, the data repository will be large. But I 'm thinking with the right performing storage, CPU and Memory, the export time-to-completion should be fairly reasonable without the use of these strategies.

Am I missing something in my thought process? Given these requirements should I consider the use of partition or filegroups?
Thomas LeBlanc
Thomas LeBlanc
SSCarpal Tunnel
SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)

Group: General Forum Members
Points: 4631 Visits: 915
Partitioning is mainly for incrementally loading data by say a date column. If the Monthly or quarterly data is new in terms of in the last month time stamped and no update to previously imported data, there is value.

File Groups are used to separate the files in the database on different LUNs (SAN) or hard drives. That would be your only benefit if you have separate drives or LUNs to place the files in file groups.

For performance, you might want to look into Indexing strategies. I have seen huge performance improvements on loading the data in a data warehouse where a fact table was completely rebuilt and we dropped the indexes first, truncated the table, loaded the data and added the indexes.

The table was mostly duplicated from an OLTP system, but the indexes on the OLTP where different than the fact table because the reporting and querying where different in the 2 systems.

Thomas LeBlanc, MCITP DBA 2005, 2008 & MCDBA 2000
http://thesmilingdba.blogspot.com/
PaulB-TheOneAndOnly
PaulB-TheOneAndOnly
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14837 Visits: 4639
Anthony Stephens (12/16/2012)
Appreciate feedback to this question. I'm working on a design for VLDB that will be in the several TB size range. The load activity will be vary infrequently as the data will be for research purposes only. People will not have to make business critical decision with this data nor will there be any OLTP activity against the data. In fact data queries will be mostly in the form of an export which also will not happen daily. Frequency of the export will most likely be along the lines of bi-monthly or quarterly.

That said, I'm having a hard time to justify the use of partitioning and even file-groups seem like a stretch in term of producing any real performance benefits. The Cost/Benefit (C/B) breakdown doesn't appear to offer much of a compiling advantage. Yes, the data repository will be large. But I 'm thinking with the right performing storage, CPU and Memory, the export time-to-completion should be fairly reasonable without the use of these strategies.

Am I missing something in my thought process? Given these requirements should I consider the use of partition or filegroups?


Table partitioning rarely brings any performance advantages to the table - do not even think on table partitioning if the reason is "performance".

Different filegroups, as stated on previous post would help IF, and this is a VERY BIG "IF", if tables are distributed in a way that ensures I/O load balancing otherwise, hot table might end up on the same filegroup then on the same LUN therefore defeating the purpose.

Reading the specifications of the project it looks to me like "performance" is not really a top priority and, if it is, then proper data modeling and physical implementation including a reasonable indexing strategy would do the trick.

_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
MMartin1
MMartin1
SSCertifiable
SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)SSCertifiable (7.5K reputation)

Group: General Forum Members
Points: 7461 Visits: 2033
File groups could help if you do frequent data backups on this VLDB where you can back up each devise simultaneously. I don't read that you need to do this.

----------------------------------------------------
How to post forum questions to get the best help
tmitchelar
tmitchelar
Say Hey Kid
Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)

Group: General Forum Members
Points: 695 Visits: 508
Anthony,

You are correct in your thinking - I don't see any reason to implement any performance strategies to your data warehouse. It would be a waste of your time given the generous amounts of time between actions.
EdVassie
EdVassie
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14701 Visits: 3905
If you are designing a multi-TB database and do not understand all the design issues you need to pay to outside help. Period.

When processing a database of this size, a 2% difference in performance can result in extra hours of processing. Unless you really know how to fine tune your design to get the last 1% or 2% out of it, you almost certainly will end up with something that is either very slow or is using a few $100k more kit than it needs.

IMHO you should build a business case for your employer to engage an external consultant to help you with these issues. There are a number of high-reputation organisations in the market that can do this work, but if you are really stuck for names ask your Microsoft Account Manager for advice.

Original author: SQL Server FineBuild 1-click install and best practice configuration of SQL Server 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005. 14 Mar 2017: now over 40,000 downloads.Disclaimer: All information provided is a personal opinion that may not match reality.Quote: When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist. - Archbishop Hélder Câmara
tmitchelar
tmitchelar
Say Hey Kid
Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)Say Hey Kid (695 reputation)

Group: General Forum Members
Points: 695 Visits: 508
EdVassie (4/2/2013)
If you are designing a multi-TB database and do not understand all the design issues you need to pay to outside help. Period.

When processing a database of this size, a 2% difference in performance can result in extra hours of processing. Unless you really know how to fine tune your design to get the last 1% or 2% out of it, you almost certainly will end up with something that is either very slow or is using a few $100k more kit than it needs.

IMHO you should build a business case for your employer to engage an external consultant to help you with these issues. There are a number of high-reputation organisations in the market that can do this work, but if you are really stuck for names ask your Microsoft Account Manager for advice.


The loads will be, by his word, very infrequent. It sounds like he has the luxury of extra hours of processing. I'm a huge proponent of outside consulting, but for all we know the OP has the skills to successfully implement a coherent performance strategy. Why spend money on consultants when the benefits won't be noticed due to the infrequency of loads and flat file extracts? I don't see the cost outweighing the benefits in this particular case.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search