Track source dates when loading a data warehouse
A primer on how to reduce network and source system load when reading a relational source into the data warehouse.
2012-07-30
5,677 reads
SQL Server 2008 introduced many new functional and performance improvements for data warehousing, and SQL Server 2008 R2 includes all these and more. This paper discusses how to use SQL Server 2008 R2 to get great performance as your data warehouse scales up. We present lessons learned during extensive internal data warehouse testing on a 64-core HP Integrity Superdome during the development of the SQL Server 2008 release, and via production experience with large-scale SQL Server customers. Our testing indicates that many customers can expect their performance to nearly double on the same hardware they are currently using, merely by upgrading to SQL Server 2008 R2 from SQL Server 2005 or earlier, and compressing their fact tables. We cover techniques to improve manageability and performance at high scale, encompassing data loading (extract, transform, load), query processing, partitioning, index maintenance, indexed view (aggregate) management, and backup and restore.
2011-05-19
5,175 reads
Data warehousing and general reporting applications tend to be CPU intensive because they need to read and process a large number of rows. To facilitate quick data processing for queries that touch a large amount of data, Microsoft SQL Server exploits the power of multiple logical processors to provide parallel query processing operations such as parallel scans. Through extensive testing, we have learned that, for most large queries that are executed in a parallel fashion, SQL Server can deliver linear or nearly linear response time speedup as the number of logical processors increases. However, some queries in high parallelism scenarios perform suboptimally. There are also some parallelism issues that can occur in a multi-user parallel query workload. This white paper describes parallel performance problems you might encounter when you run such queries and workloads, and it explains why these issues occur. In addition, it presents how data warehouse developers can detect these issues, and how they can work around them or mitigate them.
2010-12-10
4,645 reads
A data mart provides the primary access to the data stored in the data warehouse or operational data store. It is a subset of data sourced from the data warehouse or operational data store specifically focused on a business function or set of related business functions. Read on to learn the answers to fundamental questions about data marts.
2010-12-03
5,164 reads
The start of a new series from Leo Peysakhovich that looks at some of the issues with moving data around between systems and ensuring that it is in sync between them.
2010-07-26
6,822 reads
One of the most integral components and critical success factors of any enterprise data warehousing initiative is the Solutions Architecture document, a high-level conceptual model of a data warehousing solution. Learn why this collaborative effort that addresses the needs of all major stakeholders, including both the business units and Information Technology (IT), is essential.
2010-07-09
2,224 reads
The staging area tends to be one of the more overlooked components of a data warehouse architecture, and yet it is an integral part of the ETL component design. Learn why it is best to design the staging layer right the first time, enabling support of various ETL processes and related methodology, recoverability and scalability.
2010-07-07
6,197 reads
Denise Rogers discusses the essential tasks in conducting effective software evaluations for data warehousing and business intelligence. Each step depends on the previous one, starting with establishing the framework of the evaluation and adding progressively more elaborate data that supports a resolute decision-making process.
2010-06-11
4,369 reads
Data warehouse loads can be time consuming - this method can be used in some instances to help speed things up.
2010-04-14
16,908 reads
Why neglecting slowly changing dimensions, failing to capture metadata and overlooking scope creep can be the undoing of a dimensional data warehousing initiative.
2010-03-11
4,390 reads