Dimensional Modeling Case Study Part 2 - Days Dimension
Learn how you can model days in a dimension that might need to be aggregated in different ways for your data warehouse operations.
2024-02-07
1,822 reads
This article gives an overview of Amazon Redshift, the cloud data warehouse in AWS.
2023-07-24
25,373 reads
In 2019, Canadian Broadcasting Corporation (CBC) News reported a massive data breach at the Desjardins Group, a Canadian financial service cooperative and the largest federation of credit unions in North America. The report indicated that a "malicious" employee copied sensitive personal information collected by Desjardins from their data warehouse. The data breach compromised the […]
2022-03-30
2,927 reads
Agile data warehousing can be challenging. Pairing the right methodologies and tools can help. Here is how my team met the challenge by using Data Vault methodology and BIML scripting.
2014-09-04
2,582 reads
2014-07-24
1,626 reads
In this article, Arshad Ali goes into detail about how a data warehouse differs from an operational data store and about the different design methodologies for a data warehouse.
2013-07-03
6,374 reads
A primer on how to reduce network and source system load when reading a relational source into the data warehouse.
2012-07-30
5,293 reads
SQL Server 2008 introduced many new functional and performance improvements for data warehousing, and SQL Server 2008 R2 includes all these and more. This paper discusses how to use SQL Server 2008 R2 to get great performance as your data warehouse scales up. We present lessons learned during extensive internal data warehouse testing on a 64-core HP Integrity Superdome during the development of the SQL Server 2008 release, and via production experience with large-scale SQL Server customers. Our testing indicates that many customers can expect their performance to nearly double on the same hardware they are currently using, merely by upgrading to SQL Server 2008 R2 from SQL Server 2005 or earlier, and compressing their fact tables. We cover techniques to improve manageability and performance at high-scale, encompassing data loading (extract, transform, load), query processing, partitioning, index maintenance, indexed view (aggregate) management, and backup and restore.
2011-05-19
5,175 reads
Data warehousing and general reporting applications tend to be CPU intensive because they need to read and process a large number of rows. To facilitate quick data processing for queries that touch a large amount of data, Microsoft SQL Server exploits the power of multiple logical processors to provide parallel query processing operations such as parallel scans. Through extensive testing, we have learned that, for most large queries that are executed in a parallel fashion, SQL Server can deliver linear or nearly linear response time speedup as the number of logical processors increases. However, some queries in high parallelism scenarios perform suboptimally. There are also some parallelism issues that can occur in a multi-user parallel query workload. This white paper describes parallel performance problems you might encounter when you run such queries and workloads, and it explains why these issues occur. In addition, it presents how data warehouse developers can detect these issues, and how they can work around them or mitigate them.
2010-12-10
4,645 reads
A data mart provides the primary access to the data stored in the data warehouse or operational data store. It is a subset of data sourced from the data warehouse or operational data store specifically focused on a business function or set of related business functions. Read on to learn the answers to fundamental questions about data marts.
2010-12-03
5,164 reads
Let’s consider the following script, which runs without error on both SQL Server and PostgreSQL. We define the table t1 and insert three records:
create table t1 (id int primary key, city varchar(50));
insert into t1 values (1, 'Rome'), (2, 'New York'), (3, NULL);
If we execute the following query, how will the records be sorted in each environment?
select city from t1 order by city;
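As a hint for reasoning about this question: in an ascending ORDER BY, SQL Server treats NULL as the lowest value, while PostgreSQL places NULLs last by default. The Python sketch below only emulates those two placements on the same three rows; it does not connect to either database, and the helper name `order_by_city` is hypothetical, introduced here for illustration.

```python
# Rows from the t1 example: (id, city), with None standing in for SQL NULL.
rows = [(1, "Rome"), (2, "New York"), (3, None)]


def order_by_city(rows, nulls_first):
    """Sort ascending by city, placing NULLs first or last (illustrative helper)."""
    def key(row):
        city = row[1]
        is_null = city is None
        # The first tuple element controls where NULLs land;
        # the second orders the non-NULL values alphabetically.
        return (not is_null if nulls_first else is_null, city or "")
    return [city for _, city in sorted(rows, key=key)]


# Emulates SQL Server's ascending order (NULLs first).
sql_server_like = order_by_city(rows, nulls_first=True)
# Emulates PostgreSQL's default ascending order (NULLs last).
postgresql_like = order_by_city(rows, nulls_first=False)

print(sql_server_like)  # [None, 'New York', 'Rome']
print(postgresql_like)  # ['New York', 'Rome', None]
```

The same query therefore returns the same rows on both systems but in a different order, which matters when a script is ported between them and downstream logic depends on row position.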