SQL Data Warehouse reference architectures

With so many products to choose from when building a big data solution in the cloud, such as SQL Data Warehouse (SQL DW), Azure Analysis Services (AAS), SQL Database (SQL DB), and Azure Data Lake (ADL), there are many ways to combine them, each with its own pros, cons, and cost. Since many customers are looking at SQL DW, I wanted to describe the reference architectures that I have seen, ordered from highest cost to lowest cost:

1. Do staging, data refinement, and reporting all from SQL DW. You can scale compute power up when needed (e.g. during staging, data refinement, or when a large number of users are reporting) or down to save costs (e.g. nights and weekends when user reporting is low). The pros of this option are that, by reducing the number of technologies, you build a simpler solution and keep fewer copies of the data. The cons are that, since everything is done on SQL DW, you can have performance issues (e.g. doing data refinement while users are reporting), can hit the SQL DW concurrent query limit, and can have a higher cost since SQL DW is the highest-cost product, especially if you are not able to pause it. Pausing it reduces your compute cost to zero, so you only pay for storage (see Azure SQL Data Warehouse pricing), but no one can use SQL DW while it is paused.
2. Do staging and data refinement in a Hadoop data lake, and then copy all or part of the data to SQL DW to do reporting. This saves costs in SQL DW by offloading data refinement, and gives you the benefits of using a data lake (see What is a data lake? and Why use a data lake?). You save costs by not having to scale up SQL DW for data refinement (which you would otherwise do to limit the impact on reporting performance and to refine the data faster) and by not having to store as much data in SQL DW. You also save costs by archiving data in the data lake and using PolyBase to access it (be aware these queries could be slow, as PolyBase does not support query pushdown in SQL DW). A con of this architecture is having an extra copy of the data along with the extra ETL needed.
3. Do staging and data refinement in SQL DW, and copy some or all of the data to one or more data marts (in SQL DB or SQL Server in a VM) and/or one or more cubes (in AAS or SSAS in a VM) for reporting. This is considered a “Hub-and-Spoke” model. Scale down SQL DW after data refinement and use it for a limited number of big queries. This overcomes the SQL DW concurrent query limit by having users query the data mart/cube, and saves costs because those are less expensive options to query. You also get the benefits that come with a cube, such as a semantic layer and row-level security, which are not available in SQL DW (see Why use a SSAS cube?). This architecture can also be combined with the previous architecture to add in a data lake. A con of this architecture is having extra copies of the data along with the extra ETL needed.
4. Do staging and data refinement in SQL DW, and copy all of the data to a data mart (SQL DB or SQL Server in a VM) and/or a cube (AAS or SSAS in a VM) for reporting. Pause SQL DW after the staging and data refinement are done. This is used when giving users access to SQL DW would impact ELT, when user queries wouldn’t be as responsive as needed, or when cost is a top priority (you only pay for storage when SQL DW is paused). The cons of this architecture are having extra copies of the data along with the extra ETL needed, and not having SQL DW available for big queries.
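To make the cost ordering above more concrete, here is a rough Python sketch that adds up monthly compute cost for each of the four architectures. Everything numeric in it is an assumption made up for illustration: the hourly rates, the 730-hour month, the 60 hours of data refinement, and the 200 hours of heavy reporting are placeholders, not Azure prices or measured workloads. Storage and data movement costs are left out. The resulting order only reflects these placeholder inputs, so plug in your own figures before drawing conclusions.

```python
from dataclasses import dataclass

# All figures below are hypothetical placeholders, not real Azure pricing.
HOURS_PER_MONTH = 730        # assumed length of a month in hours
RATE_DW_SCALED_UP = 10.0     # assumed hourly rate, SQL DW scaled up
RATE_DW_SCALED_DOWN = 2.0    # assumed hourly rate, SQL DW scaled down
RATE_DW_PAUSED = 0.0         # compute is not billed while paused
RATE_DATA_LAKE = 3.0         # assumed hourly rate, data lake compute
RATE_DATA_MART = 1.0         # assumed hourly rate, data mart or cube

REFINE_HOURS = 60            # assumed staging + refinement hours per month
REPORT_HOURS = 200           # assumed heavy reporting hours per month


@dataclass(frozen=True)
class Phase:
    """A block of hours billed at one hourly compute rate."""
    name: str
    hours: float
    rate_per_hour: float


def monthly_compute_cost(phases):
    """Sum hours * rate over all phases (storage is ignored)."""
    return sum(p.hours * p.rate_per_hour for p in phases)


rest_after_refine = HOURS_PER_MONTH - REFINE_HOURS
rest_after_busy = HOURS_PER_MONTH - REFINE_HOURS - REPORT_HOURS

architectures = {
    # 1. Everything in SQL DW, scaled up when busy, never paused.
    "all_in_sql_dw": [
        Phase("dw refine + reporting", REFINE_HOURS + REPORT_HOURS, RATE_DW_SCALED_UP),
        Phase("dw quiet hours", rest_after_busy, RATE_DW_SCALED_DOWN),
    ],
    # 2. Refinement in the data lake, reporting from SQL DW.
    "data_lake_then_sql_dw": [
        Phase("lake refine", REFINE_HOURS, RATE_DATA_LAKE),
        Phase("dw reporting", REPORT_HOURS, RATE_DW_SCALED_UP),
        Phase("dw quiet hours", HOURS_PER_MONTH - REPORT_HOURS, RATE_DW_SCALED_DOWN),
    ],
    # 3. Hub-and-spoke: SQL DW scaled down after refinement, marts serve users.
    "hub_and_spoke": [
        Phase("dw refine", REFINE_HOURS, RATE_DW_SCALED_UP),
        Phase("dw scaled down", rest_after_refine, RATE_DW_SCALED_DOWN),
        Phase("data mart", HOURS_PER_MONTH, RATE_DATA_MART),
    ],
    # 4. SQL DW paused after refinement, marts serve all reporting.
    "pause_after_refine": [
        Phase("dw refine", REFINE_HOURS, RATE_DW_SCALED_UP),
        Phase("dw paused", rest_after_refine, RATE_DW_PAUSED),
        Phase("data mart", HOURS_PER_MONTH, RATE_DATA_MART),
    ],
}

costs = {name: monthly_compute_cost(phases) for name, phases in architectures.items()}

if __name__ == "__main__":
    for name, cost in costs.items():
        print(f"{name}: {cost:.2f}")
```

Modeling each architecture as a list of phases keeps the comparison easy to adjust: changing how long SQL DW stays scaled up, or adding a second data mart, is a one-line change to the relevant list.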

More info:

Using SQL Server to Build a Hub-and-Spoke Enterprise Data Warehouse Architecture

Hub-And-Spoke: Building an EDW with SQL Server and Strategies of Implementation

Common ISV application patterns using Azure SQL Data Warehouse

Azure SQL Data Warehouse Workload Patterns and Anti-Patterns

