Stairway to Advanced T-SQL Level 4: Using Views to Simplify Your Query

  • Comments posted to this topic are about the item Stairway to Advanced T-SQL Level 4: Using Views to Simplify Your Query

    Gregory A. Larsen, MVP

  • Thanks Gregory

    Nice and straightforward, I read through quickly just to check there was nothing new to me.

    I write this to applaud your inclusion of the review questions at the end, I have been seeing more of this and I really think it helps understanding and retention.

  • Something interesting would be the new challenge of keeping up with the many views that can now be created against a single table. Such as to display one set of columns to one department, another set to another, one for updating , another one with joins,...etc... Do you have a set naming convention that you use to distinguish the views at the least by what their purpose is? Thanks, and good article.

    ----------------------------------------------------

  • There is a general misconception of the following statement that you make.

    "In this case SQL Server will still have to join all the tables in the view to return the data from a single table"

    In the "Enterprise" edition, the query optimizer is smart enough to know that the other tables are not needed and does NOT include them in the query plan. You can easily verify this by generating the "Actual Execution Plan", where you see that the unused tables ar not included in the plan.

    With this in mind, generating complex views with many table joins can be defined once (efficiently and correctly) and then used specifying only the desired field.

    For example.

    CREATE VIEW [dbo].[vwOrderInfo] AS

    SELECT <Fields from all tables>

    FROM OrderItems itm

    JOIN Products prd On prd.ProductId = itm.ProductID

    JOIN Brands bra ON bra.BrandID = prd.BrandID

    JOIN Orders ord ON ord.OrderID = itm.OrderID

    JOIN Customers cus ON cus.CustomerID = ord.CustomerID

    JOIN SalesReps rep ON rep.SaleRepID = cus.SalesRepID

    IF I need to report on a count of Orders By SalesRep, I can query;

    SELECT SalesRepName, COUNT(DISTINCT OrderID)

    FROM [dbo].[vwOrderInfo]

    GROUP BY SalesRepName

    Which would only use the tables Orders, Customers, SalesReps

    If I need to report the Total By Brand and product, I can query;

    SELECT BrandName, ProductID, ProductName, SUM(ItemUnit)

    FROM [dbo].[vwOrderInfo]

    GROUP BY BrandName, ProductID, ProductName

    Which would only use the tables OrderItems, Products and Brands

    The benefit is that this on ONE VIEW can serve many query situations efficiently and using the best query plan

    while hiding the specific internals of the joins. If the underlying table structures cjange and some of the joins have to be changed, only this ONE view would change.

    This is a very UNKNOWN feature that many people have misunderstood or had the misconception for a long time (myself included).

    I hope this is clear.

    Jose Ostos

  • As a general rule I do not like views and I discourage developers from using them. My position is based on experience back to SQL 6.5. I find that developers adopt Views as a crutch for every problem rather than a tool for specific problems.

    As a system grows, morphs, and ages the number of, and nesting level of views tends to expand beyond any reasonable limit. The 6.5 system I mentioned had views nested more than 6 levels deep and often the lower level views had no where clause.

    I have recently joined a team whose in house application suite also uses many views. Again, they are often nested several levels (I have found 3 levels) and typically have significant, undocumented business logic embedded in the column specs.

    I have a list of "Why do you need that" questions I pose to developers. Select Distinct and Views are near the top of the list.

    BTW, IMHO Code Reuse is not sufficient justification for choosing a tool with so much potential for trouble.

  • Like you, I did not like views for many years until I really knew what was going on under the hood and confirmed the the benefits for which they'r inteded. A lot has changed since SQL 6.5 so the topic deserves a seconf look into how it works now.

    What you're describing has more to do with improper use of views by developers than the views themselves. Nesting views is a BAD PRACTICE and should not be done period.

    Views should be looked at as "templates" that will tipically include all the possible fields and joins that could be related with the inderlying data. If you include everything in the complex view, there is no need for nesting, it's all included to begin with.

    For example, let's assume that I don't want to use views. I could use the QUERY of the example view above as a starting poit, or as a "template". If I were to create a query for any of the scenarios I mentioned earlier, I could copy the code of the full query of the view and then ELIMINATE the fields and the JOINS that weren't necessary and end up with an efficient query that's properly done and no excess baggage.

    Internally, that's EXACTLY what the query optimizer does giving the EXACT SAME result without having to copy and paste and most importantly, if this is done in multiple stored procedures, I don't need to repeat it multiple time. This becomes much more important in "Maintaining" the code later on.

    I don't look at it as code reuse, I look at it as avoiding doing the same thing over and over again.

  • Hi Jay,

    It seems that I did not get my point across. I completely agree that Nested Views are a BAD THING. And I understand your comments about templates, etc. I guess we could debate the semantics of "reusable code" through several rounds at a good watering hole.

    My point is that application developers tend to think in coding terms. Functions, methods, routines, loops, sequences, etc. They do not tend to think in terms of sets and set operations. I have had conversations with developers at several organizations including internal and third party. Most of them see no difference between creating a nested view and inheriting or overloading a method. And they have often and explicitly played the code reuse card to support their coding.

    Triggers, Views, User Defined Functions, SQLCLR, etc. are all powerful tools that can contribute to the solution when used properly. Unfortunately many application coders (VB, C#, etc.) jump on one of these tools an apply it to all problems.

    I did not mention it in my original reply but I have seen (and am dealing with) extensive use of triggers to enforce referential integrity and I have found triggers referencing nested views.

    I think the problem is as much caused by inexperience and the pace of development and change. I made the same mistake when table variables were first introduced. I thought they were THE SOLUTION and immediately began removing classic temporary tables from all of my scripts. It did not take long to find out that it was not such a good idea in all cases. Yet today, after nearly 10 years I still encounter developers who are convinced that Table Variables are a better choice in ALL CASES.

  • Also as a post done here about 4-5 months ago they cover partitioned views, which as Jay posted contradicts the post you have done about querie all tables on the views, also I think you overlook the encryption of views as a feature.

  • enriarg (4/3/2014)


    Also as a post done here about 4-5 months ago they cover partitioned views, which as Jay posted contradicts the post you have done about querie all tables on the views, also I think you overlook the encryption of views as a feature.

    Again, I did not say views are not a useful tool. I am currently implementing partitioned views. I also have created triggers and I use Cursors (a lot). I use Temp Tables, Table Variables, and any number of other specialized tools. I run SQL Profiler (both GUI and Scripted). I don't understand what there is in my comments that gives you the impression I am not aware of the capabilities and uses of views.

    What I am saying, is that application developers (3GL and 4GL programmers VB, C#, C++) tend to jump on views and functions as quick, easy, Code ReUsing, solutions to problems. And then the views tend to proliferate and spawn new views. In that sense they are bad. And the disease is hard to cure once it is in place.

    Here's another example. When I ask a developer why they used "Select Distinct" it is not because I don't think Distinct is a useful tool. It is because, over the years I have seen a number of very significant query logic, database design, data quality errors covered up by Select Distinct.

  • I posted a reply ONLY to clear up a comment that was made in the original article, and mainly because most people actually consider the comment in the article as accurate, when in fact its not.

    I know that there are many places where developers are allowed too much freedom to make database decisions which completely go against "Best Practices". Any object (Table, index, View, stored procedure, trigger etc.) that is implemented in a database should at least be reviewed and authorized by a competent database developer before it's implemented in a database. Otherwise, the database will eventually experience the kinds of problems that Ray is mentioning.

    I believe that the articles and forums should be to talk about the way things should be done right, following "Best Practices" and how the tools and techniques can help to improve when used properly. I think we have all seen plenty of "not so good" code and probably have done some ourselves before we discovered better ways to do things, its all part of the learning process and that's why we're here to begin with.

  • jay 87680 (3/20/2014)


    There is a general misconception of the following statement that you make.

    "In this case SQL Server will still have to join all the tables in the view to return the data from a single table"

    In the "Enterprise" edition, the query optimizer is smart enough to know that the other tables are not needed and does NOT include them in the query plan. You can easily verify this by generating the "Actual Execution Plan", where you see that the unused tables ar not included in the plan.

    With this in mind, generating complex views with many table joins can be defined once (efficiently and correctly) and then used specifying only the desired field.

    For example.

    CREATE VIEW [dbo].[vwOrderInfo] AS

    SELECT <Fields from all tables>

    FROM OrderItems itm

    JOIN Products prd On prd.ProductId = itm.ProductID

    JOIN Brands bra ON bra.BrandID = prd.BrandID

    JOIN Orders ord ON ord.OrderID = itm.OrderID

    JOIN Customers cus ON cus.CustomerID = ord.CustomerID

    JOIN SalesReps rep ON rep.SaleRepID = cus.SalesRepID

    IF I need to report on a count of Orders By SalesRep, I can query;

    SELECT SalesRepName, COUNT(DISTINCT OrderID)

    FROM [dbo].[vwOrderInfo]

    GROUP BY SalesRepName

    Which would only use the tables Orders, Customers, SalesReps

    If I need to report the Total By Brand and product, I can query;

    SELECT BrandName, ProductID, ProductName, SUM(ItemUnit)

    FROM [dbo].[vwOrderInfo]

    GROUP BY BrandName, ProductID, ProductName

    Which would only use the tables OrderItems, Products and Brands

    The benefit is that this on ONE VIEW can serve many query situations efficiently and using the best query plan

    while hiding the specific internals of the joins. If the underlying table structures cjange and some of the joins have to be changed, only this ONE view would change.

    This is a very UNKNOWN feature that many people have misunderstood or had the misconception for a long time (myself included).

    I hope this is clear.

    Jose Ostos

    Actually, no... it's not "clear" at all. I guess I'd need to know which version of SQL Server you're using because I just tested your claim on SQL Server 2000 (with a small mod to the generation code), 2005, and 2008 (non-R2), all of which are Enterprise Editions, and I find that your claim is absolutely untrue in those versions. I don't currently have access to 2012 EE or 2014 EE but, if someone does, please run the following and post the results. I'll also try another thing using an aggregate like you did. Thanks.

    So that others may easily test this, as well, here's code to make 3 nearly identical tables and join them up in a view. While it is true that the view is still updateable one table at a time, it still gets data from all 3 tables during a select regardless of which single table I select data from as show by STATISTICS IO and the Actual Execution Plan (which you'll need to turn on an see for yourself).

    --===== Because we drop tables at the end of this, do this in a nice, safe

    -- place that everyone has.

    USE tempdb;

    GO

    PRINT '

    --=============================================================================

    Create 3 simple tables with 10,000 rows each.

    --============================================================================='

    SELECT TOP 10000

    TableARowID = IDENTITY(INT,1,1)

    ,TableAVarcharCol = CAST(NULL AS VARCHAR(20))

    INTO dbo.TableA

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    GO

    SELECT TOP 10000

    TableBRowID = IDENTITY(INT,1,1)

    ,TableBVarcharCol = CAST(NULL AS VARCHAR(20))

    INTO dbo.TableB

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    GO

    SELECT TOP 10000

    TableCRowID = IDENTITY(INT,1,1)

    ,TableCVarcharCol = CAST(NULL AS VARCHAR(20))

    INTO dbo.TableC

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    GO

    PRINT '

    --=============================================================================

    -- Add a system-named PK to each table

    --============================================================================='

    ALTER TABLE dbo.TableA ADD PRIMARY KEY CLUSTERED (TableARowID);

    ALTER TABLE dbo.TableB ADD PRIMARY KEY CLUSTERED (TableBRowID);

    ALTER TABLE dbo.TableC ADD PRIMARY KEY CLUSTERED (TableCRowID);

    GO

    PRINT '

    --=============================================================================

    -- Create a view that reads all three tables

    --============================================================================='

    GO

    CREATE VIEW dbo.ThreeTables

    AS

    SELECT a.TableARowID, a.TableAVarcharCol

    ,b.TableBRowID, b.TableBVarcharCol

    ,c.TableCRowID, c.TableCVarcharCol

    FROM dbo.TableA a

    JOIN dbo.TableB b ON b.TableBRowID = a.TableARowID

    JOIN dbo.TableC c ON c.TableCRowID = b.TableBRowID

    ;

    GO

    PRINT '

    --=============================================================================

    -- Show that the view can be used to update the tables one at a time.

    --============================================================================='

    UPDATE dbo.ThreeTables

    SET TableAVarcharCol = 'TableA'+CAST(TableARowID AS VARCHAR(10))

    ;

    UPDATE dbo.ThreeTables

    SET TableBVarcharCol = 'TableB'+CAST(TableBRowID AS VARCHAR(10))

    ;

    UPDATE dbo.ThreeTables

    SET TableCVarcharCol = 'TableC'+CAST(TableCRowID AS VARCHAR(10))

    ;

    GO

    PRINT '

    --=============================================================================

    -- Show that the view cannot be used to update more than 1 table

    -- at a time.

    --============================================================================='

    GO

    UPDATE dbo.ThreeTables

    SET TableAVarcharCol = 'TableA'+CAST(TableARowID AS VARCHAR(10))

    ,TableBVarcharCol = 'TableB'+CAST(TableBRowID AS VARCHAR(10))

    ;

    GO

    PRINT '

    --=============================================================================

    -- Show that a non-partitioned view reading from just one table

    -- is a myth.

    --============================================================================='

    SET STATISTICS IO ON;

    PRINT '========== Select columns only from TableA...'

    SELECT TableARowID, TableAVarcharCol

    FROM dbo.ThreeTables

    ;

    PRINT '========== Select columns only from TableB...'

    SELECT TableBRowID, TableBVarcharCol

    FROM dbo.ThreeTables

    ;

    PRINT '========== Select columns only from TableC...'

    SELECT TableCRowID, TableCVarcharCol

    FROM dbo.ThreeTables

    ;

    SET STATISTICS IO OFF;

    GO

    PRINT '

    --=============================================================================

    -- Housekeeping

    --============================================================================='

    DROP VIEW dbo.ThreeTables

    ;

    GO

    DROP TABLE dbo.TableA, dbo.TableB, dbo.TableC

    ;

    Here's the output I get. Notice the SELECTS get data from all 3 tables

    [font="Courier New"]

    --=============================================================================

    Create 3 simple tables with 10,000 rows each.

    --=============================================================================

    (10000 row(s) affected)

    (10000 row(s) affected)

    (10000 row(s) affected)

    --=============================================================================

    -- Add a system-named PK to each table

    --=============================================================================

    --=============================================================================

    -- Create a view that reads all three tables

    --=============================================================================

    --=============================================================================

    -- Show that the view can be used to update the tables one at a time.

    --=============================================================================

    (10000 row(s) affected)

    (10000 row(s) affected)

    (10000 row(s) affected)

    --=============================================================================

    -- Show that the view cannot be used to update more than 1 table

    -- at a time.

    --=============================================================================

    Msg 4405, Level 16, State 1, Line 1

    View or function 'dbo.ThreeTables' is not updatable because the modification affects multiple base tables.

    --=============================================================================

    -- Show that a non-partitioned view reading from just one table

    -- is a myth.

    --=============================================================================

    ========== Select columns only from TableA...

    (10000 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableA'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableC'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ========== Select columns only from TableB...

    (10000 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableA'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableC'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ========== Select columns only from TableC...

    (10000 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableC'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableA'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    --=============================================================================

    -- Housekeeping

    --=============================================================================

    [/font]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I just added the following code to the example above, the second snippet of which matches your example, Jose.

    PRINT '========== Do an aggregate only from TableA...'

    SELECT TableASomeID, Total = SUM(TableASomeInt)

    FROM dbo.ThreeTables

    GROUP BY TableASomeID

    ;

    PRINT '========== Do an aggregate only from TableA'

    PRINT '========== with an additional GROUP BY for TableB'

    SELECT TableASomeID, TableBSomeID, Total = SUM(TableASomeInt)

    FROM dbo.ThreeTables

    GROUP BY TableASomeID, TableBSomeID

    ;

    Same results... all three tables came into play.

    [font="Courier New"]

    ========== Do an aggregate only from TableA...

    (50 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableA'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableC'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ========== Do an aggregate only from TableA

    ========== with an additional GROUP BY for TableB

    (2457 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableA'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableC'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    [/font]

    Stop and think about this... if what you claim where possible, we'd have to report it as a bug because tables are sometimes joined in views as a check for existence only where there's no return of that table's data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There is one scenario where what you claim is true but, from what I can see, it's only for a "no rows found" condition. Here's the code...

    PRINT '========== Do an aggregate only from TableA'

    PRINT '========== with an additional GROUP BY for TableB'

    PRINT '========== like before with a WHERE clause on TableB'

    PRINT '========== that causes no rows to be returned.'

    SELECT TableASomeID, TableBSomeID, Total = SUM(TableASomeInt)

    FROM dbo.ThreeTables

    WHERE TableBVarcharCol = 'TableB10001'

    GROUP BY TableASomeID, TableBSomeID

    ;

    Here's the output...

    [font="Courier New"]

    ========== Do an aggregate only from TableA

    ========== with an additional GROUP BY for TableB

    ========== like before with a WHERE clause on TableB

    ========== that causes no rows to be returned.

    (0 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    [/font]

    This type of thing is actually both normal and documented because it is documented that the WHERE clause is normally evaluated before the JOINs are.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok, I admit there is more to it than meets the eye and Jeff is right with the samples he provides. The reason that you are seeing that behavior in your sample is because In the Query engines needs ALL 3 tables to be able to FILTER the data correctly. We have to take into consideration that a JOIN is also a FILTER in the sense that it ONLY includes records that have the same key on BOTH tables of the JOIN. If your view is defined using the LEFT JOIN (that doesn't filter) instead of the JOIN (that filters) then the results are:

    ========== Select columns only from TableA...

    (10000 row(s) affected)

    Table 'TableA'. Scan count 1, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ========== Select columns only from TableB...

    (10000 row(s) affected)

    Table 'TableB'. Scan count 1, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableA'. Scan count 1, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ========== Select columns only from TableC...

    (10000 row(s) affected)

    Table 'TableA'. Scan count 1, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableC'. Scan count 1, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TableB'. Scan count 1, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    -----------------------------------------------------------------------------

    This is just to show how the results can vary with a small change.

    ==============================================================

    I'm including a script to create the tables and the data to show what I am talking about in my original post where you do see the behavior that I mentioned previously. In this example, I will be using the JOIN and we do obtain the desired behavior. The key for this to work is the definition of the FOREIGN KEYS and the NOT NULL definition of the tables.

    For example, if a FOREIGN KEY exists (e.g. Products table's field [BrandID] MUST exist in table Brands) and it does NOT allow NULL values (IS NOT NULL), then the Query Analyzer doesn't have to perform the FILTER because it knows that the condition CAN NOT EXISTS by the definition of the rules (Foreign key and IS NOT NULL).

    Under these conditions, if no field from the [Brands] table is selected, there is no need to JOIN to the table because it knows that no product can exists that is either NULL nor has an value that doesn't exist in the [Brands] table. On the other hand, if these definitions don't exist, then the CAN NOT assume this and it MUST perform the JOIN to FILTER OUT those values.

    The results of the 2 queries I originally mentioned originally are:

    -------------------------------

    (2 row(s) affected)

    Table 'SalesReps'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Customers'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Orders'. Scan count 0, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'OrderItems'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    (4 row(s) affected)

    Table 'OrderItems'. Scan count 1, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Brands'. Scan count 0, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Products'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    --------------------------------------------------

    I hope that with this sample you can check the statistics and the query plan to analyze and understand how the "query Analyzer" is working. If you have any doubts, don't hesitate to ask and I'll try to explain why it's behaving that way, or you can make some comments of your discoveries here.

    Long story short, you can't just say that all tables from a view are always going to be included (like the original article said) nor are they always going to be excluded (like I originally said). It DEPENDS on the conditions. What may be said is that if the engine can GUARANTEE the same results by excluding the table, it will exclude it.

    Jose Ostos

    -----------------------------------------------------------

    -- SCRIPT

    -----------------------------------------------------------

    USE [tempdb] -- Or the database where you want to test this

    GO

    ---------------------------------------------------------------

    -- DROP Objects if they already exist

    ---------------------------------------------------------------

    IF OBJECT_ID('vwOrderInfo') IS NOT NULL

    DROP VIEW vwOrderInfo;

    GO

    IF OBJECT_ID('OrderItems') IS NOT NULL

    DROP TABLE OrderItems

    IF OBJECT_ID('Orders') IS NOT NULL

    DROP TABLE Orders

    IF OBJECT_ID('Products') IS NOT NULL

    DROP TABLE Products

    IF OBJECT_ID('Brands') IS NOT NULL

    DROP TABLE Brands

    IF OBJECT_ID('Customers') IS NOT NULL

    DROP TABLE Customers

    IF OBJECT_ID('SalesReps') IS NOT NULL

    DROP TABLE SalesReps

    ---------------------------------------------------------------

    -- Create Objects for the example (Tables)

    ---------------------------------------------------------------

    CREATE TABLE SalesReps (

    SalesRepID SMALLINT NOT NULL

    , SalesRepName VARCHAR(40)

    , CONSTRAINT PK_SalesReps PRIMARY KEY(SalesRepID)

    );

    CREATE TABLE Customers (

    CustomerID SMALLINT NOT NULL

    , CustomerName VARCHAR(40)

    , SalesRepID SMALLINT NOT NULL

    , CONSTRAINT PK_Customers PRIMARY KEY(CustomerID)

    );

    CREATE TABLE Brands (

    BrandID SMALLINT NOT NULL

    , BrandName VARCHAR(40)

    , CONSTRAINT PK_Brand PRIMARY KEY(BrandID)

    );

    CREATE TABLE Products (

    ProductID SMALLINT NOT NULL

    , ProductName VARCHAR(40)

    , BrandID SMALLINT NOT NULL

    , CONSTRAINT PK_Products PRIMARY KEY(ProductID)

    );

    CREATE TABLE Orders (

    OrderID INT NOT NULL

    , OrderDate DATE NOT NULL

    , CustomerID SMALLINT NOT NULL

    , CONSTRAINT PK_Orders PRIMARY KEY(OrderID)

    );

    CREATE TABLE OrderItems (

    OrderItemID INT NOT NULL

    , OrderID INT NOT NULL

    , ProductID SMALLINT NOT NULL

    , ItemUnit INT

    , ItemPrice DECIMAL(13,2)

    , CONSTRAINT PK_OrderItems PRIMARY KEY(OrderItemID)

    );

    ---------------------------------------------------------------

    -- Create Objects for the example (Foreign Keys)

    ---------------------------------------------------------------

    ALTER TABLE Customers WITH CHECK ADD CONSTRAINT FK_Customer_SalesRep_SalesRepID FOREIGN KEY(SalesRepID)

    REFERENCES SalesReps (SalesRepID)

    GO

    ALTER TABLE Products WITH CHECK ADD CONSTRAINT FK_Product_Brand_BrandID FOREIGN KEY(BrandID)

    REFERENCES Brands (BrandID)

    GO

    ALTER TABLE Orders WITH CHECK ADD CONSTRAINT FK_Order_Customer_CustomerID FOREIGN KEY(CustomerID)

    REFERENCES Customers (CustomerID)

    GO

    ALTER TABLE OrderItems WITH CHECK ADD CONSTRAINT FK_OrderItem_Order_OrderID FOREIGN KEY(OrderID)

    REFERENCES Orders (OrderID)

    GO

    ALTER TABLE OrderItems WITH CHECK ADD CONSTRAINT FK_OrderItem_Product_ProductID FOREIGN KEY(ProductID)

    REFERENCES Products (ProductID);

    GO

    ---------------------------------------------------------------

    -- Create Objects for the example (View)

    ---------------------------------------------------------------

    CREATE VIEW vwOrderInfo AS

    SELECT itm.OrderItemID, itm.ItemUnit, itm.ItemPrice

    , ord.OrderID, ord.OrderDate

    , prd.ProductID, prd.ProductName

    , bra.BrandID, bra.BrandName

    , cus.CustomerID, cus.CustomerName

    , rep.SalesRepID, rep.SalesRepName

    FROM OrderItems itm

    JOIN Products prd On prd.ProductId = itm.ProductID

    JOIN Brands bra ON bra.BrandID = prd.BrandID

    JOIN Orders ord ON ord.OrderID = itm.OrderID

    JOIN Customers cus ON cus.CustomerID = ord.CustomerID

    JOIN SalesReps rep ON rep.SalesRepID = cus.SalesRepID;

    GO

    ---------------------------------------------------------------

    -- Insert data into the tables

    ---------------------------------------------------------------

    INSERT INTO SalesReps ( SalesRepID, SalesRepName ) VALUES ( 1, 'SalesRep A' )

    INSERT INTO SalesReps ( SalesRepID, SalesRepName ) VALUES ( 2, 'SalesRep B' )

    INSERT INTO Customers ( CustomerID, CustomerName, SalesRepID) VALUES ( 1 , 'Customer M', 1 )

    INSERT INTO Customers ( CustomerID, CustomerName, SalesRepID) VALUES ( 2 , 'Customer N', 1 )

    INSERT INTO Customers ( CustomerID, CustomerName, SalesRepID) VALUES ( 3 , 'Customer O', 2 )

    INSERT INTO Customers ( CustomerID, CustomerName, SalesRepID) VALUES ( 4 , 'Customer P', 2 )

    INSERT INTO Brands ( BrandID, BrandName ) VALUES ( 1, 'Brand AA' )

    INSERT INTO Brands ( BrandID, BrandName ) VALUES ( 2, 'Brand BB' )

    INSERT INTO Products ( ProductID, ProductName, BrandID) VALUES ( 1 , 'Product E', 1 )

    INSERT INTO Products ( ProductID, ProductName, BrandID) VALUES ( 2 , 'Product F', 2 )

    INSERT INTO Products ( ProductID, ProductName, BrandID) VALUES ( 3 , 'Product G', 1 )

    INSERT INTO Products ( ProductID, ProductName, BrandID) VALUES ( 4 , 'Product H', 2 )

    INSERT INTO Orders ( OrderID, OrderDate, CustomerID ) VALUES ( 101, '2014-01-05', 1 )

    INSERT INTO Orders ( OrderID, OrderDate, CustomerID ) VALUES ( 102, '2014-01-05', 2 )

    INSERT INTO Orders ( OrderID, OrderDate, CustomerID ) VALUES ( 103, '2014-01-06', 3 )

    INSERT INTO Orders ( OrderID, OrderDate, CustomerID ) VALUES ( 104, '2014-01-07', 4 )

    INSERT INTO Orders ( OrderID, OrderDate, CustomerID ) VALUES ( 105, '2014-01-08', 1 )

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1001 , 101 , 1 , 5 , 45.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1002 , 101 , 2 , 3 , 89.00)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1003 , 101 , 3 , 1 , 21.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1004 , 101 , 4 , 2 , 37.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1005 , 102 , 1 , 6 , 29.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1006 , 102 , 4 , 8 , 13.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1007 , 103 , 2 , 5 , 83.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1008 , 103 , 3 , 7 , 74.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1009 , 103 , 4 , 4 , 28.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1010 , 104 , 1 , 3 , 43.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1011 , 104 , 2 , 2 , 73.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1012 , 104 , 3 , 8 , 28.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1013 , 105 , 2 , 9 , 96.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1014 , 105 , 3 , 6 , 24.50)

    INSERT INTO OrderItems ( OrderItemID, OrderID, ProductID, ItemUnit, ItemPrice ) VALUES ( 1015 , 105 , 4 , 3 , 72.50)

    ---------------------------------------------------------------

    -- Sample Queries

    ---------------------------------------------------------------

    SELECT SalesRepName, COUNT(DISTINCT OrderID) Orders

    FROM [dbo].[vwOrderInfo]

    GROUP BY SalesRepName

    ---------------------------------------------------------------

    SELECT BrandName, ProductID, ProductName, SUM(ItemUnit) Units

    FROM [dbo].[vwOrderInfo]

    GROUP BY BrandName, ProductID, ProductName

  • The direction of the discussion has strayed a little bit from the original topic:-P

    The behavior you guys are discussing is not specific to Views. it is general behavior of the query engine. The query engine will eliminate tables from any query when Referential Integrity is full declared and the various constraints are trusted. Table Elimination is a feature of most DB engines (you can find web references to Oracle and others).

    I am not saying the discussion is not interesting, I just want to point out that the behavior is not specifically caused by Views. It is more related to fully implemented DRI.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply