Self-referencing join condition (tblx.col1 = tblx.col1) alters perf. - WHY?

  • Can someone explain WHY adding the self-referencing join condition on acct.BillingType decreases the estimated number of rows by a factor of 10?

    Notes:

    -- #HindsightFacilityList table has one row with four integers in it and no indexes.

    -- factARSnapshot (30 million rows/4 GB Data size) DDL Attached

    -- dimAccount (5 million rows/1.5 GB) DDL Attached

    Both queries are correctly using the ix_dimAccount index?!

    ---- Qry3_With_Self_Join
    SELECT
         atb.AccountID
        ,atb.FacilityID
        ,atb.DischargeAgingID
        ,atb.FinancialClassID
        ,atb.InsuranceProviderID
        ,LEFT(acct.BillingType,1)
        ,atb.AccountBalance
    FROM #HindsightFacilityList FacList
    INNER JOIN Analysis.factARSnapshot AS atb
        ON FacList.AccountFacilityID = atb.FacilityID
        AND atb.PeriodID = @pm_PeriodID
    INNER JOIN Analysis.dimAccount AS acct
        ON acct.AccountID = atb.AccountID
        AND acct.BillingType = acct.BillingType

    ---- Qry4_WithOUT_Self_Join
    SELECT
         atb.AccountID
        ,atb.FacilityID
        ,atb.DischargeAgingID
        ,atb.FinancialClassID
        ,atb.InsuranceProviderID
        ,LEFT(acct.BillingType,1)
        ,atb.AccountBalance
    FROM #HindsightFacilityList FacList
    INNER JOIN Analysis.factARSnapshot AS atb
        ON FacList.AccountFacilityID = atb.FacilityID
        AND atb.PeriodID = @pm_PeriodID
    INNER JOIN Analysis.dimAccount AS acct
        ON acct.AccountID = atb.AccountID
        --AND acct.BillingType = acct.BillingType

    ** Execution plan attached

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Just a shot in the dark, but - how many have a NULL billingType?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (#4) (9/11/2012)


    Just a shot in the dark, but - how many have a NULL billingType?

    Zero; it's a NOT NULL column, and the results are identical. (Edit: by results I mean the results of the two queries.)
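
    For anyone wanting to check the same thing, a minimal sketch follows. It assumes only the table and column names from the queries above (Analysis.dimAccount, BillingType); the output alias NullBillingTypes is made up for illustration.

    ```sql
    -- Is the column declared NOT NULL? (is_nullable = 0 means NOT NULL)
    SELECT c.name, c.is_nullable
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'Analysis.dimAccount')
      AND c.name = N'BillingType';

    -- How many rows actually hold a NULL in that column?
    SELECT COUNT(*) AS NullBillingTypes
    FROM Analysis.dimAccount
    WHERE BillingType IS NULL;
    ```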

    Jason L. Selburg
  • BUMP

    Does anyone have an idea on this?

    Jason L. Selburg
  • I'm gonna try one more time to bump this and see if someone can find the answer.

    Jason L. Selburg
  • Jason Selburg (9/10/2012)


    Can someone explain WHY adding the self-referencing join condition on acct.BillingType decreases the estimated number of rows by a factor of 10?

    Offhand... no. The predicate exists on the object in one query and not the other, and that's obviously affecting the estimate, but it makes no difference to the result set. It IS a NULL killer as mentioned above, but that's obviously irrelevant to the results here.

    Both queries return 75656 rows. There's a VERY small handful of people I can think of offhand who might actually know where the differing row estimates are coming from and what the optimizer is doing under the hood. At a guess, because that field is now referenced as a restrictor (in whatever fashion), the optimizer is using a different set of statistics for estimation and is getting a better value for the estimate, which might indicate it's time for an UPDATE STATISTICS WITH FULLSCAN on that table.
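
    A rough sketch of what that check could look like is below. It assumes the ix_dimAccount index mentioned in the question lives on Analysis.dimAccount; the column aliases are only for illustration. Keep in mind that FULLSCAN reads every row, so on a 5 million row table it may take a while.

    ```sql
    -- Which statistics exist on the table, and when was each last updated?
    SELECT s.name AS stats_name,
           s.auto_created,
           STATS_DATE(s.object_id, s.stats_id) AS last_updated
    FROM sys.stats AS s
    WHERE s.object_id = OBJECT_ID(N'Analysis.dimAccount');

    -- Header, density vector and histogram for the index named in the question
    -- (assumed here to belong to Analysis.dimAccount).
    DBCC SHOW_STATISTICS (N'Analysis.dimAccount', ix_dimAccount);

    -- Rebuild every statistic on the table by reading all rows.
    UPDATE STATISTICS Analysis.dimAccount WITH FULLSCAN;
    ```

    Comparing the estimated row counts of both queries before and after the update would show whether stale or sampled statistics play any part in the difference.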

    Otherwise, if it's really that much of a concern (it isn't changing the query plan, and you didn't mention significant runtime differences), I'd recommend you PM Paul White, Grant Fritchey, and Gail Shaw and ask whether they'd look in on this thread and ask you the right questions to figure out the puzzle completely. There might be one or two others kicking around who know the under-the-hood mechanics well enough, but those three come to mind immediately as query optimization gurus.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Ah, I hadn't thought about stats. I'll PM those guys, because this one has made me determined to solve it.

    Yes, there was a perf difference, though I can't recall how much now. It wasn't huge, but big enough for me to notice.

    Jason L. Selburg
