In-Line TVF Split Function JOIN to Base Tables Optimization

  • I have a specific question about JOINing results of an in-line split function to base tables

    I am using an in-line TVF and joining to base tables. Sample code below.

    The problem is the optimizer does not know how many records the TVF returns and so ends up with base table scans and parallelism.

    If I INSERT the results of the TVP into a table var then JOIN on that table var, all is well b/c the optimizer has a count --- but that defeats the point of an in-line TVF.

    I did some research on this but am not seeing much discussion.

    Any ideas much appreciated.

    SELECT

    MyBaseTable.KeyColumn

    ,MyBaseTable.Col1

    ,MyBaseTable.Col2

    FROM

    MyBaseTable

    INNER JOIN dbo.fnListToTable8K_InLine(@List,',') ListAsTable

    ON ListAsTable.ListItem = MyBaseTable.KeyColumn

  • Thx David

    I read the post you cited. This is the key line from it.

    "3.Use inline TVF when possible: In the demo, it’s unnecessary to use a multi-statement TVF. By changing it to inline TVF, the estimates will be accurate. "

    Understood in-line is better and I already am using an in-line TVF. The in-line TVF I am using splits a string into a table but the optimizer does not know the number of records the TVF will return because it depends on the length of the string and number of delimiters. So for the JOIN to base tables, the optimizer opts for base table scans and parallelism.

  • regan.wick (8/2/2016)


    Thx David

    I read the post you cited. This is the key line from it.

    "3.Use inline TVF when possible: In the demo, it’s unnecessary to use a multi-statement TVF. By changing it to inline TVF, the estimates will be accurate. "

    Understood in-line is better and I already am using an in-line TVF. The in-line TVF I am using splits a string into a table but the optimizer does not know the number of records the TVF will return because it depends on the length of the string and number of delimiters. So for the JOIN to base tables, the optimizer opts for base table scans and parallelism.

    Yes, but the article only compares inline TVF with multi-line TVF. It does not compare inline TVF to temp tables or table variables.

    Also, the TVF used is based on an actual table rather than a parameter that is passed in.

    The author's recommendation to use inline TVF when possible is based on these two assumptions, neither of which is valid in your case.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Quick suggestion, have a look at this article[/url].

    😎

  • regan.wick (8/2/2016)


    I have a specific question about JOINing results of an in-line split function to base tables

    I am using an in-line TVF and joining to base tables. Sample code below.

    The problem is the optimizer does not know how many records the TVF returns and so ends up with base table scans and parallelism.

    If I INSERT the results of the TVP into a table var then JOIN on that table var, all is well b/c the optimizer has a count --- but that defeats the point of an in-line TVF.

    I did some research on this but am not seeing much discussion.

    Any ideas much appreciated.

    SELECT

    MyBaseTable.KeyColumn

    ,MyBaseTable.Col1

    ,MyBaseTable.Col2

    FROM

    MyBaseTable

    INNER JOIN dbo.fnListToTable8K_InLine(@List,',') ListAsTable

    ON ListAsTable.ListItem = MyBaseTable.KeyColumn

    Here's an idea - there's something way wrong with your model:

    -- Table PaymentCard has 60 million rows

    SELECT f.ID, ds.Item

    FROM dbo.PaymentCard f

    INNER JOIN [dbo].[il_SplitDelimitedStringArray]

    ('THE,QUICK,BROWN,FOX,JUMPED,OVER,THE,LAZY,DOG',',') ds

    ON ds.ItemNumber = f.ID

    -- 18 rows estimated, 9 rows actual from iTVF

    -- Nested loops join to PaymentCard, clustered index seek

    -- (Estimated number of executions = 18, number of executions = 9)

    SELECT f.ID, ds.Item

    FROM dbo.PaymentCard f

    INNER JOIN [dbo].[il_SplitDelimitedStringArray]

    ('THE,QUICK,BROWN,FOX,JUMPED,OVER,THE,LAZY,DOG,AND,RAN,OFF,INTO,THE,SUNSET',',') ds

    ON ds.ItemNumber = f.ID

    -- 26 rows estimated, 15 rows actual from iTVF

    -- Nested loops join to PaymentCard, clustered index seek

    -- (Estimated number of executions = 26, number of executions = 15)

    Edit: probably an implicit conversion. Try this:

    SELECT

    MyBaseTable.KeyColumn

    ,MyBaseTable.Col1

    ,MyBaseTable.Col2

    FROM MyBaseTable

    INNER JOIN dbo.fnListToTable8K_InLine(@List,',') ListAsTable

    ON CAST(ListAsTable.ListItem AS VARCHAR(20)) = MyBaseTable.KeyColumn

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • What would also help to confirm what is happening to see the execution plan, actual would be preferred.

  • ChrisM@Work-

    You nailed it. Thx!

    Needed to CAST the delimited values to the corresponding base table column data types.

    I hope to return a favor some day.

    Thanks again

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply