Calculating interest query

  • Jeff Moden (3/4/2010)


    This is the link John tried to post...

    http://www.imdb.com/title/tt0151804/

    It's a cult classic, totally silly, and (unfortunately) sometimes truer than life in a cube. πŸ˜›

    and of course the main plot element is a scheme to skim fractions of pennies from interest rounding.

  • Paul White NZ (2/27/2010)


    ...recursive CTEs can be fast, but it is relative. I sometimes use them to seek down the distinct keys of a large index rather than scanning the whole thing and relying on an aggregate, for example...

    Paul

    I'd really like to see an example of that - in the meantime, here's another simple and obvious use of rCTE's. No timings because I've not yet had the opportunity.

    -- Rollup rows, concatenating row values into a new column

    DROP TABLE #Test

    CREATE TABLE #Test (TestID INT IDENTITY (1,1), Section INT, Word VARCHAR(20))

    INSERT INTO #Test (Section, Word) VALUES

    (1, 'A'),(1, 'day'), (1, 'in'),(1, 'the'),(1, 'life'),(1, 'of'),(1, 'Ivan'),(1, 'Denisovich'),

    (2, 'Silent'),(2, 'Spring')

    ;WITH PreparedData AS (

    SELECT *,

    SectionID = ROW_NUMBER() OVER(PARTITION BY Section ORDER BY TestID), -- sets order of words in sentence

    SectionSize = COUNT(*) OVER(PARTITION BY Section) -- used to detect last word of sentence

    FROM #Test),

    Concatenator AS (

    SELECT TestID, Section, Word, -- source columns

    SectionID, SectionSize, Sentence = CAST(Word AS VARCHAR(100)) -- "working" columns

    FROM PreparedData

    WHERE TestID = 1

    UNION ALL

    SELECT

    t.TestID, t.Section, t.Word,

    t.SectionID, t.SectionSize, Sentence = CAST(CASE WHEN t.SectionID = 1 THEN ISNULL(t.Word, '')

    ELSE l.Sentence + ' ' + ISNULL(t.Word, '') END AS VARCHAR(100))

    FROM PreparedData t

    INNER JOIN Concatenator l ON l.TestID + 1 = t.TestID

    )

    SELECT

    TestID,

    Section,

    Sentence

    FROM Concatenator

    WHERE SectionID = SectionSize

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (10/29/2010)


    I'd really like to see an example of that

    I'll post one in a minute.

    ...in the meantime, here's another simple and obvious use of rCTE's. No timings because I've not yet had the opportunity.

    Table Spool mania! It doesn't perform well; I'd stick with the usual XML hack/solution:

    SELECT Sections.Section,

    Concatenated.sentence

    FROM (

    SELECT DISTINCT

    Section

    FROM #Test

    ) Sections

    CROSS

    APPLY (

    SELECT SPACE(1) + Word

    FROM #Test T

    WHERE T.Section = Sections.Section

    ORDER BY

    T.TestID

    FOR XML PATH (''),

    TYPE

    ) Concatenator (xml_string)

    CROSS

    APPLY (

    SELECT STUFF(Concatenator.xml_string.value('(./text())[1]', 'VARCHAR(MAX)'), 1, 1, SPACE(0))

    ) Concatenated (sentence);

  • Super-fast DISTINCT using a recursive CTE:

    USE tempdb;

    GO

    DROP TABLE dbo.Test;

    GO

    CREATE TABLE

    dbo.Test

    (

    data INTEGER NOT NULL,

    );

    GO

    CREATE CLUSTERED INDEX c ON dbo.Test (data);

    GO

    -- Lots of duplicated values

    INSERT dbo.Test WITH (TABLOCK)

    (data)

    SELECT TOP (5000000)

    ROW_NUMBER() OVER (ORDER BY (SELECT 0)) / 117329

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3;

    GO

    SET STATISTICS TIME ON;

    -- 1591ms CPU

    SELECT DISTINCT

    data

    FROM dbo.Test;

    -- 15ms CPU

    WITH RecursiveCTE

    AS (

    SELECT data = MIN(T.data)

    FROM dbo.Test T

    UNION ALL

    SELECT R.data

    FROM (

    -- A cunning way to use TOP in the recursive part of a CTE :)

    SELECT T.data,

    rn = ROW_NUMBER() OVER (ORDER BY T.data)

    FROM dbo.Test T

    JOIN RecursiveCTE R

    ON R.data < T.data

    ) R

    WHERE R.rn = 1

    )

    SELECT *

    FROM RecursiveCTE

    OPTION (MAXRECURSION 0);

    SET STATISTICS TIME OFF;

    GO

    DROP TABLE dbo.Test;

    The recursive CTE is 100 times more efficient πŸ™‚

  • Paul White NZ (10/29/2010)


    Super-fast DISTINCT using a recursive CTE:

    Absolutely awesome! What on earth made you look for an alternative to DISTINCT?

    On my 8 year old 1.8GHz P4, here's what I get...

    (43 row(s) affected)

    SQL Server Execution Times:

    CPU time = 2938 ms, elapsed time = 3052 ms.

    (43 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 3 ms.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/31/2010)


    Absolutely awesome! What on earth made you look for an alternative to DISTINCT?

    Thanks. It's a fair while ago now, so I don't remember exactly what the circumstances were, but for sure there was a column with very many duplicate values and it just seemed dumb to me that the optimizer would choose to scan the whole index rather than trying something a little more creative. I think it was around the time I was writing my article on paging, so it might have been at least tangentially related to that.

  • Damn! There is a trade off. If you don't know there's going to be dupes or not, pray for dupes when you use it. πŸ™‚

    USE tempdb;

    GO

    DROP TABLE dbo.Test;

    GO

    CREATE TABLE

    dbo.Test

    (

    data INTEGER NOT NULL,

    );

    GO

    CREATE CLUSTERED INDEX c ON dbo.Test (data);

    GO

    -- Lots of duplicated values

    INSERT dbo.Test WITH (TABLOCK)

    (data)

    SELECT TOP (1000000) --CUT DOWN TO A MILLION FOR THIS TEST

    ROW_NUMBER() OVER (ORDER BY (SELECT 0)) --NO DUPES

    FROM master.sys.all_columns C1,

    master.sys.all_columns C2

    GO

    ------------------------------------------------------------------

    DECLARE @Bitbucket INT --TO TAKE THE DISPLAY PROCESSING OUT OF THE PICTURE

    SET STATISTICS TIME ON;

    -- 1591ms CPU

    SELECT DISTINCT

    @Bitbucket = data

    FROM dbo.Test;

    -- 15ms CPU

    WITH RecursiveCTE

    AS (

    SELECT data = MIN(T.data)

    FROM dbo.Test T

    UNION ALL

    SELECT R.data

    FROM (

    -- A cunning way to use TOP in the recursive part of a CTE :)

    SELECT T.data,

    rn = ROW_NUMBER() OVER (ORDER BY T.data)

    FROM dbo.Test T

    JOIN RecursiveCTE R

    ON R.data < T.data

    ) R

    WHERE R.rn = 1

    )

    SELECT @Bitbucket = data

    FROM RecursiveCTE

    OPTION (MAXRECURSION 0);

    SET STATISTICS TIME OFF;

    GO

    DROP TABLE dbo.Test;

    Still, if you know you have a lot of dupes, this is the berries! πŸ™‚

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Paul White NZ (10/29/2010)


    ...in the meantime, here's another simple and obvious use of rCTE's. No timings because I've not yet had the opportunity.[/quote-0]

    Table Spool mania! It doesn't perform well; I'd stick with the usual XML hack/solution:

    Heh - until you wrap it up nice and warm:

    -- make some simple sample data, 800,000 rows

    DROP TABLE #Sections

    SELECT TOP 100000 Section = ABS(CHECKSUM(NEWID()))

    INTO #Sections

    FROM dbo.syscolumns a, dbo.syscolumns b

    DROP TABLE #Test

    CREATE TABLE #Test (TestID INT IDENTITY (1,1), Section INT, Word VARCHAR(20))

    INSERT INTO #Test (Section, Word)

    SELECT s.Section, d1.Word

    FROM #Sections s,

    (

    SELECT 'A' AS Word UNION ALL

    SELECT 'day' UNION ALL

    SELECT 'in' UNION ALL

    SELECT 'the' UNION ALL

    SELECT 'life' UNION ALL

    SELECT 'of' UNION ALL

    SELECT 'Ivan' UNION ALL

    SELECT 'Denisovich'

    ) d1

    ORDER BY s.Section

    -- prepare the data for use - part of the solution

    DROP TABLE #PreparedData

    SELECT

    TestID,

    Section,

    Word,

    WordStatus = CAST(CASE WHEN SectionID = 1 THEN 1 WHEN SectionID = SectionSize THEN 9 ELSE 0 END AS TINYINT)

    INTO #PreparedData

    FROM (

    SELECT *,

    SectionID = ROW_NUMBER() OVER(PARTITION BY Section ORDER BY TestID), -- sets order of words in sentence

    SectionSize = COUNT(*) OVER(PARTITION BY Section) -- used to detect last word of sentence

    FROM #Test

    ) d

    CREATE UNIQUE CLUSTERED INDEX CITestID ON #PreparedData ([TestID] ASC)

    -- 6 seconds

    -- consume the prepared data

    ;WITH Concatenator AS (

    SELECT TestID, Section, Word, -- source columns

    WordStatus, Sentence = CAST(Word AS VARCHAR(100)) -- "working" columns

    FROM #PreparedData

    WHERE TestID = 1

    UNION ALL

    SELECT

    t.TestID, t.Section, t.Word,

    t.WordStatus, Sentence = CAST(CASE WHEN t.WordStatus = 1 THEN ISNULL(t.Word, '')

    ELSE l.Sentence + ' ' + ISNULL(t.Word, '') END AS VARCHAR(100))

    FROM #PreparedData t

    INNER JOIN Concatenator l ON l.TestID + 1 = t.TestID

    )

    SELECT

    TestID,

    Section,

    Sentence

    FROM Concatenator

    WHERE WordStatus = 9

    OPTION (MAXRECURSION 0)

    -- 32 seconds: 100,000 (ish) rollup-up output rows

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Jeff Moden (11/1/2010)


    Damn! There is a trade off. If you don't know there's going to be dupes or not, pray for dupes when you use it.... Still, if you know you have a lot of dupes, this is the berries! πŸ™‚

    Yes, absolutely.

  • Jeff Moden (11/1/2010)


    Damn! There is a trade off. If you don't know there's going to be dupes or not, pray for dupes when you use it. πŸ™‚

    USE tempdb;

    GO

    DROP TABLE dbo.Test;

    GO

    CREATE TABLE

    dbo.Test

    (

    data INTEGER NOT NULL,

    );

    GO

    CREATE CLUSTERED INDEX c ON dbo.Test (data);

    GO

    -- Lots of duplicated values

    INSERT dbo.Test WITH (TABLOCK)

    (data)

    SELECT TOP (1000000) --CUT DOWN TO A MILLION FOR THIS TEST

    ROW_NUMBER() OVER (ORDER BY (SELECT 0)) --NO DUPES

    FROM master.sys.all_columns C1,

    master.sys.all_columns C2

    GO

    ------------------------------------------------------------------

    DECLARE @Bitbucket INT --TO TAKE THE DISPLAY PROCESSING OUT OF THE PICTURE

    SET STATISTICS TIME ON;

    -- 1591ms CPU

    SELECT DISTINCT

    @Bitbucket = data

    FROM dbo.Test;

    -- 15ms CPU

    WITH RecursiveCTE

    AS (

    SELECT data = MIN(T.data)

    FROM dbo.Test T

    UNION ALL

    SELECT R.data

    FROM (

    -- A cunning way to use TOP in the recursive part of a CTE :)

    SELECT T.data,

    rn = ROW_NUMBER() OVER (ORDER BY T.data)

    FROM dbo.Test T

    JOIN RecursiveCTE R

    ON R.data < T.data

    ) R

    WHERE R.rn = 1

    )

    SELECT @Bitbucket = data

    FROM RecursiveCTE

    OPTION (MAXRECURSION 0);

    SET STATISTICS TIME OFF;

    GO

    DROP TABLE dbo.Test;

    Still, if you know you have a lot of dupes, this is the berries! πŸ™‚

    No kidding. 100-fold improvement is astonishing.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (11/1/2010)


    Heh - until you wrap it up nice and warm...

    That helps - as usual - but it's still a bit slower than the (unmodified) XML PATH solution I posted earlier.

    My version also benefits from a suitable clustered index: CREATE UNIQUE CLUSTERED INDEX cuq ON #Test (Section, TestID). That makes it much faster, at least on my machine.

    Both solutions suck a bit because they can only use a single thread - can you guess how I'd fix that? πŸ˜‰

  • Paul White NZ (11/1/2010)


    Chris Morris-439714 (11/1/2010)


    Heh - until you wrap it up nice and warm...

    That helps - as usual - but it's still a bit slower than the (unmodified) XML PATH solution I posted earlier.

    My version also benefits from a suitable clustered index: CREATE UNIQUE CLUSTERED INDEX cuq ON #Test (Section, TestID). That makes it much faster, at least on my machine.

    Both solutions suck a bit because they can only use a single thread - can you guess how I'd fix that? πŸ˜‰

    Nah mate my walking-on-water skills are still underwhelming - don't even drink the stuff, fish f...never mind. Spill the beans! I'll try to understand, promise!

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (11/1/2010)


    Nah mate my walking-on-water skills are still underwhelming - don't even drink the stuff, fish f...never mind. Spill the beans! I'll try to understand, promise!

    SQLCLR, naturally - using Adam's Query Parallelizer πŸ™‚

  • Paul White NZ (11/1/2010)


    Chris Morris-439714 (11/1/2010)


    Nah mate my walking-on-water skills are still underwhelming - don't even drink the stuff, fish f...never mind. Spill the beans! I'll try to understand, promise!

    SQLCLR, naturally - using Adam's Query Parallelizer πŸ™‚

    Ooh you cheat, that's not a fix that's a different method altogether πŸ˜‰

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (11/1/2010)


    Ooh you cheat, that's not a fix that's a different method altogether πŸ˜‰

    Yes I do cheat. Proud of it! :laugh:

Viewing 15 posts - 46 through 60 (of 67 total)

You must be logged in to reply to this topic. Login to reply