Doing something like an Excel FillDown function in SQL looking for suggestions???

  • I have a recordset that returns the following data:

    ItemNumberDisplaySeqNum

    101

    2

    3

    4

    205

    6

    7

    8

    309

    10

    11

    It's like an invoice sheet where there's multiple parts to each item number. There's no other relationship between the parts and the item number other then the fact that when ordered by the display sequence the item number that's first applies to all parts that do not have an item number until the next item number is filled. (programmers). Anyways I've worked out a solution using CTE but wanted to know if there's a better way to write it. I'm avoiding cursors or other loops. The result is shown below. Potentially there could be hundreds maybe thousands of parts and performance may be an issue. Thanks in advance to anyone who replies. 🙂

    ItemNumberDisplaySeqNum

    101

    102

    103

    104

    205

    206

    207

    208

    309

    3010

    3011

    if object_id('tempdb.dbo.#TempTable') is not null

    drop table #TempTable

    create table #TempTable(ItemNumber varchar(10) not null, DisplaySeqNum int not null)

    insert into #TempTable

    values('10',1),('',2),('',3),('',4),('20',5),('',6),('',7),('',8),('30',9),('',10),('',11)

    select *

    from #TempTable

    ;with s1 as (

    select scqli.ItemNumber

    ,scqli.DisplaySeqNum

    from #TempTable scqli

    ),

    s2 as (

    select ItemNumber

    ,DisplaySeqNum

    ,ROW_NUMBER()over(order by ItemNumber) RowNum

    from s1

    where ItemNumber <> ''

    ),

    s3 as (

    select sd.ItemNumber

    ,sc.DisplaySeqNum

    from s1 sc

    inner join (

    select sa.ItemNumber

    ,sa.DisplaySeqNum StartLine

    ,isnull(sb.DisplaySeqNum-1,100000) EndLine

    from s2 sa

    left join (select ItemNumber, DisplaySeqNum, RowNum - 1 RowNum from s2) sb

    on sa.RowNum = sb.RowNum

    ) sd

    on sc.DisplaySeqNum between sd.StartLine and sd.EndLine

    )

    select *

    from s3

    ---------------------------------------------------------------
    Mike Hahn - MCSomething someday:-)
    Right way to ask for help!!
    http://www.sqlservercentral.com/articles/Best+Practices/61537/
    I post so I can see my avatar :hehe:
    I want a personal webpage 😎
    I want to win the lotto 😀
    I want a gf like Tiffa :w00t: Oh wait I'm married!:-D

  • What you have there is a spreadsheet, not a database table. Rows in a database table don't have a sequence to them, so there's no way for the database to know that "the second row" should have the same value as "the first row". As far as the database is concerned, there is no "first row" or "second row". Those are things that humans looking at the data add in, based on our own mental prejudices.

    I know that sounds weird, and it's one of the hardest things for new DBAs to wrap their heads around (along with how join math works, that's the one I usually finds kicks people's feet out from under them them worst). But it's true.

    It's one of the key differences in how you approach databases vs how you approach spreadsheets.

    What you're trying to do isn't something that SQL Server (or Oracle, DB2, MySQL, etc.) is designed to do. You won't be able to count on the results working correctly because of that. It may look right 10 times in a row, but you can't count on the 11th time being right. So don't rely on it at all.

    Get the devs to fix how they are entering the data. That's the only valid solution.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/12/2012)


    What you have there is a spreadsheet, not a database table. Rows in a database table don't have a sequence to them, so there's no way for the database to know that "the second row" should have the same value as "the first row". As far as the database is concerned, there is no "first row" or "second row". Those are things that humans looking at the data add in, based on our own mental prejudices.

    I know that sounds weird, and it's one of the hardest things for new DBAs to wrap their heads around (along with how join math works, that's the one I usually finds kicks people's feet out from under them them worst). But it's true.

    It's one of the key differences in how you approach databases vs how you approach spreadsheets.

    What you're trying to do isn't something that SQL Server (or Oracle, DB2, MySQL, etc.) is designed to do. You won't be able to count on the results working correctly because of that. It may look right 10 times in a row, but you can't count on the 11th time being right. So don't rely on it at all.

    Get the devs to fix how they are entering the data. That's the only valid solution.

    Yeah I understand it's not a DB function and I completely agree with you that the developers should fix it but I'm on contract for a year to do some SSRS/SSAS development work and a program change isn't an option. I'm sure DBA's would lose their stomaches if they saw this database. A database developed by old school VB6 programmers... with very little restrictions... wow... (not bashing programmers, I too started out as a C++/VB programmer about 20 years ago)

    ---------------------------------------------------------------
    Mike Hahn - MCSomething someday:-)
    Right way to ask for help!!
    http://www.sqlservercentral.com/articles/Best+Practices/61537/
    I post so I can see my avatar :hehe:
    I want a personal webpage 😎
    I want to win the lotto 😀
    I want a gf like Tiffa :w00t: Oh wait I'm married!:-D

  • Want a cool sig (11/12/2012)


    I have a recordset that returns the following data:

    ItemNumberDisplaySeqNum

    101

    2

    3

    4

    205

    6

    7

    8

    309

    10

    11

    It's like an invoice sheet where there's multiple parts to each item number. There's no other relationship between the parts and the item number other then the fact that when ordered by the display sequence the item number that's first applies to all parts that do not have an item number until the next item number is filled. (programmers). Anyways I've worked out a solution using CTE but wanted to know if there's a better way to write it. I'm avoiding cursors or other loops. The result is shown below. Potentially there could be hundreds maybe thousands of parts and performance may be an issue. Thanks in advance to anyone who replies. 🙂

    ItemNumberDisplaySeqNum

    101

    102

    103

    104

    205

    206

    207

    208

    309

    3010

    3011

    if object_id('tempdb.dbo.#TempTable') is not null

    drop table #TempTable

    create table #TempTable(ItemNumber varchar(10) not null, DisplaySeqNum int not null)

    insert into #TempTable

    values('10',1),('',2),('',3),('',4),('20',5),('',6),('',7),('',8),('30',9),('',10),('',11)

    select *

    from #TempTable

    ;with s1 as (

    select scqli.ItemNumber

    ,scqli.DisplaySeqNum

    from #TempTable scqli

    ),

    s2 as (

    select ItemNumber

    ,DisplaySeqNum

    ,ROW_NUMBER()over(order by ItemNumber) RowNum

    from s1

    where ItemNumber <> ''

    ),

    s3 as (

    select sd.ItemNumber

    ,sc.DisplaySeqNum

    from s1 sc

    inner join (

    select sa.ItemNumber

    ,sa.DisplaySeqNum StartLine

    ,isnull(sb.DisplaySeqNum-1,100000) EndLine

    from s2 sa

    left join (select ItemNumber, DisplaySeqNum, RowNum - 1 RowNum from s2) sb

    on sa.RowNum = sb.RowNum

    ) sd

    on sc.DisplaySeqNum between sd.StartLine and sd.EndLine

    )

    select *

    from s3

    Thank you for taking the time to setup some test data. It makes it a whole lot easier for folks to try to help.

    I don't know what other folks call these typess of things but I call them "data smears" because you have to "smear" the data down from one row to the next.

    The following code will do what you want with the data you've so nicely provided and, if you have the right indexing, will be lightning quick.

    UPDATE tt

    SET ItemNumber = ca.ItemNumber

    FROM #TempTable tt

    CROSS APPLY (SELECT TOP 1 ItemNumber

    FROM #TempTable ttca

    WHERE ttca.DisplaySeqNum <= tt.DisplaySeqNum

    AND ttca.ItemNumber > ''

    ORDER BY ttca.DisplaySeqNum DESC) ca

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff I'll give it a try tomorrow, I'm not very familiar with cross apply very much. 🙂

    ---------------------------------------------------------------
    Mike Hahn - MCSomething someday:-)
    Right way to ask for help!!
    http://www.sqlservercentral.com/articles/Best+Practices/61537/
    I post so I can see my avatar :hehe:
    I want a personal webpage 😎
    I want to win the lotto 😀
    I want a gf like Tiffa :w00t: Oh wait I'm married!:-D

  • Want a cool sig (11/12/2012)


    Thanks Jeff I'll give it a try tomorrow, I'm not very familiar with cross apply very much. 🙂

    Cross Apply is fairly easy. It's just a correlated subquery that's very similar to the sub-query you might do in a SELECT list except that it can return more than one value. There's also an OUTER Apply that works like a correlated subquery with a left outer join.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wow Jeff that is wicked fast... thanks a lot... the time for returning 2600 rows really shows how much faster your method is...

    Jeff's wicked fast script

    SQL Server Execution Times:

    CPU time = 563 ms, elapsed time = 682 ms.

    My wicked slow script

    SQL Server Execution Times:

    CPU time = 25500 ms, elapsed time = 25696 ms.

    ---------------------------------------------------------------
    Mike Hahn - MCSomething someday:-)
    Right way to ask for help!!
    http://www.sqlservercentral.com/articles/Best+Practices/61537/
    I post so I can see my avatar :hehe:
    I want a personal webpage 😎
    I want to win the lotto 😀
    I want a gf like Tiffa :w00t: Oh wait I'm married!:-D

  • Want a cool sig (11/13/2012)


    Wow Jeff that is wicked fast... thanks a lot... the time for returning 2600 rows really shows how much faster your method is...

    Jeff's wicked fast script

    SQL Server Execution Times:

    CPU time = 563 ms, elapsed time = 682 ms.

    My wicked slow script

    SQL Server Execution Times:

    CPU time = 25500 ms, elapsed time = 25696 ms.

    Ugh! I appreciate the compliment but more than 1/2 a second to process just 2600 rows is pretty bad for me. The right kind of index might help it, though.

    I say it's slow because a "Quirky Update" would do such a thing to a million rows in about 2 or 3 seconds. If you have a large nnumber of rows, let me know and we'll see what we can put together for you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/14/2012)


    Want a cool sig (11/13/2012)


    Wow Jeff that is wicked fast... thanks a lot... the time for returning 2600 rows really shows how much faster your method is...

    Jeff's wicked fast script

    SQL Server Execution Times:

    CPU time = 563 ms, elapsed time = 682 ms.

    My wicked slow script

    SQL Server Execution Times:

    CPU time = 25500 ms, elapsed time = 25696 ms.

    Ugh! I appreciate the compliment but more than 1/2 a second to process just 2600 rows is pretty bad for me. The right kind of index might help it, though.

    I say it's slow because a "Quirky Update" would do such a thing to a million rows in about 2 or 3 seconds. If you have a large nnumber of rows, let me know and we'll see what we can put together for you.

    it's part of a larger query that hits a table with 530k records and joins 5 other tables... the 2600 records is the largest recordset that I can find in the table... and unfortunately the columns I'm using isn't indexed... I'm happy with 1/2 a second... 🙂 Thanks

    ---------------------------------------------------------------
    Mike Hahn - MCSomething someday:-)
    Right way to ask for help!!
    http://www.sqlservercentral.com/articles/Best+Practices/61537/
    I post so I can see my avatar :hehe:
    I want a personal webpage 😎
    I want to win the lotto 😀
    I want a gf like Tiffa :w00t: Oh wait I'm married!:-D

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply