Propagating Data Fowards Through Time

  • Hi SSC,

    I have a table containing prices for certain stock prices, and dates upon which their shares outstanding changed. I need to be able to take the dates upon which the Shares Outstanding changed, and propagate that value forwards in time until the next observed change, then propagate that one. The end result should be that for every data point (except those prior to the first observable change in Shares Outstanding) each record should have a value for SharesOutstanding on each date. This is how I've written it right now, but the universe is about 5 million data points, and this query performs very slowly. Does anyone know of a more efficient way to perform this sort of operation?

    set nocount on

    set statistics io off

    --Setting up the mock universe

    if object_id('tempdb.dbo.#Stocks') is not null drop table #Stocks

    create table #Stocks

    (

    FundID varchar(10),

    TradingDate int, --we use MSDates

    SharesOutstanding int,

    SharesOutstandingChangeDate int

    primary key clustered (FundID, TradingDate)

    )

    declare

    @loop int,

    @Price float

    select @loop = 39000

    while @Loop <= 40000

    begin

    insert into #Stocks (FundID, TradingDate)

    select 'AAA', @loop union all

    select 'BBB', @loop

    select @Loop = @Loop + 1

    end

    update #Stocks

    set

    SharesOutstanding = case FundID when 'AAA' then case TradingDate when 39002 then 40000

    when 39761 then 80000

    when 39987 then 70000

    end

    when 'BBB' then case TradingDate when 39127 then 125

    when 39493 then 305

    end

    end,

    SharesOutstandingChangeDate = case FundID when 'AAA' then case when TradingDate in (39002, 39761, 39987) then TradingDate

    else null

    end

    when 'BBB' then case when TradingDate in (39127, 39493) then TradingDate

    else null

    end

    end

    --this is the statement I'm really looking to improve upon.

    set statistics io on

    ;with Propagate as (select

    TradingDate = b.TradingDate,

    SharesOutstanding = a.SharesOutstanding,

    AsOfDate = b.AsOfDate

    from #Stocks a

    inner join (select

    TradingDate = s2.TradingDate,

    AsOfDate = max(s1.SharesOutstandingChangeDate)

    from #Stocks s1

    inner join #Stocks s2

    on s1.SharesOutstandingChangeDate <= s2.TradingDate

    group by s2.TradingDate) b

    on a.TradingDate = b.AsOfDate)

    update s

    set SharesOutstanding = p.SharesOutstanding

    from #Stocks s

    inner join Propagate p

    on s.TradingDate = p.TradingDate

    Thanks!

    Executive Junior Cowboy Developer, Esq.[/url]

  • Hi Gabe

    Not sure if this is the logic you want to apply, but you may want to add join clauses for the FundID. Should improve it considerably. I have bolded the changes I made

    ;with Propagate as (

    select

    TradingDate = b.TradingDate,

    SharesOutstanding = a.SharesOutstanding,

    AsOfDate = b.AsOfDate,

    FundID = a.FundID

    from #Stocks a

    inner join (

    select

    FundID = s2.FundID,

    TradingDate = s2.TradingDate,

    AsOfDate = max(s1.SharesOutstandingChangeDate)

    from #Stocks s1

    inner join #Stocks s2 on s1.SharesOutstandingChangeDate <= s2.TradingDate

    and s1.FundID = s2.FundID

    group by s2.FundID, s2.TradingDate

    ) b on a.TradingDate = b.AsOfDate

    and a.FundID = b.FundID

    )

    --update s

    --set SharesOutstanding = p.SharesOutstanding

    select *

    from #Stocks s

    inner join Propagate p

    on s.TradingDate = p.TradingDate

    and s.FundID = p.FundID

  • Gabe,

    This sounds exactly like the problem posted here (just recently): http://www.sqlservercentral.com/Forums/Topic1369495-391-1.aspx

    You'll find that the SSC team of volunteers got into a bit of a performance contest over the best way to approach it. Perhaps you'll find some tips in there.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks guys, those look like good places to start. I'll check them out!

    Executive Junior Cowboy Developer, Esq.[/url]

  • Seems like a very unwieldy approach. Why can't you just store the starting and ending dates for each price point?

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply