Stored Proc And View Giving Different Results

  • I have been working on a project to establish how many people met certain conditions on any given day of a month to create a dashboard. The dashboard includes counts and averages of the numbers grouped by month and another condition. I'd written a proc to do this and my boss suggested that I use the proc to create a view because the data will be useful in other places later. I created the view by copying the code from my proc, removing any grouping and ordering, adding a couple of date columns to it and writing 'create view X as' on the top. I then made a copy of my dashboard proc, deleted the code from it and pointed it the view. In theory, the view should do exactly the same as the proc code, because it was the proc code. The problem is that the counts are coming back different and I've no idea how. When I run the view code it returns just over 180000 rows and when I run the proc without the grouping it returns exactly the same number of rows. The counts are all in the same general range but not the same. Has anybody got any suggestions as to why this could be happening or obvious things to check first.

    I'll get some code posted as soon as I can, I'm just not sure how best to do it. I can't think how to post sample data because there's quite a few tables that need joining to create the view. While I know what correct data looks like, I'm not sure if 'correct data' will replicate the problem. I think it's either something really obvious or something very in depth that a relative newbie like myself wouldn't ever think of.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Going to need to see the code. Can you post just the procedure and view initially, might need more later.

    I created the view by copying the code from my proc, removing any grouping and ordering, adding a couple of date columns to it and writing 'create view X as' on the top. I then made a copy of my dashboard proc, deleted the code from it and pointed it the view. In theory, the view should do exactly the same as the proc code, because it was the proc code.

    You said you removed the grouping and ordering? If you removed a group by clause, then the view code is no longer the same as the procedure code and could well return different results.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I've attached the code now. I've had to do a bit of obfuscation, which I hope doesn't hide the logic. The organisation I work for is quite high profile and security conscious so I'm trying not to give too much away.

    With regards to the liberal application of nolocks, what can I say. My boss is a firm believer in them and when don't put them in it's the first thing he mentions. I know they are potentially a bad thing and I'm looking for a cast iron case I can put to him about why we shouldn't rely on them. It would be nice if they were the cause of this little problem.

    Edit Grammar


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Could you rather put the code in your post, wrapped in the code tags (see left-hand panel)

    Word docs from untrusted sources.... 🙁

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Fair enough, although I'm a little hurt you don't trust me.... 🙂

    The code to create the dashboard with the proc

    set transaction isolation level read uncommitted

    set dateformat ymd

    select distinct

    OrderType =

    case pO.ordertypedesciptionid

    when 'Bilge Rat' then 'BR'

    when 'Bilge Rat - Long Tail' then 'BR - Long Tail'

    when 'Scurvy Dog' then 'SD'

    when 'Jolly Jack Tar' then 'JTT'

    else pO.ordertypedesciptionid

    end

    ,dd.YearMonth

    ,dd.MonthDateOfYear

    ,dd.DateYear

    ,ServiceDays = COUNT(1)

    ,AverageServiceDays = COUNT(1)/(select COUNT(1) from dbo.D_Date d2 where d2.YearMonth = dd.YearMonth)

    ,NumberOfDaysInCurrentMonth = (select COUNT(1) from dbo.D_Date d2 where d2.YearMonth = dd.YearMonth)

    from

    dbo.pirateorder po (nolock)

    join dbo.vwSwordAllocation sa (nolock) on sa.OrderID = po.orderno_int

    JOIN dbo.piratescontacts PC (nolock) on PC.contactid = po.subjectid

    JOIN dbo.serviceatsea sas (nolock) on sas.pirateorderid_lk = po.pirateorderid

    join dbo.D_Date dd (nolock) on dbo.timetoUTC(po.OrderStartDate) <= dd.fulldate

    and dbo.timetoUTC(po.OrderEndDate_dt) >= dd.FullDate

    cross apply

    (

    select

    minH = (

    select Min(cast(hff.HourStartDate as DATE))

    from dbo.vwHoursbeforethemast hff (nolock)

    where po.orderno_int = hff.orderid

    )

    ,

    maxH = (

    select Max(cast(hff.HourEndDate as DATE))

    from dbo.vwHoursbeforethemast hff (nolock)

    where po.orderno_int = hff.orderid

    )

    ) mH

    where

    pc.greenhorn = 0

    and pc.lastname not like 'Test%'

    and pc.firstname not like 'test%'

    and pc.excludefromrations <> 1

    and po.statusreason <> 3 --'void'

    and dd.FullDate >= cast(sa.SwordIssued as date)

    and dd.FullDate <= isnull(cast(sa.Swordreturned as date),sa.OrderEnd)

    and dd.FullDate >= cast(sas.servicestartdate as DATE)

    and dd.FullDate <= cast(sas.serviceenddate as date)

    and dd.FullDate >= mh.minH

    and dd.FullDate <= mh.maxH

    and dd.FullDate <= GETDATE()

    group by dd.YearMonth,pO.ordertypedesciptionid,dd.MonthDateOfYear,dd.DateYear

    order by dd.YearMonth,dd.DateYear

    The code to create the view.

    Create view [dbo].[vw_DreadPirates] as

    select distinct

    Orderno = po.orderno_int

    ,SubjectID = pc.subjectno

    ,OrderType = pO.ordertypedesciptionid

    ,OrderStartDate = po.OrderStartDate

    ,OrderEndDate = po.OrderEndDate

    ,po.CreatedOn

    ,dd.FullDate

    ,Monitored = 1

    ,dd.DayOfWeek

    ,dd.DayNumberOfWeek

    ,dd.YearMonth

    ,dd.DateOfMonth

    ,dd.MonthOfYear

    ,dd.MonthDateOfYear

    ,dd.DateYear

    from

    dbo.pirateorder po (nolock)

    join dbo.vwSwordAllocation sa (nolock) on sa.OrderID = po.orderno_int

    JOIN dbo.piratescontacts pC (nolock) on pC.contactid = so.subjectid

    JOIN dbo.serviceatsea sas (nolock) on sas.pirateorderid_lk = po.pirateorderid

    join mibi.dbo.D_Date dd (nolock) on dbo.TimeToUTC(po.OrderStartDate) <= dd.fulldate

    and dbo.TimetoUTC(po.OrderEndDate) >= dd.FullDate

    cross apply

    (

    select

    minH = (

    select Min(cast(hff.HourStartDate as DATE))

    from dbo.vwHoursbeforethemast hff (nolock)

    where po.orderno_int = hff.orderid

    )

    ,

    maxH = (

    select Max(cast(hff.HourEndDate as DATE))

    from .dbo.vwHoursbeforethemast hff (nolock)

    where po.orderno_int = hff.orderid

    )

    ) mH

    where

    pc.greenhorn = 0

    and pc.lastname not like 'Test%'

    and pc.firstname not like 'test%'

    and pc.excludefromrations <> 1

    and po.statusreason <> 3 --'void'

    and dd.FullDate >= cast(sa.swordissued as date)

    and dd.FullDate <= isnull(cast(sa.Swordreturned as date),sa.OrderEnd)

    and dd.FullDate >= cast(sas.servicestartdate as DATE)

    and dd.FullDate <= cast(sas.vf_serviceenddate as date)

    and dd.FullDate >= mh.minH

    and dd.FullDate <= mh.maxH

    and dd.FullDate <= GETDATE()

    And last, the code that creates the dashboard using the view.

    set transaction isolation level read uncommitted

    set dateformat ymd

    ;

    select distinct

    OrderType =

    case pO.ordertypedesciptionid

    when 'Bilge Rat' then 'BR'

    when 'Bilge Rat - Long Tail' then 'BR - Long Tail'

    when 'Scurvy Dog' then 'SD'

    when 'Jolly Jack Tar' then 'JTT'

    else pO.ordertypedesciptionid

    end

    ,mp.YearMonth

    ,mp.MonthDateOfYear

    ,mp.DateYear

    ,ServiceDays = COUNT(1)

    ,AverageServiceDays = COUNT(1)/(select COUNT(1) from dbo.D_Date d2 where d2.YearMonth = mp.YearMonth)

    ,NumberOfDaysInCurrentMonth = (select COUNT(1) from dbo.D_Date d2 where d2.YearMonth = mp.YearMonth)

    from

    dbo.vw_DreadPirates mp

    group by mp.YearMonth,mp.OrderType,mp.MonthDateOfYear

    ,mp.DateYear

    order by mp.YearMonth

    ,mp.DateYear


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Can't see anything obvious (other than multiple unnecessary DISTINCTs)

    Is the difference in rowcount consistent?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/31/2014)


    Can't see anything obvious (other than multiple unnecessary DISTINCTs)

    Is the difference in rowcount consistent?

    If you mean 'is the difference in ServiceDays and AverageServiceDays the same every time' then yes. I'd expect some changes in the data for this month because it is altered pretty much constantly but the historic data should stay the same.

    The distincts are to get round an problem where a pirate can have multiple sword allocations and multiple services starting on the same day but they only count as one pirate. The object of the view is one row per active pirate per day. Without the distinct I got one row per pirate per service per sword.

    EDIT Changed some language


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • BWFC (1/31/2014)


    GilaMonster (1/31/2014)


    Can't see anything obvious (other than multiple unnecessary DISTINCTs)

    Is the difference in rowcount consistent?

    If you mean 'is the difference in ServiceDays and AverageServiceDays the same every time' then yes.

    No, I mean the difference in row counts between the procedure and view, the problem that we're trying to solve.

    The distincts are to get round an problem where a pirate can have multiple sword allocations and multiple services starting on the same day but they only count as one pirate. The object of the view is one row per active pirate per day. Without the distinct I got one row per pirate per service per sword.

    Distincts only remove complete duplicate rows, so if there are rows for the same pirate, the same sword and different services, distinct won't remove those. Group by, if you group by the pirate and sword will.

    You don't need distinct at multiple levels, it's a waste, you're telling SQL again and again to remove duplicate rows, if there are really duplicates that need removing (which there won't be with a group by), you need a distinct once only, in the outer-most scope.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • No, I mean the difference in row counts between the procedure and view, the problem that we're trying to solve.

    Of course you did, sorry, conference call brain fry.

    There is no difference in row count between the proc and the view. The problem is, it's a little difficult to compare like with like. I've just run both versions of the dashboard, with the proc and with the view. I added the fulldate and the subjectno columns and then added those to the grouping. This brought back the same row count for both. When I ran the two as they should be, the counts and averages were different. I'm aware that changing the grouping will change the output, but that was the best way I could think of to make sure I was comparing the same things. I've still got a horrible feeling I'm doing something really daft.

    Also, as a result of the conference call, I'm now to include pirates that are excluded from rations. At least that's just a matter of taking out a line in the proc and the view code.

    Distincts only remove complete duplicate rows, so if there are rows for the same pirate, the same sword and different services, distinct won't remove those. Group by, if you group by the pirate and sword will.

    That's what I thought distinct did. When I take out the distinct in the view code I gain about 14000 rows though and they're definitely duplicates. Is it safe to leave the distinct in the view code or am I better to remove it and exclude the duplicate rows another way?

    The main reason that I've got them at multiple levels is that I forgot to remove them.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • You might get some performance lift by jigging around with your filters. If you're going down this route then do take on board Gail's comments about DISTINCT. You might benefit from rolling up one of those child tables in a derived table.

    select distinct

    OrderType =

    case pO.ordertypedesciptionid

    when 'Bilge Rat' then 'BR'

    when 'Bilge Rat - Long Tail' then 'BR - Long Tail'

    when 'Scurvy Dog' then 'SD'

    when 'Jolly Jack Tar' then 'JTT'

    else pO.ordertypedesciptionid

    end

    ,dd.YearMonth

    ,dd.MonthDateOfYear

    ,dd.DateYear

    ,ServiceDays = COUNT(1)

    ,AverageServiceDays = COUNT(1)/dd.NumberOfDaysInCurrentMonth

    ,dd.NumberOfDaysInCurrentMonth

    from dbo.pirateorder po (nolock)

    cross apply (

    select

    minH = Min(cast(hff.HourStartDate as DATE)),

    maxH = Max(cast(hff.HourEndDate as DATE))

    from dbo.vwHoursbeforethemast hff (nolock)

    where po.orderno_int = hff.orderid

    ) mH

    CROSS APPLY (

    select

    NumberOfDaysInCurrentMonth = COUNT(*) OVER(PARTITION BY YearMonth),

    FullDate,

    YearMonth,

    MonthDateOfYear,

    DateYear

    from dbo.D_Date

    WHERE FullDate >= mh.minH

    and FullDate <= mh.maxH

    and FullDate <= GETDATE()

    and dbo.timetoUTC(po.OrderStartDate) <= dd.fulldate

    and dbo.timetoUTC(po.OrderEndDate_dt) >= dd.FullDate

    ) dd

    join dbo.vwSwordAllocation sa (nolock)

    on sa.OrderID = po.orderno_int

    and dd.FullDate >= cast(sa.SwordIssued as date)

    and dd.FullDate <= isnull(cast(sa.Swordreturned as date),sa.OrderEnd)

    JOIN dbo.piratescontacts PC (nolock) on PC.contactid = po.subjectid

    JOIN dbo.serviceatsea sas (nolock)

    on sas.pirateorderid_lk = po.pirateorderid

    and dd.FullDate >= cast(sas.servicestartdate as DATE)

    and dd.FullDate <= cast(sas.serviceenddate as date)

    where pc.greenhorn = 0

    and pc.lastname not like 'Test%'

    and pc.firstname not like 'test%'

    and pc.excludefromrations <> 1

    and po.statusreason <> 3 --'void'

    group by dd.YearMonth,pO.ordertypedesciptionid,dd.MonthDateOfYear,dd.DateYear

    order by dd.YearMonth,dd.DateYear

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (1/31/2014)


    You might get some performance lift by jigging around with your filters. If you're going down this route then do take on board Gail's comments about DISTINCT. You might benefit from rolling up one of those child tables in a derived table.

    select distinct

    OrderType =

    case pO.ordertypedesciptionid

    when 'Bilge Rat' then 'BR'

    when 'Bilge Rat - Long Tail' then 'BR - Long Tail'

    when 'Scurvy Dog' then 'SD'

    when 'Jolly Jack Tar' then 'JTT'

    else pO.ordertypedesciptionid

    end

    ,dd.YearMonth

    ,dd.MonthDateOfYear

    ,dd.DateYear

    ,ServiceDays = COUNT(1)

    ,AverageServiceDays = COUNT(1)/dd.NumberOfDaysInCurrentMonth

    ,dd.NumberOfDaysInCurrentMonth

    from dbo.pirateorder po (nolock)

    cross apply (

    select

    minH = Min(cast(hff.HourStartDate as DATE)),

    maxH = Max(cast(hff.HourEndDate as DATE))

    from dbo.vwHoursbeforethemast hff (nolock)

    where po.orderno_int = hff.orderid

    ) mH

    CROSS APPLY (

    select

    NumberOfDaysInCurrentMonth = COUNT(*) OVER(PARTITION BY YearMonth),

    FullDate,

    YearMonth,

    MonthDateOfYear,

    DateYear

    from dbo.D_Date

    WHERE FullDate >= mh.minH

    and FullDate <= mh.maxH

    and FullDate <= GETDATE()

    and dbo.timetoUTC(po.OrderStartDate) <= dd.fulldate

    and dbo.timetoUTC(po.OrderEndDate_dt) >= dd.FullDate

    ) dd

    join dbo.vwSwordAllocation sa (nolock)

    on sa.OrderID = po.orderno_int

    and dd.FullDate >= cast(sa.SwordIssued as date)

    and dd.FullDate <= isnull(cast(sa.Swordreturned as date),sa.OrderEnd)

    JOIN dbo.piratescontacts PC (nolock) on PC.contactid = po.subjectid

    JOIN dbo.serviceatsea sas (nolock)

    on sas.pirateorderid_lk = po.pirateorderid

    and dd.FullDate >= cast(sas.servicestartdate as DATE)

    and dd.FullDate <= cast(sas.serviceenddate as date)

    where pc.greenhorn = 0

    and pc.lastname not like 'Test%'

    and pc.firstname not like 'test%'

    and pc.excludefromrations <> 1

    and po.statusreason <> 3 --'void'

    group by dd.YearMonth,pO.ordertypedesciptionid,dd.MonthDateOfYear,dd.DateYear

    order by dd.YearMonth,dd.DateYear

    Thanks Chris, I gave that a try and unfortunately it almost doubles the running time. It also threw the grouping out when I used the calculated numberofdaysinthecurrentmonth column. That being said, I didn't know you could a)calculate two columns in the same cross apply select and b)use two cross applies in a query.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • That's okay, it was always going to be a long shot with no sample data blah blah.

    I hope you can take away a few ideas from it though, and if you aren't already, then check the execution plan each time you change a query.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (1/31/2014)


    That's okay, it was always going to be a long shot with no sample data blah blah.

    I hope you can take away a few ideas from it though, and if you aren't already, then check the execution plan each time you change a query.

    I'll definitely have a look at the execution plans, although they're not something I'm all that familiar with. I've used the multiple column cross apply thanks. I still get dodgy counts but I do at least get them a bit quicker!

    I'll have a think about how to provide a data sample over the weekend. Like I said in my original post though, I know what the data should look like but it might be incorrect data causing the problem. I think I've narrowed it down to the counts now. I get the same number of rows when I run the proc and the view without the groups on them so I don't think there's anything either missing or duplicated.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Remove the Distinct in the view definition - that is going to eliminate rows from the grouping that are actually included in the procedure.

    The distinct will only occur after the grouping has occurred - so in your procedure the 'duplicate' rows are counted and then your distinct is applied. Using the view, the distinct is applied to the rows before they are grouped in the procedure.

    Also note: if you set transaction isolation level in the procedure you don't need to use nolock on every table.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams 3188 (1/31/2014)


    Remove the Distinct in the view definition - that is going to eliminate rows from the grouping that are actually included in the procedure.

    The distinct will only occur after the grouping has occurred - so in your procedure the 'duplicate' rows are counted and then your distinct is applied. Using the view, the distinct is applied to the rows before they are grouped in the procedure.

    Also note: if you set transaction isolation level in the procedure you don't need to use nolock on every table.

    Thanks Jeffrey, it was down to the distincts in the proc and the view. The distinct in the view means I get one row per pirate per day which is what I was aiming for. I was struggling to establish whether it was missing data in the view or duplicate data in the proc and we've worked out that it was duplicates in the proc (although it's more difficult to establish where they're coming from). I can go ahead now and use the view with confidence.

    What transaction level should I be setting it to? I assume, to make sure it's clean data I should be using read committed. I know it's not great but we are quite heavily reliant on the 'silver bullet' nolock method so we don't tend to use the transaction levels. I'm very much the new boy so I'm open to all (good) advice.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply