Select statement of view takes 5 hours to return results

  • Hello,

    We've created the following view (1.) in a database that is no more than 350 MB :w00t:, but the select statement (2.) takes around 5 hours to return results :crazy:

    I'm completely lost and I really don't understand what part of the code I must change :blush:

    Sorry!!! Can you help me, or give me any idea, so that I can change the code of this "crazy view"?

    Happy new year

    Thanks and regards,

    JMSM 😉

    1.

    ------------- Create View Code ------------------

    use [DBName]
    GO

    create view [dbo].[VwNAME] as
    select [ID]
          ,[Login]
          ,FLPExp
          ,FLKHw
          ,PUACnt
          ,FLWRArt
          ,FLWRBPst
          ,FLWDnlds
          ,case when FLPExp = 1 and FLKHw = 1 and PUACnt > 1 then 100
                when FLPExp = 1 and FLKHw = 1 and FLWRArt = 1 then 75
                when FLPExp = 1 and FLKHw = 1 and FLWDnlds = 1 then 50
                when FLPExp = 1 and FLKHw = 1 then 25
                else 0
           end 'PercentPrf'
    from (
        select a.id as 'ID',
               a.ACCName as 'Login',
               case when count(distinct b.title) > 0 then 1 else 0 end FLPExp,
               case when count(distinct f.idknowhow) > 0 then 1 else 0 end FLKHw,
               count(distinct d.id) + count(distinct e.IDArtcl) PUACnt,
               case when (count(distinct g.id) + count(distinct h.id)) > 0 then 1 else 0 end FLWRArt,
               case when count(distinct i.id) > 0 then 1 else 0 end FLWRBPst,
               case when count(distinct j.CntDnlds) > 0 then 1 else 0 end FLWDnlds
        from Athr a
        left join PstExp b
            on a.id = b.idAthr
        left join Artcl d
            on a.id = d.idAthrSubMtr
            and d.status = 2 -- Approved
        left join (select e1.id, e2.idAthr, e2.IDArtcl
                   from Artcl e1
                   join ArtclcoAthr e2
                       on e1.id = e2.IDArtcl
                   where e1.status = 2) e -- co-authored Artcls
            on a.id = e.idAthr
        left join Athrknowhow f
            on a.id = f.idAthr
        left join Artcl g -- Artcls less than 30 days old
            on a.id = g.idAthrSubMtr
            and datediff(d, g.PblDte, getdate()) < 30
            and g.Status = 2
        left join (select h1.id, h2.idAthr
                   from Artcl h1
                   join ArtclcoAthr h2
                       on h1.id = h2.IDArtcl
                   where datediff(d, h1.PblDte, getdate()) < 30
                   and h1.status = 2) h -- co-authored Artcls less than 30 days old
            on a.id = h.idAthr
        left join UsgWkiBlg i -- posts less than 30 days old
            on a.id = i.idAthr
            and i.ShPntTypSWeb = 1
            and datediff(d, i.LstMdfation, getdate()) < 30
        left join UsgeAthrArtcl j -- downloads
            on a.id = j.idAthr
        where a.locked = 0
        group by a.id, a.ACCName
    ) a
    go

    ---------------- End View Code ------------------

    2.

    ---------------- Select Statement ---------------

    SELECT [ID]
          ,[Login]
          ,FLPExp
          ,FLKHw
          ,PUACnt
          ,FLWRArt
          ,FLWRBPst
          ,FLWDnlds
          ,[PercentPrf]
    FROM [DBName].[dbo].[VwNAME]
    GO

    ------------ End Select Statement ---------------

  • It's hard to make anything of your CREATE VIEW statement as it doesn't have a proper layout, which makes it hard to read.

    At first glance I see you have a lot of joins, including outer joins, with functions in the join and where clauses. You should try to avoid functions there, as indexes will probably not be used. The many COUNT(DISTINCT ...) aggregates can also slow things down.

    If possible and necessary, replace the "on-the-fly" calculated columns in the where clauses with real persisted computed columns.

    You should probably also check whether there are any indexes and, if there are, whether they are useful.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thanks a lot Koen.

    I'll do the checks and I'll let you know the solution.

    Once more thanks.

    Regards,

    JMSM 😉

  • There are several approaches you can use:

    Most important:

    Instead of applying a function to your columns, rewrite the predicate as a SARGable argument:

    datediff(d, i.LstMdfation, getdate()) < 30
    -- would become
    i.LstMdfation > dateadd(dd, -30, getdate())

    Another option that usually helps is preaggregation.

    Find the table that is reduced the most by your where conditions, populate an indexed temp table with that data, and use the temp table in your query.
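    As a sketch of the preaggregation idea, using the table names from the view (the temp table and index names are made up):

    ```sql
    -- Pre-filter the table that the where condition reduces the most
    -- into an indexed temp table, then join against that instead.
    select id, idAthrSubMtr, PblDte
    into #ApprovedArtcl
    from dbo.Artcl
    where status = 2;              -- approved only

    create clustered index IX_ApprovedArtcl
        on #ApprovedArtcl (idAthrSubMtr);

    -- The main query then joins #ApprovedArtcl instead of Artcl,
    -- so the status filter is applied once instead of per join.
    ```
    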

    Furthermore, you're calling Artcl at least four times. Maybe the query itself can be rewritten to be much more efficient.

    And finally, there might be a missing index issue, too. Check the execution plan for best index usage.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • I agree with what everyone else has said; I think they've caught the obvious stuff. Now, to keep going, could you post the execution plan for the query, preferably after you fix the functions that are preventing index use? That can help figure out whether you have good indexes in place and whether they're being used properly.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks a lot LutzM.

    I'll change the function as you suggested.

    Regards,

    JMSM 😉

  • JMSM (1/9/2011)


    Thanks a lot LutzM.

    I'll change the function as you suggested.

    Regards,

    JMSM 😉

    The function issue is not the only problem (might not even be the most relevant one).

    So, please follow Grant's advice and post the actual execution plan.

    And don't worry too much about Celko's post... If you took the time to read and understand what his real intention is, you'd probably have rewritten your code by then anyway... 😉

    Besides really knowing his SQL (at least, so rumour tells), he's got that "special" online personality. The bottom line: he doesn't like the way you name your columns, tables, and views. That doesn't change performance at all, but it would help readability and maintainability, I admit. But that's not the major concern here, I guess... 😉

    He's right about the DISTINCT issue though...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • Hello again,

    I've attached the execution plan and the code of both views, the oldest and the new one.

    I've already changed the column names and the table name in both views.

    Thanks and regards,

    JMSM 😉

  • JMSM (1/10/2011)


    Hello again,

    I've attached the execution plan and the code of both views, the oldest and the new one.

    I've already changed the column names and the table name in both views.

    Thanks and regards,

    JMSM 😉

    There's just a ton of things going on here. First off, all those DISTINCT operations are killing your performance. You need to re-evaluate whether you really need them, and if so, I'd suggest building a materialized view with this data already aggregated (DISTINCT is an aggregation operation).
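    As a rough sketch of what a pre-aggregated materialized (indexed) view could look like (view, index, and column names are assumptions; note that indexed views cannot contain OUTER JOINs or DISTINCT, so only the inner aggregations can be materialized this way):

    ```sql
    -- Hypothetical indexed view counting approved articles per author.
    -- SCHEMABINDING and COUNT_BIG(*) are required for indexed views.
    create view dbo.VwArtclCounts
    with schemabinding
    as
    select idAthrSubMtr,
           count_big(*) as ArtclCnt
    from dbo.Artcl
    where status = 2
    group by idAthrSubMtr;
    go

    -- The unique clustered index is what materializes the view.
    create unique clustered index IX_VwArtclCounts
        on dbo.VwArtclCounts (idAthrSubMtr);
    go
    ```
    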

    It also looks like you're missing indexes all over the place. Just picking one out

    left join pastexperience b on a.id = b.idauthor

    This results in a merge join on just a few records. From what I can see, I'd try an index on the PastExperience table on the column IdAuthor and probably would add the Title column to the INCLUDE list.
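    Spelled out, that suggested index might look like this (index name is made up, table/column names taken from the renamed query):

    ```sql
    -- Hypothetical covering index for the PastExperience join:
    -- seek on IdAuthor, with Title carried along so the join
    -- doesn't have to go back to the base table.
    create nonclustered index IX_PastExperience_IdAuthor
        on dbo.PastExperience (IdAuthor)
        include (Title);
    ```
    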

    There are probably lots of other opportunities for indexes. At a guess, you don't have any indexes on your foreign keys, do you?

    That's my 2 cents worth. I'm sure others will pick up on more.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (1/10/2011)


    This results in a merge join on just a few records. From what I can see, I'd try an index on the PastExperience table on the column IdAuthor and probably would add the Title column to the INCLUDE list.

    Grant,

    I'm suddenly worried that once again a core item of my understanding is invalid. Merge joins are supposed to be the fastest. It's the "card-shuffle" join: one pass on both sides of the data and bam, done. Why would you go out of your way to avoid that?


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Hello again,

    I'm really, but really, sorry, but I'm completely lost, and I hope someone can explain what I should do as if I were a donkey :crying:

    Sorry, but I really don't know what is wrong or what is right... :sick:

    Thanks and regards,

    JMSM 😉

  • *nearly sprays his tea all over the monitor upon opening the execution plan*

    Yikes. Wow. :w00t:

    Now, THAT'S a Table Spool. It's just the estimate, I know, but a 101 million row spool? No wonder those distincts are crying.

    The materialized view that Grant's talking about above is an indexed view. Creating that will cause tremendous pain whenever you edit the underlying data, though. What I would be more inclined to do is 'snapshot' this data into a result table overnight, and then use it as a baseline for anything during the day. It'll be up to 24 hours out of date... but that hopefully isn't much of a concern here, since this query runs from the beginning of time. I didn't see any filters on it, anyway...
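    An overnight snapshot refresh could be as simple as the following (the snapshot table name is an assumption; the statement would run from a scheduled job, e.g. SQL Agent):

    ```sql
    -- Nightly refresh of a persisted result table so daytime reads
    -- never pay the 5-hour aggregation cost.
    truncate table dbo.AthrProfileSnapshot;

    insert into dbo.AthrProfileSnapshot
        (ID, [Login], FLPExp, FLKHw, PUACnt,
         FLWRArt, FLWRBPst, FLWDnlds, PercentPrf)
    select ID, [Login], FLPExp, FLKHw, PUACnt,
           FLWRArt, FLWRBPst, FLWDnlds, PercentPrf
    from dbo.VwNAME;
    ```

    Daytime queries then select from dbo.AthrProfileSnapshot instead of the view.
    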

    JMSM, this is not a small and easy-to-fix item. To quote Grant: "There's just a ton of things going on here." We'd need to see the DDL and indexing for all the involved tables. We'd need the expected use cases, since it looks like you're building 'one view to rule them all' and intend to apply WHERE clauses afterwards. We'd need sample data, and we'd need access to some of the selectivity histograms to really fix this properly.

    There is no 'quick fix' to this. You've got a ton of data and a ton of different concurrent calculations.

    Really, this is the kind of thing you usually hire specialized consultants for. Is this view eventually used for some kind of data warehouse, or business intelligence? It's practically screaming that it should be going to SSAS for final reporting.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • It's just not as easy as you might think to work on such a large query without any sample data to play with... 😉

    All we have is the view definition and the execution plan. So there will be some questions to help us understand as well as a discussion between the people trying to help you.

    One question would be:

    How many rows are in Artcl in total and with Status = 2?

    I noticed you select only data with that specific status. So it might help to build a temp table holding only the data from Artcl with status = 2.

    It might be worth including a join to ArtclcoAthr when building the temp table, and adding a column representing the result of datediff(d, h1.PblDte, getdate()) < 30. Maybe some aggregation can be done at this level, too. The temp table would then replace the two subqueries.
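    A sketch of that temp table, using the table and column names from the view (the temp table, index, and flag-column names are made up), with the date test written in SARGable form:

    ```sql
    -- One temp table replacing both Artcl/ArtclcoAthr subqueries:
    -- the status filter is applied once, and the 30-day test is
    -- precomputed as a flag so the main query stays SARGable.
    select a1.id,
           a2.idAthr,
           a2.IDArtcl,
           case when a1.PblDte > dateadd(dd, -30, getdate())
                then 1 else 0 end as IsLast30Days
    into #CoAthrArtcl
    from dbo.Artcl a1
    join dbo.ArtclcoAthr a2
        on a1.id = a2.IDArtcl
    where a1.status = 2;

    create clustered index IX_CoAthrArtcl
        on #CoAthrArtcl (idAthr);
    ```
    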

    It would be much more efficient, though, if we could get the table definitions for all related tables, including all indexes, as well as some representative (but fake) sample data...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • A couple more points to ponder:

    StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="TimeOut"

    That means the server gave up on finding a "decent" plan to execute that query. So breaking it into a couple of steps will definitely help here.

    CompileTime="16342"

    That means that even after 16 seconds of processing, the server couldn't come up with a good plan.

    Is the server powerful enough to run your normal day-to-day loads?

    We can help all day long, but in the end I think you're better off hiring a consultant for this problem, to really solve it correctly and to get a ton of pointers and experience in the process. I think you're in way over your head, way beyond what you can even imagine.

  • ... and 1 more point.

    All those spools are generating 59 GB of data... 172 times the total size of the DB.
