SELECT statement hangs on certain where conditions

  • I've encountered a bizarre issue today while investigating a job that suddenly stopped working.

    To explain - I was recently working on and implemented an automated database maintenance script (result of going to the awesome SqlBits conference in Manchester). The script did it's job well - up until now there was the Sql Server 2005 inbuilt maintenance plain in use - the one that performs reorganize on selected tables but no rescan (no statistics update). The script would reorganize/rescan tables with average fragmentation between 10 and 30% and rebuild any with above 30% fragmentation. It also rescans anything below reorganize threshold.

    I ran the script on our development database and it certainly sped some things up. On the down hand it also seems to have caused a particular job to hang all of a sudden. I've inspected the stored procedure and identified the problem being in a where statement condition:

    tableName.createdDate < DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0) This seemed odd since executing the select from table alone with given condition worked fine. The column createdDate (not actual column name) was indexed with a non-clustered index that depended on a primary key (clustered) of type BigInt. The table had another non-clustered index but all of the indexes were up to date (I checked fragmentation levels and statistics last update). Since this started happening after the maintenance script I first tried reorganizing/rebuilding the index in question and when that improved nothing the primary key (since that rebuilds any dependant non-clustered indexes). Anyway when all of that produced no viable results someone suggested hard coding the problematic where condition - supprizingly it worked! I put he condition DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0) into a DateTime varible and used the variable in the condition instead. What I suspect might have caused the problem lies in the estimated query execution plan. Please mind that the actual select had several inner/left joins. The condition was one of the first things to happen according to query plan and it performed a non-clustered index scan. The table on which the index operates has around 1 million records which means the index to scan would be rather big. But even so it shouldn't hang because it worked fine isolated. When I spoke with some other people who have been around when migration from Sql Server 2000 to 2005 took place they said same kind of problems stared occurring - selects stared hanging for no apparent reason. Could this be a bug in 2005? Has anyone else encountered this? Please mind I'm not a fully qualified DBA and please do correct me if I'm wrong on any points. I hope this helps anyone with similar issues. We didn't manage to fully explain what the problem is but a viable solution that works is to replace conditions with problems with variables.

  • Please provide your select query.........

    "Don't limit your challenges, challenge your limits"

  • Try updating the statistics on that table. Fullscan may be necessary.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I did not post the query as I can't post it as is and what I can post might not be useful at all. Find below a query with renamed tables/columns. I only posted this in case someone else comes across this problem as I've already found a workaround.

    SELECT

    A.a_id,

    C.c_id,

    B.b_id,

    D.d_id,

    E.e_id,

    F.g_id

    FROM table_G G

    INNER JOIN

    table_A A

    ON A.a_id = G.a_id

    INNER JOIN

    table_C C

    ON C.c_id = G.c_id

    LEFT JOIN

    table_B B

    ON B.a_id = A.a_id

    LEFT JOIN

    table_D D

    ON D.b_id = B.b_id AND D.c_id = G.c_id

    LEFT JOIN

    table_E E

    ON E.e_id = COALESCE(D.e_id, C.e_id)

    LEFT JOIN

    table_F F

    ON F.g_id = G.g_id

    WHERE

    F.g_id IS NULL

    AND A.created < DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)

    The problem line I spoke of was "AND A.created ..." - table Table_A contains around a million records. Simply removing that line worked and putting the "DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)" into variable and using variable instead also worked.

  • Thanks for suggestion but as I mentioned I already tried all of it 0 reorganize/full scan, rebuild on non-clustered index and then on clustered again - all of which made no difference. This only started happening after the script has already performed proper maintenance - in this case it only rescaned the table as it's initial fragmentation was 0.3 %.

  • Any chance of seeing the estimated execution plan?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)

    what is 'dd' in above statement? Any column name or a variable?

    "Don't limit your challenges, challenge your limits"

  • The "dd" is a keyword argument to date functions (DATEADD, DATEDIFF) and it signifies the days to add for example. You could also add minutes, seconds, ...

    What the whole thing you pointed out does it "zeroes out" the date. So if GetDate() for example returns "2009-04-16 12:35:59" the resulting date would be "2009-04-16 00:00:00". In the example that I posted we want to capture any records created up to today, excluding today - thus smaller than "zeroed out" current date.

  • Try this...

    declare @date datetime

    set @date = round(cast(getdate() as float),0,1)

    SELECT

    A.a_id,

    C.c_id,

    B.b_id,

    D.d_id,

    E.e_id,

    F.g_id

    FROM table_G G

    INNER JOIN

    table_A A

    ON A.a_id = G.a_id and a.Created < @date

    INNER JOIN

    table_C C

    ON C.c_id = G.c_id

    LEFT JOIN

    table_B B

    ON B.a_id = A.a_id

    LEFT JOIN

    table_D D

    ON D.b_id = B.b_id AND D.c_id = G.c_id

    LEFT JOIN

    table_E E

    ON E.e_id = COALESCE(D.e_id, C.e_id)

    LEFT JOIN

    table_F F

    ON F.g_id = G.g_id

    WHERE

    F.g_id IS NULL

    In general, it is better to put conditions (where possible) into inline views instead of the where clause. That way, the number of rows is determined before the potentially billions of other rows introduced by the joins are considered. Also, if you're computing a value, do it in a variable before you put it in a join - it's a kazillion times faster. Another good way would be

    declare @date datetime

    set @date = round(cast(getdate() as float),0,1)

    SELECT

    A.a_id,

    C.c_id,

    B.b_id,

    D.d_id,

    E.e_id,

    F.g_id

    FROM table_G G

    INNER JOIN

    (select * from table_A where created < @date) A

    ON A.a_id = G.a_id

    INNER JOIN

    table_C C

    ON C.c_id = G.c_id

    LEFT JOIN

    table_B B

    ON B.a_id = A.a_id

    LEFT JOIN

    table_D D

    ON D.b_id = B.b_id AND D.c_id = G.c_id

    LEFT JOIN

    table_E E

    ON E.e_id = COALESCE(D.e_id, C.e_id)

    LEFT JOIN

    table_F F

    ON F.g_id = G.g_id

    WHERE

    F.g_id IS NULL

    if you don't like inline views.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply