How to get a query to sort in memory not disk ?

  • I'm running a very simple query:

    select col1 from tab1 where col2 between 'x' and 'y' order by col3

    on a table with 800,000 rows.

    The query returns 25000 rows but takes 200 seconds (case1).

    Col1 is defined as an NCHAR(768).

    If I rebuild the table (case 2) with col1 defined as NCHAR(32), then the query takes 2 seconds.

    In case1, the disk is pegged at 100% throughout the query.

    Why should a "long" column in the select list make this query so slow ?

    My best guess at the moment is that it is sorting on disk (case1) and

    in memory (case2).

    Is there a way to confirm whether it is sorting in memory or disk ?

    Is there a way to force the sort to happen in memory (2GB is available) ?

    I've tried both SQL Profiler and Performance Monitor so far.

     

     

     

     

     

     

  • >>Why should a "long" column in the select list make this query so slow ?

    Because a long nchar column drastically reduces the number of rows per data page, increases considerably the number of data pages required by the table and increases considerably the amount of disk I/O necessary to get the data into memory cache.

     

  • Here's an example of the impact, using temp tables, and just 10,000 rows. You can imagine what this scales to in your 800K sample. A SQL server data page is 8192 bytes, so take the number of reported pages and multiply by 8192 to detemine the disk IO cost to reading in an 800K row table with an nchar(768) column:

    Create table #test1

    (

      TestCol nchar(768)

    )

    Insert Into #test1

    Select top 10000 'Test Data'

    From master..syscomments as c1

    cross join master..syscomments c2

    Create table #test2

    (

      TestCol nchar(32)

    )

    Insert Into #test2

    Select top 10000 'Test Data'

    From master..syscomments as c1

    cross join master..syscomments c2

    dbcc checktable('#test1')

    dbcc checktable('#test2')

     

  • What you should do is check your indices...

    Are these the only columns in your table?  How distinct are the values in col3?  For example, you could index based on

    col3, then col2 then col1 - this will let SQL Server get the data in sorted order and then the next column of the index will let it easily perform your filter and the same index can return the column... Of course you could use the table's clustered index for this purpose if appropriate.

    Similarly you could also index based on

    col2, then col3 then col1 - this will let SQL perform your filtering very efficiently (betweens are great for index use - again clustered may be appropriate depending on other uses of the table).  Then it can sort efficiently.  Then the data is already in the index to return.

    If you are going to use a clustered index then you needn't include col1 in the index.

    The disk cost of reading your data would be a lot higher if you also had other columns in the table - having a separate index cuts this down significantly as only the index is examined...

    Could you post the query plan?

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply