SQL Tuning where no PK exists

  • Just asked to have this thread moved to a SQL Server 2005 forum.

  • GilaMonster (11/13/2009)


    Jeff Moden (11/13/2009)


    Gail, since this is 2k, do you think a partitioned view would help at all here? Or would it just muck things up?

    Haven't worked much with them; that's why I never recommend them. The couple of times I did, I had unfavourable results, but that may have been because of bad design. Basically it comes down to whether partition elimination can be done correctly.

    It's ironic... I've run into them a handful of times and each time I've also had unfavorable results. I check the partitioning constraint, etc, etc... they still don't seem to always work as advertised. Glad I'm not the only one having problems with partitioned views in 2k.
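
    For anyone following along, here's a bare-bones sketch of the kind of partitioned view we mean (all table and column names are made up for illustration). Elimination hinges on each member table carrying a trusted CHECK constraint on the partitioning column:

    -- Hypothetical member tables, one per year. The CHECK constraints are
    -- what let the optimizer skip member tables at plan time.
    CREATE TABLE dbo.Sales2008
    (
        SaleDate DATETIME NOT NULL
            CHECK (SaleDate >= '20080101' AND SaleDate < '20090101'),
        Amount   MONEY    NOT NULL
    );

    CREATE TABLE dbo.Sales2009
    (
        SaleDate DATETIME NOT NULL
            CHECK (SaleDate >= '20090101' AND SaleDate < '20100101'),
        Amount   MONEY    NOT NULL
    );
    GO

    -- The partitioned view is just a UNION ALL over the members.
    CREATE VIEW dbo.SalesAll
    AS
    SELECT SaleDate, Amount FROM dbo.Sales2008
    UNION ALL
    SELECT SaleDate, Amount FROM dbo.Sales2009;
    GO

    -- If elimination behaves, this plan should touch only dbo.Sales2009.
    SELECT SUM(Amount)
    FROM   dbo.SalesAll
    WHERE  SaleDate >= '20090601' AND SaleDate < '20090701';

    When it "doesn't work as advertised", the first things to check are whether the constraints are trusted (constraints enabled WITH NOCHECK won't be used for elimination) and whether the predicate is sargable against the partitioning column.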

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SQLNewbster (11/13/2009)


    Umm, sorry. I didn't realize this was the SQL 2k forum. I just joined when I posted this. Many apologies for that.

    NP... it's a common newbie mistake to post in the wrong forum.

    Since you're using 2k5, you could use a partitioned table instead of a partitioned view. I've not had to set up a partitioned table on anything but a relatively small (million row) test table because, until recently, I've been stuck in the 2k world... but it might be something that could help you a bit in this situation. Check it out in Books Online... there's a rough sketch below.
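
    Rough sketch of the 2k5 syntax (the names, boundary values, and filegroup mapping here are just placeholders):

    -- Partition function: carves an INT period key into ranges.
    CREATE PARTITION FUNCTION pfPeriod (INT)
    AS RANGE RIGHT FOR VALUES (200801, 200901, 201001);
    GO

    -- Partition scheme: maps each partition to a filegroup. ALL TO ([PRIMARY])
    -- keeps the sketch simple; production would usually spread filegroups.
    CREATE PARTITION SCHEME psPeriod
    AS PARTITION pfPeriod ALL TO ([PRIMARY]);
    GO

    -- The table is created straight onto the scheme; no UNION ALL view needed.
    CREATE TABLE dbo.BigTable
    (
        period_key INT          NOT NULL,
        payload    VARCHAR(100) NULL
    ) ON psPeriod (period_key);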

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I will take a look at table partitioning as a read, but I don't think that is going to help me out. Since the vendor owns the schema, I can only make suggestions on how to get better performance. I'm limited to getting the best performance I can from my queries..

    Thanks for your help..

  • SQLNewbster (11/13/2009)


    Umm, sorry. I didn't realize this was the SQL 2k forum. I just joined when I posted this. Many apologies for that.

    No worries. We just use the forum a question is in to determine which features we can and cannot use. I didn't notice the forum initially; when I was talking about includes, I just went on what you said. I only looked at the forum when Jeff posted.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Let me see if I have this right: you are dumping 120K relatively fat rows into a TEMPORARY TABLE then selecting those rows back out and it is taking .20 seconds and you want FASTER performance?? Unrealistic expectations IMHO.

    I note that the covering index you created has a very large percentage of the table in it. That's a lot of maintenance overhead if the data gets modified very often. Looks like something DTA or the missing-indexes subsystem created. 🙂
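
    If it did come from DTA, it's worth checking whether the wide columns are in the key when they only need to live at the leaf level. A narrower shape might look like this (I'm guessing at the non-key columns; only the three search columns appear in this thread):

    -- Keep the key to the search columns; INCLUDE pushes the rest to the
    -- leaf level only, so the B-tree stays shallow and writes stay cheaper.
    CREATE NONCLUSTERED INDEX IX_VendorTable_search
    ON dbo.VendorTable (company_no, year_for_period, period)
    INCLUDE (amount, account_no);  -- hypothetical: just what the SELECT returns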

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Here is the sequence of things that have occurred and/or a summary of the information:

    1. I am working with a vendor's backend system, using that data to do analysis. I am constrained by the vendor's data structures, etc...

    2. I noticed when I started querying the table in question, there was no PK. The table could feasibly have hundreds of millions of records.

    3. I am working to get a meaningful subset of data for the clients to do analysis without acquiring tons of data that isn't useful or needed.

    4. My initial query, where I filter down to a specific company_no, year_for_period, period, was conducting a table scan of the 8 million rows of data to return 120k of those rows..

    5. I created an index (nonclustered) initially on company_no, year_for_period, period and immediately saw a pickup. The 120k rows returned in about 8 seconds (likely cached).

    6. For my application (ADO recordset), I need to be able to create a unique key. In order to do that, I created an identity column. Now the full sequence (create temp table, insert rows, select those rows) takes about 24 seconds (see the sketch below).

    If this is likely the best I can expect and there are no tuning recommendations, I will have to be satisfied with that performance. I just wanted to make sure and get some more feedback from more advanced minds...
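
    A minimal sketch of steps 5 and 6 as described (the table name, filter values, and #subset column list are placeholders; only the three search columns come from the post):

    -- Step 5: nonclustered index on the filter columns.
    CREATE NONCLUSTERED INDEX IX_VendorTable_search
    ON dbo.VendorTable (company_no, year_for_period, period);
    GO

    -- Step 6: stage the subset with an IDENTITY column so the ADO recordset
    -- has a unique key. SELECT ... INTO can add the identity on the fly
    -- (assuming the source table has no identity column of its own).
    DECLARE @company_no INT, @year INT, @period INT;
    SELECT @company_no = 1, @year = 2009, @period = 10;  -- placeholder values

    SELECT IDENTITY(INT, 1, 1) AS row_key, t.*
    INTO   #subset
    FROM   dbo.VendorTable AS t
    WHERE  t.company_no = @company_no
    AND    t.year_for_period = @year
    AND    t.period = @period;

    SELECT * FROM #subset ORDER BY row_key;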

  • SQLNewbster (11/14/2009)


    Here is the sequence of things that have occurred and/or a summary of the information:

    1. I am working with a vendor's backend system, using that data to do analysis. I am constrained by the vendor's data structures, etc...

    2. I noticed when I started querying the table in question, there was no PK. The table could feasibly have hundreds of millions of records.

    3. I am working to get a meaningful subset of data for the clients to do analysis without acquiring tons of data that isn't useful or needed.

    4. My initial query, where I filter down to a specific company_no, year_for_period, period, was conducting a table scan of the 8 million rows of data to return 120k of those rows..

    5. I created an index (nonclustered) initially on company_no, year_for_period, period and immediately saw a pickup. The 120k rows returned in about 8 seconds (likely cached).

    6. For my application (ADO recordset), I need to be able to create a unique key. In order to do that, I created an identity column. Now the full sequence (create temp table, insert rows, select those rows) takes about 24 seconds.

    If this is likely the best I can expect and there are no tuning recommendations, I will have to be satisfied with that performance. I just wanted to make sure and get some more feedback from more advanced minds...

    A) The table scan plan is probably best, since 120K rows is 1.5% of the total rows. If you force an index seek I would expect more I/Os and higher cost (although sometimes that plan can actually be faster in actual runtime).

    B) Why do you need a PK if the data is for analysis only?

    C) Have you checked for network delays? 120K fat rows is a good bit of data and will take time to packetize, send, depacketize and convert into a client-side recordset. How long does it take to simply stream the data to SSMS grid/text output?

    D) As the table rowcount grows there will be a break point where the index seek/bookmark lookup becomes more efficient cost-wise, assuming the 120K output count doesn't grow correspondingly. This could eventually mean that a clustered index on the main search columns would be much more helpful. (See the sketch after this list.)
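
    To make C) and D) concrete, a quick sketch (column names are from your post; the table name and filter values are placeholders):

    -- C) Time the server side alone. If this is fast but the app is slow,
    -- the cost is in the network hop and the client-side recordset build.
    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    SELECT *
    FROM   dbo.VendorTable
    WHERE  company_no = 1 AND year_for_period = 2009 AND period = 10;

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;
    GO

    -- D) Past the break point, clustering on the search columns removes the
    -- bookmark lookups entirely (a big operation on a big table, and you'd
    -- need the vendor's blessing in this case).
    CREATE CLUSTERED INDEX CIX_VendorTable_search
    ON dbo.VendorTable (company_no, year_for_period, period);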

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
