General Search/SQL Performance question

  • Sorry, just realised this post should probably be in the SQL 2005 forum (if anywhere!). Interesting discussion topic, perhaps!

    I've been an architect/developer professionally for a few years now since leaving university, but I've been coding for many years. I have one question that I'm trying to understand; what I want might be possible in SQL, but it might not. By the way, I'm definitely not a beginner!

    I have a smallish SQL Server 2005 database with 25 tables and a solid DB design. My two main tables have around 500,000 rows each, which seems fair. It's powering an ASP.NET front end, but that's largely irrelevant. Part of the front end lets the user search the data in the database and get back a ranked list, etc. There are 8 criteria in the search, one of which is postcode (and it calculates the distance from it). It's also using a simple full-text search on two fields. When running on a quad Xeon with 8GB RAM it takes around 3 seconds per search, which is pretty decent on its own, but when there are 100 people searching at a time it's unacceptable.

    Now there are quite a few websites, such as DatingDirect, that can search on 15 criteria... calculate distances... rank... etc. in 0.05 seconds, and they're querying literally millions of rows.

    Is this sort of performance even achievable using something like SQL Server?! If not, how are these companies achieving such performance?

    Forgive me if this isn't the place to have this sort of discussion!

    Regards,

    Stu

  • Yes, it is possible in SQL Server. For example, I run an SP that takes 3 parameters and produces the 5 best matches for the parameters provided; this SP searches 6 tables and returns its result in less than a second. Broadly speaking, it depends upon the indexes on the tables and fine-tuning the query or SP again and again until you get the result that you need, and then maintaining it from there 🙂
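
    Just to illustrate the sort of thing I mean, here is a stripped-down sketch of such an SP (the table and column names are made up, not from any real system):

    -- Hypothetical example only: a procedure that takes three search
    -- parameters and returns the five best matches by a simple rule.
    CREATE PROCEDURE dbo.usp_FindBestMatches
        @City   varchar(50),
        @MinAge int,
        @MaxAge int
    AS
    BEGIN
        SET NOCOUNT ON;

        SELECT TOP (5) p.PersonID, p.DisplayName, p.City, p.Age
        FROM dbo.Person AS p
        WHERE p.City = @City                        -- supported by an index on City
          AND p.Age BETWEEN @MinAge AND @MaxAge     -- and on Age
        ORDER BY p.LastActiveDate DESC;             -- crude "best match" ordering
    END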

  • I might redesign the query and get back to people. I was just wondering if large enterprises use anything different!

  • There is not one particular answer for this; there are many ways you can do it, and it's not only the SQL that you have to check. You have to look into the table structure, server, disk, indexes, etc., and then tune them to get the desired result.

  • I'm happy with the table structure, and the server. I'm not overly informed on index tuning to the max, but from what I know it shouldn't be a massive problem!

  • 500,000 rows is not a huge number in SQL by any means, and I am sure that you could get quicker results from your query, though this does depend on a number of factors that have already been mentioned.

    A good place to start is to have a look at the query that is causing concern and look for any inefficient code. You say that you are doing calculations; depending on how these are being handled, there could be room for improvement.

    Also have a look at your execution plans to see where the query is spending a lot of time. This could lead to rewriting bits of the code or creating new indexes.

    If possible, post some sample data and the offending query here, and I am sure someone will be able to give some advice.
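
    If it helps, a quick way to see where the time goes is to switch the statistics output on before running the slow query (or press Ctrl+M in Management Studio to include the actual execution plan):

    -- Show logical reads and CPU/elapsed time for each statement that follows.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- ... run the search query here ...

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;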

  • You really need to get the execution plans to understand where your bottlenecks are. It might be indexes, it might be T-SQL, or it might be your structure. Remember, the best structure for supporting your OLTP system is not necessarily the best structure for supporting reporting. A lot of the bigger web sites you're referencing have specialized reporting structures (databases and systems focused on delivering that data as fast as possible) and a different set of structures and processes for collecting data.

    From the sounds of your criteria, I don't think you're big enough to have to go that far yet.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

    Yeah, it's definitely a small database as far as SQL Server is concerned. The reason the query might be taking so long is that it needs to join across 10 tables to get the details. Might an indexed view help?

    It also has two full-text searches, paging (which means a nested select to use the rank function?), a CASE expression for the ordering (would it be better to duplicate the query and have this hard-coded?), and it calculates the distance using Pythagoras' theorem in a C# UDF; a rough sketch of the shape is below. Doing a straight count on the query takes no time at all; I feel it's the complicated WHERE clause(s) and the values being returned that cost the time.

    Maybe it needs a redesign!
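
    For what it's worth, the rough shape of the query is something like this (heavily simplified, and all of the names are changed):

    -- Simplified, hypothetical version of the search (real names changed).
    -- Paging is done with ROW_NUMBER() in a derived table (SQL 2005 style),
    -- the sort column is chosen with CASE expressions, and the distance is a
    -- straight-line (Pythagoras) calculation on grid coordinates.
    -- (The real query calls a C# UDF for the distance; it is inlined here
    -- only to keep the sketch self-contained.)
    DECLARE @Easting int, @Northing int, @SortBy varchar(20), @PageStart int, @PageEnd int;
    SELECT @Easting = 530000, @Northing = 180000, @SortBy = 'distance', @PageStart = 1, @PageEnd = 20;

    SELECT MemberID, DisplayName, Distance
    FROM (
        SELECT  m.MemberID,
                m.DisplayName,
                SQRT(SQUARE(m.Easting - @Easting)
                   + SQUARE(m.Northing - @Northing)) AS Distance,
                ROW_NUMBER() OVER (
                    ORDER BY
                        CASE WHEN @SortBy = 'distance'
                             THEN SQRT(SQUARE(m.Easting - @Easting)
                                     + SQUARE(m.Northing - @Northing)) END,
                        CASE WHEN @SortBy = 'newest' THEN m.JoinedDate END DESC
                ) AS RowNum
        FROM dbo.Member AS m
        WHERE m.IsActive = 1     -- plus the other search criteria
    ) AS ranked
    WHERE ranked.RowNum BETWEEN @PageStart AND @PageEnd
    ORDER BY ranked.RowNum;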

  • Incidentally, the FTS part of the query is 45% of the entire thing (:w00t: )

  • Maybe someone can shed some light on this. I'm joining on FREETEXTTABLE and CONTAINSTABLE in my query. This all runs smoothly enough... however, there's a massive spike (from 1 second to 11 seconds) when I add to the WHERE clause something like:

    where coalesce(ct1.rank,0) *5 + coalesce(ct2.rank,0) *1 > 0

    I'm left joining on the full-text result tables, which is fine, because I want to rank the search based on the rank from ct1 with a weight of 5 and the rank from ct2 with a weight of 1. ct1 and ct2 are looking at different fields of the same table, by the way.

    I'm guessing, therefore, that if I could do this all in one full-text search query then the whole thing would fly. Maybe I should be putting my attention on that. I've looked at my indexes and I'm happy that they're pretty good, so that isn't the problem. As I said, it runs in 1 second when I don't have the rank calculation in the WHERE clause!
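
    To make it concrete, the relevant part looks roughly like this (simplified again, with invented table and column names; it assumes a full-text index exists on dbo.Member(Headline, ProfileText)):

    -- Illustration of the pattern described above, not the real query.
    -- Two full-text lookups against different columns of the same table are
    -- LEFT JOINed, and the WHERE keeps a row if either one matched,
    -- weighting the two ranks 5:1.
    DECLARE @SearchText varchar(100);
    SET @SearchText = 'music';   -- placeholder search term

    SELECT m.MemberID,
           COALESCE(ct1.[RANK], 0) * 5 + COALESCE(ct2.[RANK], 0) * 1 AS SearchRank
    FROM dbo.Member AS m
    LEFT JOIN FREETEXTTABLE(dbo.Member, Headline, @SearchText) AS ct1
           ON ct1.[KEY] = m.MemberID
    LEFT JOIN CONTAINSTABLE(dbo.Member, ProfileText, @SearchText) AS ct2
           ON ct2.[KEY] = m.MemberID
    WHERE COALESCE(ct1.[RANK], 0) * 5 + COALESCE(ct2.[RANK], 0) * 1 > 0   -- the expensive bit
    ORDER BY SearchRank DESC;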

  • The problem is that you are doing the calculation on the left side of the comparison; this leads to a non-sargable query that the optimizer will struggle with.

    http://groups.google.com/group/huyuhui/web/a-term-sargable

    There may be an opportunity to rewrite this logic and speed the query up.
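
    As a general illustration of the sargable idea (nothing to do with your exact tables, just a classic example against a hypothetical Orders table):

    -- Non-sargable: the function wrapped around the column hides it from any
    -- index on OrderDate, so the optimizer has to scan.
    SELECT OrderID
    FROM dbo.Orders
    WHERE YEAR(OrderDate) = 2008;

    -- Sargable rewrite: the bare column is compared to a range, so an index
    -- on OrderDate can be used for a seek.
    SELECT OrderID
    FROM dbo.Orders
    WHERE OrderDate >= '20080101'
      AND OrderDate <  '20090101';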

  • Could you supply us with the query plan and/or the actual query, and any indexes you have on the tables in use?

    This way we can see what is being used and why and help you from there 🙂

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices
    Numbers / Tally Tables

    SQL-4-Life
  • Just wondering if you could use ISNULL instead of COALESCE, as I've seen several posts indicating it's faster, and perhaps with 500,000 records that might add up? Also, perhaps you could calculate the rank with ISNULL in the SELECT and use TOP n instead of filtering on it in the WHERE clause?
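
    Something along these lines is what I mean (untested, and the table/column names are only guessed from your description):

    -- Rough idea only: compute the weighted rank once in the SELECT with
    -- ISNULL and take the TOP n rows ordered by it, instead of filtering on
    -- the calculation in the WHERE clause. Rows with no full-text match just
    -- get a rank of 0 and sort to the bottom.
    DECLARE @SearchText varchar(100);
    SET @SearchText = 'music';   -- placeholder search term

    SELECT TOP (100)
           m.MemberID,
           ISNULL(ct1.[RANK], 0) * 5 + ISNULL(ct2.[RANK], 0) * 1 AS SearchRank
    FROM dbo.Member AS m
    LEFT JOIN FREETEXTTABLE(dbo.Member, Headline, @SearchText) AS ct1
           ON ct1.[KEY] = m.MemberID
    LEFT JOIN CONTAINSTABLE(dbo.Member, ProfileText, @SearchText) AS ct2
           ON ct2.[KEY] = m.MemberID
    ORDER BY SearchRank DESC;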

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Using COALESCE or ISNULL in the WHERE clause usually prevents SQL Server from using an index for the query, if one exists. Forcing the query to perform only table or index scans slows things down.
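
    For example (a sketch only, reusing the invented names from earlier in the thread, and assuming a matched row never comes back with a rank of exactly zero), the "matched either search" test can be written without wrapping the columns in a function at all:

    -- Test whether either full-text join actually found the row; the weighted
    -- rank is still available for ordering the results.
    DECLARE @SearchText varchar(100);
    SET @SearchText = 'music';   -- placeholder search term

    SELECT m.MemberID,
           COALESCE(ct1.[RANK], 0) * 5 + COALESCE(ct2.[RANK], 0) * 1 AS SearchRank
    FROM dbo.Member AS m
    LEFT JOIN FREETEXTTABLE(dbo.Member, Headline, @SearchText) AS ct1
           ON ct1.[KEY] = m.MemberID
    LEFT JOIN CONTAINSTABLE(dbo.Member, ProfileText, @SearchText) AS ct2
           ON ct2.[KEY] = m.MemberID
    WHERE ct1.[KEY] IS NOT NULL
       OR ct2.[KEY] IS NOT NULL
    ORDER BY SearchRank DESC;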

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning
