Search billions of records from multiple tables

  • I am looking for a solution which can satisfy the following conditions

    1. If you are looking for search operation in multiple tables and each table contains millions of records

    2. Sql Server 2008 is the RDBMS

    3. You are not allowed to to use Full Text Indesing Service because of Azure limitations

    4. Then What will be the best of the best practice

  • Sounds like a homework question to me. Have you Googled your questions yet? Or read SQL Server Books Online?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Yes we are planning to build up a multi tenant application that has millions/billions of records. To avoid the application to crash after a month, I am looking for a best practice/pattern. Your help in this regard will be highly appreciated.

    I have already implemented searching mechanism in one application but that is based on Full text Indexing Service and that run successfully over the years but unfortunately it's a limitation in this project.

  • How is full text searching a limitation if you are allowed to use it?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • "You are not allowed to to use Full Text Indesing Service because of Azure limitations" apologies for skipping not

  • You don't need Full Text Indexing necessarily to do a search. If your indexes and statistics are good enough, then you can search billions of records without a problem.

    The problem is, there is no best practice as a whole. It is entirely dependent on your schema and your specific situation.

    Best advice I have right now, gather your business requirements, sit down with your users and discuss what else besides this search that they need, then create a logical model of the database. Once you have a good idea of what is needed, you can build the physical model with an eye towards what the business uses for the database will be.

    EDIT: And if none of what I said makes sense, it's time to bring in an outside consultant who can help you with this. This is too large of a thing to easily condense down into a few posts on a forum.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • I know that this depends on DB Design. We are already in development phase and I know we can use UNION statements for combining results from different queries, but I have doubt whether It will be efficient enough to return results in anticipated time.

  • Kausar Mehmood (4/13/2012)


    I know that this depends on DB Design. We are already in development phase and I know we can use UNION statements for combining results from different queries, but I have doubt whether It will be efficient enough to return results in anticipated time.

    Options include Temp tables, Common Table Expressions, Join statements, staging tables, partitioned data, Views...

    I'm sure there are other options that I can't name off the top of my head. The problem is that there is no right answer to what you're asking. All you can do is experiment.

    Consider throwing the data into a denormalized data warehouse (pulled from your main database) and having reporting come from that instead of your OLTP db.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • What i have concluded right now after some R&D is that Partition table can solve the issue. It will not only ease Manageability issue but also improve performance. What do you think?????

  • Kausar Mehmood (4/16/2012)


    What i have concluded right now after some R&D is that Partition table can solve the issue. It will not only ease Manageability issue but also improve performance. What do you think?????

    If your R&D said it would work, I say go for it.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • What's your experience say's, will this do the job? Doing something and reading something is altogether different job. I will use CTE, UNION and Paging along with Table Partitioning. But I doubt Table partition is supported in Standard Edition.

    Many thanks in advance.

  • My experience says that the best solution is entirely dependent on your database schema. As I don't know what your schema is, and I don't know anything about your data, I can't tell you exactly what to do. I'm sorry.

    If it worked for you in R&D, then give it a shot in production. What I tend to do, though, is to test multiple methods in Dev, then pick the best performing one. Because what performs well one time does not perform well in a slight different scenario.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Kausar Mehmood (4/16/2012)


    What's your experience say's, will this do the job? Doing something and reading something is altogether different job. I will use CTE, UNION and Paging along with Table Partitioning. But I doubt Table partition is supported in Standard Edition.

    Many thanks in advance.

    You can not create and dynamically manage partitioned tables in Standard Edition, you need the Enterprise Edition for that.

  • Ok I can share some information i.e I may fetch records from upto tables. The query may contains statements like UNION. e.g. ArtifactLibrary(ArtifactId,Title,CreationDate,....etc), Users(UserId,FirstName,LastName,etc...),

    Groups(GroupId,GroupName,etc...)

    ArtifactLibrary may contains records around 10Million, Users may have 1 million, Group around 100K. I have to search the tables for different criteria say ArtifactTitle, UserFirstName,GroupName etc. Can you please guide me in this regards.

    Best Regards

  • You may sit and consider how you are going to search your data. based on that ,create proper indexes and stats and that should reduce your problem to half.

    Then you may go and consider other things like partitioning, additional memory...

    Start from basic design and then move to complex things untill you reach at an acceptable point in terms of performance.

    You may not get everything perfect at the very beginning.

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply