Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Get your favorite SSC scripts directly in SSMS with the free SQL Scripts addin. Search for scripts directly from SSMS, and instantly access any saved scripts in your SSC briefcase from the favorites tab.
Download now (direct download link)

A google like search

By Jean Bulinckx,

 The problem:

We must implement a google like search over a existing database. The database contains several tables for persons, corporations, organizations, addresses, etc.
It must be a custom solution (no thirdy-party solutions neither SQL FTS).
Its for a new web application (dozen to hundreds simultaneous users, response time under 1s for most cases, 5s for worst case).
Strategy:
We ill create a single table (registers) to point to the existing tables. We ill need also a table of tags and a relationship table (registers x tags).
A batch ill scan the relevant tables (persons, organizations, addresses, etc) and populate the tables (registers, tags, tag_register) once a day.
Rules:
The search ill be performed in all registers at once. 
A "register" is a abstraction for a person a organization or any other object.
The object's data lie in a "main" table (persons table, organizations table, etc)
and in many "members" tables (addresses, contacts, profile, etc).
A "tag" is a word wich ocurrs in any of the table's text fields.
In a search for tag1, tag2 and tag3
If exists: tag1 AND tag2-100 ocurrences, tag1 AND tag3-10 ocurrences
and there's no ocurrences of tag2 and tag3 for the any register we ignore tag3
since this returns mores results than ignoring tag2.
Some objects share many properties and the search can use some properties to filtering and/or ordering,
in this example we ill simple use the name property to populate a column (vc_register) to be used for ordering.
Conclusion:
Below is the script to create the tables and procedure and populate the database with dummy data.
The procedure performs the search in a reasonable time and was implementd as a conceptual proof.
We gave a try to table variables and creating/droping indexes in temporary tables but we found no indexed temporary tables worked better.
Also the FK constraints increase performance is better than we can have guessed.
Increasing the number of tags dont impact in performance, increasing the number of registers dont impact directly in performance.
If a single tag is present in millions of registers returning all matches for it can slow down the response time.
The response time is dicted by the number os comparasions to find the intersection between to matches (join each tag search result).
Chosing wisely the words from the database to populate the tags table ill impact directly in performance.
Words like "the", "and", "a" must be avoided (low meaning, high ocurrence)
We ill must keep a eye in the most common used tags and the tags with the highest ocurrences.

Total article views: 3374 | Views in the last 30 days: 27
 
Related Articles
FORUM

Searching complete database

Searching complete database

FORUM

Database Performance

Database Performance

FORUM

Database Performance

How to check database performance

FORUM

Using LIKE for text search

Explanation needed with text search performance

FORUM

Database performance issue

Database performance issue

Tags
clustered    
fnord    
google    
indexing    
performance tuning    
procedure    
search    
temporary tables    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones