December 1, 2009 at 3:31 am
I have a rather complex database for which I want to provide a global search option in a SQL Server 2008 database.
Basically I'd like to know what the best practices are for this.
I have tried two different approaches to indexing the tables
1. Index every nvarchar column
2. Create two extra columns for each indexed table ([IndexedContent] varbinary(MAX), [IndexedType] nvarchar(128)) - the former contains a concatenated string of all nvarchar columns, the latter contains '.html'. Then indexing [IndexedContent] with [IndexedType] as type column.
The search queries (stored procedures) look like this:
1. Indexing every nvarchar column
SELECT
'Client' AS [Type], [Clients].[ClientID] AS [ID], NULL AS [ParentID], [Clients].[Number],
([Clients].[GivenName] + ' ' + [Clients].[FamilyName]) AS [Name]
FROM
[Clients]
LEFT OUTER JOIN [Patients] ON [Clients].[ClientID] = [Patients].[ClientID]
WHERE
@Clients = 1 AND
(FREETEXT([Clients].*, @Query) OR FREETEXT([Patients].*, @Query))
UNION
SELECT
'Patient' AS [Type], [Patients].[PatientID] AS [ID], [Patients].[ClientID] AS [ParentID], [Patients].[Number],
([Patients].[Name] + ' ' + CASE WHEN [Breeds].[Name] IS NOT NULL THEN '(' + [Breeds].[Name] +')' ELSE CASE WHEN [Species].[Name] IS NOT NULL THEN '(' + [Species].[Name] + ')' ELSE '' END END) AS [Name]
FROM
[Patients]
LEFT OUTER JOIN [Clients] ON [Patients].[ClientID] = [Clients].[ClientID]
LEFT OUTER JOIN [Species] ON [Patients].[SpeciesID] = [Species].[SpeciesID]
LEFT OUTER JOIN [Breeds] ON [Patients].[BreedID] = [Breeds].[BreedID]
WHERE
@Patients = 1 AND
(FREETEXT([Patients].*, @Query) OR FREETEXT([Breeds].*, @Query) OR FREETEXT([Clients].*, @Query))
UNION
SELECT
'Referal' AS [Type], [Referals].[ReferalID] AS [ID], NULL AS [ParentID], NULL AS [Number],
([Referals].[Name] + CASE WHEN [Referals].[Contact] IS NOT NULL THEN ' (' + [Referals].[Contact] + ')' ELSE '' END) AS [Name]
FROM
[Referals]
WHERE
@Referals = 1 AND FREETEXT([Referals].*, @Query)
UNION
SELECT
'User' AS [Type], [UserID] AS [ID], NULL AS [ParentID], NULL AS [Number],
([GivenName] + ' ' + [FamilyName]) AS [Name]
FROM
[Users]
WHERE
@Users = 1 AND FREETEXT([Users].*, @Query)
ORDER BY
[Name]
2. Indexing one varbinary(MAX) column
SELECT
'Client' AS [Type], [Clients].[ClientID] AS [ID], NULL AS [ParentID], [Clients].[Number],
([Clients].[GivenName] + ' ' + [Clients].[FamilyName]) AS [Name]
FROM
[Clients]
LEFT OUTER JOIN [Patients] ON [Clients].[ClientID] = [Patients].[ClientID]
WHERE
@Clients = 1 AND
(FREETEXT([Clients].[IndexedContent], @Query) OR FREETEXT([Patients].[IndexedContent], @Query))
UNION
SELECT
'Patient' AS [Type], [Patients].[PatientID] AS [ID], [Patients].[ClientID] AS [ParentID], [Patients].[Number],
([Patients].[Name] + ' ' + CASE WHEN [Breeds].[Name] IS NOT NULL THEN '(' + [Breeds].[Name] +')' ELSE CASE WHEN [Species].[Name] IS NOT NULL THEN '(' + [Species].[Name] + ')' ELSE '' END END) AS [Name]
FROM
[Patients]
LEFT OUTER JOIN [Clients] ON [Patients].[ClientID] = [Clients].[ClientID]
LEFT OUTER JOIN [Species] ON [Patients].[SpeciesID] = [Species].[SpeciesID]
LEFT OUTER JOIN [Breeds] ON [Patients].[BreedID] = [Breeds].[BreedID]
WHERE
@Patients = 1 AND
(FREETEXT([Patients].[IndexedContent], @Query) OR FREETEXT([Clients].[IndexedContent], @Query))
UNION
SELECT
'Referal' AS [Type], [Referals].[ReferalID] AS [ID], NULL AS [ParentID], NULL AS [Number],
([Referals].[Name] + CASE WHEN [Referals].[Contact] IS NOT NULL THEN ' (' + [Referals].[Contact] + ')' ELSE '' END) AS [Name]
FROM
[Referals]
WHERE
@Referals = 1 AND FREETEXT([Referals].[IndexedContent], @Query)
UNION
SELECT
'User' AS [Type], [UserID] AS [ID], NULL AS [ParentID], NULL AS [Number],
([GivenName] + ' ' + [FamilyName]) AS [Name]
FROM
[Users]
WHERE
@Users = 1 AND FREETEXT([Users].[IndexedContent], @Query)
ORDER BY
[Name]
The perfomance for 2 is marginally faster than 1 in some cases.
The tables contain up to 50,000 rows each but I will be implementing search on another table [Journal] which will contain maybe up to 1,000,000 rows.
Presently a search takes anywhere between 15 seconds and a full minute on my development machine (Windows 7 Ultimate 64-bit, SQL Server 2008 SP1 64-bit, 2,33 GHz Core 2 Duo, 4 GB RAM).
I'm currently looking to optimize the database and the queries, not looking for suggestions about bigger hardware or moving the full-text catalog to a different server as that is not an option for the time being.
Any suggestions will be greatly appreciated.
December 2, 2009 at 10:30 am
I'm not sure that the full-text search is the main culprit for the lack of performance.
Can you eliminate the "UNION" and ORs from your code?
December 2, 2009 at 2:48 pm
Hi dmoldovan,
Thanks a billion for your suggestions.
Removing the UNIONs did nothing but removing the ORs and replacing them with two SELECT statements is an entirely different case: the execution time for the longest query drops down to 0.01 seconds!
Sometimes it's just the simple things.
Final code:
SET NOCOUNT ON
SELECT
'Client' AS [Type], [Clients].[ClientID] AS [ID], NULL AS [ParentID], [Clients].[Number],
([Clients].[GivenName] + ' ' + [Clients].[FamilyName]) AS [Name]
FROM
[Clients]
LEFT OUTER JOIN [Patients] ON [Clients].[ClientID] = [Patients].[ClientID]
WHERE
@Clients = 1 AND
FREETEXT([Clients].[IndexedContent], @Query)
UNION
SELECT
'Client' AS [Type], [Clients].[ClientID] AS [ID], NULL AS [ParentID], [Clients].[Number],
([Clients].[GivenName] + ' ' + [Clients].[FamilyName]) AS [Name]
FROM
[Clients]
LEFT OUTER JOIN [Patients] ON [Clients].[ClientID] = [Patients].[ClientID]
WHERE
@Clients = 1 AND
FREETEXT([Patients].[IndexedContent], @Query)
UNION
SELECT
'Patient' AS [Type], [Patients].[PatientID] AS [ID], [Patients].[ClientID] AS [ParentID], [Patients].[Number],
([Patients].[Name] + ' ' + CASE WHEN [Breeds].[Name] IS NOT NULL THEN '(' + [Breeds].[Name] +')' ELSE CASE WHEN [Species].[Name] IS NOT NULL THEN '(' + [Species].[Name] + ')' ELSE '' END END) AS [Name]
FROM
[Patients]
LEFT OUTER JOIN [Clients] ON [Patients].[ClientID] = [Clients].[ClientID]
LEFT OUTER JOIN [Species] ON [Patients].[SpeciesID] = [Species].[SpeciesID]
LEFT OUTER JOIN [Breeds] ON [Patients].[BreedID] = [Breeds].[BreedID]
WHERE
@Patients = 1 AND
FREETEXT([Patients].[IndexedContent], @Query)
UNION
SELECT
'Patient' AS [Type], [Patients].[PatientID] AS [ID], [Patients].[ClientID] AS [ParentID], [Patients].[Number],
([Patients].[Name] + ' ' + CASE WHEN [Breeds].[Name] IS NOT NULL THEN '(' + [Breeds].[Name] +')' ELSE CASE WHEN [Species].[Name] IS NOT NULL THEN '(' + [Species].[Name] + ')' ELSE '' END END) AS [Name]
FROM
[Patients]
LEFT OUTER JOIN [Clients] ON [Patients].[ClientID] = [Clients].[ClientID]
LEFT OUTER JOIN [Species] ON [Patients].[SpeciesID] = [Species].[SpeciesID]
LEFT OUTER JOIN [Breeds] ON [Patients].[BreedID] = [Breeds].[BreedID]
WHERE
@Patients = 1 AND
FREETEXT([Clients].[IndexedContent], @Query)
UNION
SELECT
'Referal' AS [Type], [Referals].[ReferalID] AS [ID], NULL AS [ParentID], NULL AS [Number],
([Referals].[Name] + CASE WHEN [Referals].[Contact] IS NOT NULL THEN ' (' + [Referals].[Contact] + ')' ELSE '' END) AS [Name]
FROM
[Referals]
WHERE
@Referals = 1 AND FREETEXT([Referals].[IndexedContent], @Query)
UNION
SELECT
'User' AS [Type], [UserID] AS [ID], NULL AS [ParentID], NULL AS [Number],
([GivenName] + ' ' + [FamilyName]) AS [Name]
FROM
[Users]
WHERE
@Users = 1 AND FREETEXT([Users].[IndexedContent], @Query)
ORDER BY
[Name]
Thanks again - you've saved my day
December 2, 2009 at 3:03 pm
You're welcome...
Now examine the actual execution plan and see if you can still improve the code and / or indexing. For example, UNION adds a "sort" step to the execution plan, as you can see here: http://www.sqlshare.com/media.aspx?vid=389&
However, watch out - better can be sometimes the enemy of the plain "good"...:-)
December 2, 2009 at 3:14 pm
My guess is that the code corresponds to a search form - if you give more details, you can get help if you need further changes...
December 2, 2009 at 3:36 pm
Indeed it's to be used on a search form where the user can select to search in clients, patients, referals and users (hence the @Client = 1 etc.)
I have removed the index columns and am now indexing on just the nvarchar columns but that poses some problems:
Is it possible to search int columns (Client number) as well - or do I need for instance BEGIN TRY CAST(@Query AS int)... for that?
I have split the clients names in two columns GivenName and FamilyName but this means that if a guy is called 'Jens Martin' 'Olson' I don't get all results with a query like 'Jens Olson'.
Also I have replaced the FREETEXT(...) with INNER JOIN CONTAINSTABLE which works pretty cool with ranking but using INNER JOIN FREETEXTTABLE totally kills the running time again.
December 2, 2009 at 3:37 pm
Similarly a query like '"Jens" OR "Martin" OR "Olsen"' also kill the running time completely.
December 2, 2009 at 3:51 pm
Examining the actual execution plan would be really helpful - it shows you what to change (indexing, table structure), if you need any change.
Please see
http://www.sommarskog.se/dyn-search-2005.html
for solutions to challenges very similar to yours.
In your case, I think I'd choose the dynamic SQL. I'd construct a query string depending on what the user chooses on the form. And yes, you can search the IDs...
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply