Search query based on four separate parameters.

  • Hello all,

    After searching for possible solutions, I haven't found anything that provides me with what I'm after. I'm fairly new to SQL, so apologies for the newbie question. I have been given the task of providing a search tool (via SSRS) within a dataset that is comprised of a number of fields. The search tool itself will be based on four parameters: identifier, forename, surname and Date. Please see the original search code below, which is too restrictive. What I need, is a search query that is based on ALL the combinations possible from the four search parameters.

    How many possible combinations do you think there will be? Am I right in saying 15?

    Is it possible to rewrite the code that encapsulates all these combinations in a concise way?

    Any help would be greatly appreciated.

    ORIGINAL QUERY:

    SELECT Source_System, Identifier, Forename, Surname, Date etc....

    FROM tblTable

    WHERE

    (Identifier = @identifier

    OR Date = @date

    OR Forename LIKE '%'+@for+'%'

    OR Surname LIKE '%'+@sur+'%')

    SAMPLE DATASET

    USE [DATABASE]

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[AAATempTable1](

    [Identifier] [INT] NOT NULL,

    [Source_System] [VARCHAR](30) NOT NULL,

    [Date] [DATETIME] NOT NULL,

    [Forename] [VARCHAR](30) NOT NULL,

    [surname] [VARCHAR](30) NOT NULL,

    PRIMARY KEY CLUSTERED

    (

    [Identifier] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    USE [DATABASE]

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (1111, N'A', CAST(N'1798-02-01T00:00:00.000' AS DateTime), N'Dummy', N'Test')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (2222, N'A', CAST(N'1799-02-01T00:00:00.000' AS DateTime), N'Dummy1', N'Test1')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (3333, N'A', CAST(N'1800-02-01T00:00:00.000' AS DateTime), N'Dummy2', N'Test2')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (4444, N'B', CAST(N'1801-02-01T00:00:00.000' AS DateTime), N'Dummy', N'Test')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (5555, N'B', CAST(N'1799-02-01T00:00:00.000' AS DateTime), N'Dummy3', N'Test3')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (6666, N'B', CAST(N'1799-02-01T00:00:00.000' AS DateTime), N'Dummy4', N'Test4')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (7777, N'C', CAST(N'1804-02-01T00:00:00.000' AS DateTime), N'Dummy', N'Test')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (8888, N'C', CAST(N'1798-02-01T00:00:00.000' AS DateTime), N'Dummy5', N'Test5')

    GO

    INSERT [dbo].[AAATempTable1] ([Identifier], [Source_System], [Date], [Forename], [surname]) VALUES (9999, N'C', CAST(N'1798-02-01T00:00:00.000' AS DateTime), N'Dummy6', N'Test6')

    GO

  • this is well worth a read......http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Thanks for the quick response. I'll give that article a read now.

  • I noticed that the link posted uses 'sp_executesql'. Is there any other method that uses a process that is more secure? This search function is to be web based and requires a certain level of security.

  • mack53nico (12/1/2016)


    I noticed that the link posted uses 'sp_executesql'. Is there any other method that uses a process that is more secure? This search function is to be web based and requires a certain level of security.

    did you read all of the article?

    ......towards the bottom she says "Note that there’s no SQL injection vulnerability in this. The parameters are never concatenated into the string and the execution is parametrised."

    or use RECOMPILE

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I read everything including the link to 'dynamic search conditions' and then wrote the code. It was when it was ready to deploy that my boss pointed out that isn't a fan of that command, that it is insecure and easily hacked. I work within the health sector and the datasets contain a lot of sensitive information....so you can imagine the panic that ensued!

  • Might be a little confusion there. EXEC 'some query' is very easily hacked. What Gail teaches in that article is not.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Right I see what you mean. Looks like wires have been crossed somewhere along the line.

    Cheers for the advice.

  • mack53nico (12/1/2016)


    I noticed that the link posted uses 'sp_executesql'. Is there any other method that uses a process that is more secure?

    sp_executesql is not intricately insecure.

    my boss pointed out that isn't a fan of that command, that it is insecure and easily hacked.

    Your boss is mistaken.

    It's not sp_executesql that's insecure. It's the practice of concatenating untrusted user input into a string and executing it.

    And besides, if you're on SQL 2016, there's no need to use dynamic SQL for this problem. Scroll down to the bottom of that blog post and look for the (brief) mention of OPTION(RECOMPILE)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the feedback, I'll look into OPTION(RECOMPILE).

    He may be wrong, but I won't be the one to tell him that πŸ˜€

  • mack53nico (12/1/2016)


    He may be wrong, but I won't be the one to tell him that πŸ˜€

    You said you're new, but I strongly recommend you do tell him. His response will tell you a lot about your future.

    1. Do you really want to work for a manager that won't consider your expertise and/or can't admit when he is wrong/doesn't know about something? That is a big red flag for me.

    2. If he understands SQL Injection enough to be concerned about it, I expect he'll be able to understand how it is mitigated by the option presented. However, I am an optimist. πŸ™‚ It is also good for you to learn enough about it to be able to explain it to him.

    3. If he is a good manager, you will increase your value and respect in his eyes because he can trust you not to just turtle up when you see a problem. Most managers of technical teams have a subtle fear of what their staff are not telling them. After all, they have to explain to management when catastrophe does happen. I have been the go-to person for past managers because they know they'll get a straight answer from me.

    4. You will be happier with your final product because you'll know you did it right. There is an intangible benefit to building solutions you're proud of.

    Wes
    (A solid design is always preferable to a creative workaround)

  • Appreciate the advice, however working for the NHS you need for be VERY careful with data security. Confidentiality is king around here and any breach is taken very seriously. So I guess he's just being wary with regards to the possible risks.

    I will mention to him that nothing is being concatenated though and will see how that goes. πŸ™‚

  • Regarding the OPTION RECOMPILE:

    It would help only, when you change your WHERE to something as

    WHERE (ISNULL(@ForumName, '') <> '' AND tbo.ForumName LIKE @ForumName + '%')

    OR (ISNULL(@SurName, '') <> '' AND tbo.SurNameLIKE @SurName+ '%')

    OR ...

    -> You have to check for empty parameters so that the compiler could get rid of them.

    Another way would be to quadruble the select and put only the not-empty-parameter into the WHERE (and check on application side which query you have to use).

    Third way (particularly, when more than one filter parameter could exists):

    WITH q1 AS (SELECT pk_id FROM tbl WHERE col1 = @param1),

    q2 AS (SELECT pk_id FROM tbl WHERE col2 = @param2),

    q3 AS (SELECT pk_id FROM tbl WHERE col3 = @param3),

    q4 AS (SELECT pk_id FROM tbl WHERE col4 = @param4),

    cte_or AS ( SELECT pk_id FROM q1

    UNION SELECT pk_id FROM q2

    UNION SELECT pk_id FROM q3

    UNION SELECT pk_id FROM q4

    ),

    cte_and AS (

    SELECT pk_id FROM q1

    INNER JOIN q2 ON q2.pk_id = q1.pk_id

    INNER JOIN q3 ON q3.pk_id = q1.pk_id

    INNER JOIN q4 ON q4.pk_id = q1.pk_id

    )

    SELECT tbl.*

    FROM cte_and

    INNER JOIN tbl

    ON tbl.pk_id = cte_and.pk_id

    if the parameters should be applied with AND or OR you have to use cte_and or cte_or in the outer query. Benefit of this query type could be the indexing, if every parameter is selectiv and its very random which parameters are used.

    God is real, unless declared integer.

  • mack53nico (12/1/2016)Is it possible to rewrite the code that encapsulates all these combinations in a concise way?

    SELECT Source_System, Identifier, Forename, Surname, Date etc....

    FROM tblTable

    WHERE

    (Identifier = @identifier

    OR Date = @date

    OR Forename LIKE '%'+@for+'%'

    OR Surname LIKE '%'+@sur+'%')

    You are not too far off.

    WHERE

    (Identifier = @identifier OR @identifier IS NULL)

    AND ( Date = @date OR @date IS NULL)

    AND ( Forename LIKE '%'+@for+'%' )

    AND ( Surname LIKE '%'+@sur+'%')

    This assumes

    - @identifier and @date are non-string datatypes and cannot contain blanks.

    - Forename & Surname are not nullable and can contain blanks. If they are nullable, you'll need to add the null test logic depending on the different meaning you'd have in place for null vs blank.

    The biggest problem you'll run into with this type of WHERE clause is that it nearly always leads to a table scan. Although not ideal, table scans are not the end of the world. However, if the table you're filtering is also being joined to a number of other tables, your query may end up taking until the end of the world to return because the optimizer doesn't understand what you want. Even option (recompile) statements won't help because the WHERE clause is not sargable.

    In that scenario, I like to use a filter first approach. It will look something like this:

    SELECT pk

    INTO #primarytable

    FROM primarytable

    WHERE(Identifier = @identifier OR @identifier IS NULL)

    AND ( Date = @date OR @date IS NULL)

    AND( Forename LIKE '%'+@for+'%' )

    AND ( Surname LIKE '%'+@sur+'%')

    SELECT *

    FROM #primarytable AS keyfilter

    INNER JOIN primarytable AS pt

    ON pt.pk = keyfilter.pk

    INNER JOIN [all other tables]...

    WHERE [clauses not touching the primary table]...

    OPTION (RECOMPILE)

    The benefit is that your non-sargable query and table scan are targeted to a single table and the optimizer doesn't have to figure out a plan that works with all of the other tables you're joining. Using the Recompile in the second query allows the optimizer to create a plan appropriate for the number of rows that matched the primary tables filters.

    The join on PKs between the temp table and the primary table is a very high performance index seek and is almost guaranteed to be one of the first operations, which will reduce the number of rows that the rest of the plan operations have to touch. I've used this with great success when writing theses type of reporting queries from the TFS database (millions of rows, 10-20 joins, potentially dozens of filters).

    There are tradeoffs, such as forced tempdb usage and touching a single table multiple times, but I've found these to be acceptable over a slow running query that will likely spill to tempdb anyway.

    On a separate note, a concise query doesn't necessarily mean high performance. Many times, the quick way to write a SQL script generates slower code. Aim for performance and maintainability. I feel abbreviations (parameter/table/column name, etc) hurt maintainability. They generally make maintenance harder because the next developer has to interpret your abbreviation. I would use @forename and @surname in your example.

    Wes
    (A solid design is always preferable to a creative workaround)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply