Parsing Parameters in a Stored Procedure

  • dbuendiab (8/10/2010)


    jcboyer-1091017 (8/10/2010)


    I have been using an XML string to achieve the same results. Not sure that it is faster, but it is quite flexible as:

EXEC sp_xml_preparedocument @idoc OUTPUT, @XMLParameters

insert into #T
select *
FROM OPENXML (@idoc, '/OrderGroup/Item', 1)
WITH #T

EXEC sp_xml_removedocument @idoc

In this example #T is a large table with over 20 columns. @XMLParameters is varchar(max) so as to pass as many rows as necessary at once, which is an added bonus, because there is still no table type for parameters. (I am using varchar instead of the XML type because XML is stricter and we do not need that here.)

    Comments on this approach are welcome.

    Jerome

    Hello Jerome!

As I have stated in other posts, XML was not an option for me (SQL2K)

    um... I have some xml stuff that's been running in sql2k for years.

-- parse the xml
EXEC sp_xml_prepareDocument @xmlTable OUTPUT, @fileContent;

-- insert the xml into the msgQueUTSin table
INSERT msgQueUTSin (fileName, dateQueued, externalSysUTS_id, spar06, spar07, spar08, spar09)
SELECT @currentFileName, GETDATE(), 1, *
FROM OPENXML (@xmlTable, '/workorders/workorder/part', 2)
WITH (
    spar06 varchar(50) '../priority',
    spar07 varchar(50) '../line',
    spar08 varchar(50) '@index',
    spar09 varchar(255) '@description'
);

    The xml looks like this:

<?xml version="1.0" encoding="UTF-8" standalone='yes'?>
<workorders>
  <workorder><priority>4</priority><line>0001</line><part index="myPartName" description="asdf CONNECTOR" /></workorder>
</workorders>
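For anyone wanting to try this end to end, here is a minimal, self-contained sketch of the same approach (variable and column names are illustrative; the sample XML is the one above, minus the declaration):

-- Minimal, self-contained OPENXML sketch (illustrative names only).
DECLARE @fileContent varchar(8000)
DECLARE @xmlTable int

SET @fileContent =
'<workorders>
  <workorder><priority>4</priority><line>0001</line><part index="myPartName" description="asdf CONNECTOR" /></workorder>
</workorders>'

-- Parse the document and get a handle to it.
EXEC sp_xml_preparedocument @xmlTable OUTPUT, @fileContent

-- Shred /workorders/workorder/part: attributes via @name,
-- parent elements via the ../ XPath step.
SELECT spar06, spar07, spar08, spar09
FROM OPENXML (@xmlTable, '/workorders/workorder/part', 2)
WITH (
    spar06 varchar(50)  '../priority',
    spar07 varchar(50)  '../line',
    spar08 varchar(50)  '@index',
    spar09 varchar(255) '@description'
)

-- Release the handle; prepared documents hold memory until removed.
EXEC sp_xml_removedocument @xmlTable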

  • I'm surprised that nobody has yet pointed to the excellent DelimitedSplit8K function, developed here on this web site. (Actually, I bet Jeff's getting ready to showcase it later tonight!)

    I used to advocate the XML version... until this blew it away.
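For readers who haven't met it, the version published on this site takes the string and a one-character delimiter and returns each element with its position; a typical call looks like this (assuming the function has been installed as dbo.DelimitedSplit8K with its published signature):

-- Assumes the community DelimitedSplit8K function exists in dbo
-- with its published signature (varchar(8000) string, char(1) delimiter).
SELECT split.ItemNumber, split.Item
FROM dbo.DelimitedSplit8K('TX,CA,WA', ',') AS split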

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Jay Hopping (8/10/2010)


dbuendiab (8/10/2010)


jcboyer-1091017 (8/10/2010)


[the XML exchange quoted in full in the post above]

um... I have some xml stuff that's been running in sql2k for years. [OPENXML example and sample XML quoted in full above]

    Hi, Jay:

You're right, I didn't remember OPENXML. Still, I would not switch to XML-formatted parameter passing, for several reasons - the main one being the way users like to get their information. They feel comfortable with a comma-separated argument. They pull external data into Excel, writing SQL as simple as:

Exec dbo.GenericSP 'TX,CA,WA', 'STB,RDSI'

to get a table with the data for TX, CA and WA restricted to the STB and RDSI services.

Why bother them with the nuisances of XML syntax? Of course they could cope, but... am I dealing with complex data structures, which XML admittedly handles well? If not, how do I tell them they have to open-tag, close-tag and well-form their data for something they can do with ordinary separators? I don't see major benefits and, as I said, XML was an exotic thing when I began using SQL, and it still is for many people.

(After saying that, let me add I have no personal fight against XML. I like CSS!!)

I think XML is good (although too verbose) for storing complex data structures. I agree it is fundamental to the development of open API architectures like Google's or Twitter's. And I can see SQL Server widening its scope in new versions by greatly developing its XML support. But otherwise I don't think of XML as the Holy Grail. Commas still have a place under the sun.
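The internals of GenericSP are not shown in the thread; a minimal sketch of how such a procedure could apply the article's delimiter-wrapping CHARINDEX technique (dbo.ServiceData and its state/service columns are assumed purely for illustration):

-- Hypothetical sketch only: dbo.ServiceData, state and service
-- are illustrative names, not from the original post.
CREATE PROCEDURE dbo.GenericSP
    @States   varchar(200),
    @Services varchar(200)
AS
BEGIN
    SET NOCOUNT ON

    -- Wrapping both the list and each value in commas prevents
    -- partial matches (e.g. 'A' matching inside 'CA').
    SELECT d.*
    FROM dbo.ServiceData AS d
    WHERE CHARINDEX(',' + d.state + ',', ',' + @States + ',') > 0
      AND CHARINDEX(',' + d.service + ',', ',' + @Services + ',') > 0
END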

  • jhood (8/10/2010)


I see what you are saying, Jeff, but would you want to do validation if the application did not allow the end user to send invalid data - perhaps by using a drop-down list?

    Kevin G. Boles

Did you find the link you were thinking about? I would love to read the massive discussion.

    Hi Kevin:

In the current scenario, people download data from the database using a stored procedure. They invoke it through Excel's External Data feature. The rationale for the lack of validation is this: if they misspell a state, they won't get data for it. They'll become aware of it soon, no doubt. That's why it's not important to flag the wrong code; the only relevant point is that wrong data won't bring the SP down.

For me, a major point is avoiding duplicates, as I use the resulting temp table in an INNER JOIN with the main data table.

I quite agree that XML is no panacea. In my shop we deal with what we call moving specs, and XML has proven very robust when adding items that were not planned for in advance.

The Excel data can be exported as XML... and the fact that it is verbose is also an asset, as users can see (if not necessarily comprehend) the data being passed.

In .NET, we have a function called Split that separates items based on a separator. There is a simple way to implement a CLR function, deployed as a DLL, that returns a table based on the split parameters; a sketch of the T-SQL side of such a registration follows.
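A minimal sketch of what registering such a CLR splitter looks like on the SQL Server side (the assembly, class, and method names here are hypothetical, and CLR integration must already be enabled on the instance):

-- Hypothetical registration of a CLR split function.
-- Assumes StringUtils.dll exposes a static SqlFunction Split()
-- with a matching FillRowMethodName and TableDefinition.
CREATE ASSEMBLY StringUtils
FROM 'C:\clr\StringUtils.dll'
WITH PERMISSION_SET = SAFE;
GO

CREATE FUNCTION dbo.clr_Split (@list nvarchar(max), @delimiter nchar(1))
RETURNS TABLE (Item nvarchar(4000))
AS EXTERNAL NAME StringUtils.[StringUtils.Splitter].Split;
GO

-- Usage: SELECT Item FROM dbo.clr_Split(N'TX,CA,WA', N',');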

This thread has proven quite useful; I did not realize that passing parameters would trigger such a nice discussion. Someone suggested that we use the most appropriate (elegant) strategy for passing 'table-like' parameters. I agree, as I see no easy one-size-fits-all solution.

  • dbuendiab (8/10/2010)


[inner exchange trimmed; quoted in full above]

In the current scenario, people download data from the database using a stored procedure... That's why it's not important to flag the wrong code; the only relevant point is that wrong data won't bring the SP down.

For me, a major point is avoiding duplicates, as I use the resulting temp table in an INNER JOIN with the main data table.

For me an important point is to protect the user and the data, and that lack of validation certainly does nothing to teach the user about his/her mistake nor to protect the data. The user could quite literally go forever without realizing that there's no data for a particular state they're querying, or that they've misspelled an abbreviation.

So... let me ask. If not revealing the wrong code (as you say) actually meant that YOU would miss out on an extra $20 in your paycheck, would you feel so cavalier about letting bad data into the parameter without making the user aware?

There's no such thing as "partially right". Either it's right or it's not. Not catching bad inputs is the wrong thing to do.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • dbuendiab (8/10/2010)


    I think XML is good (although too verbose) for storing complex data structures.

    I agree... it's way too verbose. It clogs the pipe with 6 to 16 times(1) more bytes for each file transfer, takes 6 to 16 times more bytes to store, and takes a whole lot more horsepower and time to shred. Files that used to be only 1GB in size have exploded to 6GB to 16GB for the stuff I receive.

CSVs and TSVs can be just as complex with virtually no overhead compared to XML, but they sometimes do require a little planning. That's why I think a lot of people like XML: no planning required.

    _________________

    (1) My personal experience over time. Your "mileage" will vary depending on the nature of your business and the types of data transfers you do.
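The size difference is easy to demonstrate on a toy example (the payloads below are made up, and real-world ratios will vary, as Jeff notes):

-- Toy comparison of the same three values as CSV and as XML markup.
DECLARE @csv varchar(100)
DECLARE @xml varchar(200)

SET @csv = 'TX,CA,WA'
SET @xml = '<states><state>TX</state><state>CA</state><state>WA</state></states>'

SELECT DATALENGTH(@csv) AS csv_bytes,  -- 8
       DATALENGTH(@xml) AS xml_bytes   -- 68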

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • dbuendiab (8/10/2010)


Performance, while sometimes a real question, is not the point here:

You know, it's really odd that you say that. It's really odd that ANYONE would say that. :blink: Check the forums on this site; they're full of posts by people asking how to improve the performance of their code. Considering how many people say that performance is "not a point here", I'm thinking some of them are having performance problems now. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jhood (8/10/2010)


I see what you are saying, Jeff, but would you want to do validation if the application did not allow the end user to send invalid data - perhaps by using a drop-down list?

What is to guarantee that the application is actually correct even if there's a drop-down list? Are you going to personally review and test that code? Even if the application is absolutely perfect, what is to guarantee that only that application will be used to feed the stored procedure? What if someone changes the application without figuring out the impact downstream? What if line noise or a dropped connection causes a partial or garbled transmission?

There are too many things that can go wrong between a "trusted" application and the database. If the data is worth having, it's worth validating. Heh... that means it's all worth validating or you wouldn't send it to the database to begin with. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

My apologies... I got out of work quite late tonight and only had time for some verbal replies. I'll try to get to the coded replies, which will include speed tests for many of the methods in this thread, including the one from the article. As Pappy says, "One test is worth a thousand expert opinions".

    Thanks for waiting.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

I tried to check the time taken by the suggested SQL on my machine.

I have a table containing 10404147 rows. I searched on a varchar column which is not part of any index, so a table scan always happened.

The structure of the table is as follows:

CREATE TABLE [dbo].[Answer](
    [AnsID] [int] IDENTITY(1,1) NOT NULL,
    [QuesID] [int] NOT NULL,
    [AnswerOption] [nvarchar](255) NOT NULL,
    [Correct] [bit] NOT NULL
) ON [PRIMARY]

The table has a CLUSTERED INDEX on QuesID and a NONCLUSTERED INDEX on AnsID.

    select * from answer

It returned in 2:27.020 min, producing 10404147 rows out of 10404147 total.

    select * from answer where answeroption = 'Answers options are here000010'

It returned in 0:22.127 min, producing 27000 rows out of 10404147 total.

declare @param varchar(100)
set @param = ',Answers options are here000010,Answers options are here000012,Answers options are here000014,Answers options are here000015,'

select * from answer where charindex(',' + AnswerOption + ',', @param) > 0

It returned in 2:12.542 min, producing 81000 rows out of 10404147 total.

In all the above cases, a table scan happened.

The tests were run on SQL Server 2005 Express installed on my desktop (Intel Core 2 Duo E8400 @ 3 GHz with 2GB RAM).

    Thanks,

    Nizam
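A side note on Nizam's third test: wrapping the column in an expression (charindex(',' + AnswerOption + ',', @param)) prevents index use even where an index on AnswerOption exists. A hedged alternative is to split the parameter first and then join; a sketch, assuming his Answer table, an index on AnswerOption, and a splitter such as the DelimitedSplit8K function mentioned earlier:

-- Sketch: split the search list once, then join against the table.
-- Assumes dbo.Answer from Nizam's post, an index on AnswerOption,
-- and the community DelimitedSplit8K splitter function.
CREATE TABLE #SearchValues (AnswerOption nvarchar(255) PRIMARY KEY)

INSERT INTO #SearchValues (AnswerOption)
SELECT Item
FROM dbo.DelimitedSplit8K('Answers options are here000010,Answers options are here000012', ',')

-- An equality join gives the optimizer a chance to seek
-- instead of scanning all 10 million rows.
SELECT a.*
FROM dbo.Answer AS a
     INNER JOIN #SearchValues AS s
             ON s.AnswerOption = a.AnswerOption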

Really a very nice article. But in the case of tables containing large amounts of data, I don't think it's good practice to use CHARINDEX; instead you can simply use a table-valued function which returns the desired result set:

CREATE FUNCTION [dbo].[udf_GetStringToTable] -- 'JJHJ,JJJJJJJJJ', ','
(
    @StringData VARCHAR(MAX),
    @Delineator VARCHAR(1)
)
RETURNS @ResultTable TABLE
(
    ColumnID VARCHAR(MAX)
)
AS
BEGIN
    DECLARE @ColumnID VARCHAR(MAX),
            @Pos INT

    -- Append a trailing delimiter so the loop below picks up the last item.
    SET @StringData = LTRIM(RTRIM(@StringData)) + @Delineator
    SET @Pos = CHARINDEX(@Delineator, @StringData, 1)

    IF REPLACE(@StringData, @Delineator, '') <> ''
    BEGIN
        WHILE @Pos > 0
        BEGIN
            SET @ColumnID = LTRIM(RTRIM(LEFT(@StringData, @Pos - 1)))

            IF @ColumnID <> ''
            BEGIN
                INSERT INTO @ResultTable (ColumnID)
                VALUES (@ColumnID)
            END

            -- Drop the consumed item and look for the next delimiter.
            SET @StringData = RIGHT(@StringData, LEN(@StringData) - @Pos)
            SET @Pos = CHARINDEX(@Delineator, @StringData, 1)
        END
    END

    RETURN
END
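A quick usage sketch for the function above (dbo.ServiceData and its state column are illustrative names, not from the original post):

-- Split a comma-delimited list into rows.
SELECT ColumnID
FROM dbo.udf_GetStringToTable('TX,CA,WA', ',')

-- Or join the split result to a data table instead of running
-- CHARINDEX against every row (dbo.ServiceData is hypothetical).
SELECT d.*
FROM dbo.ServiceData AS d
     INNER JOIN dbo.udf_GetStringToTable('TX,CA,WA', ',') AS p
             ON p.ColumnID = d.state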

  • CirquedeSQLeil (8/10/2010)


    Thanks for the article.

I think one must be careful about the string parsing method used. I don't believe there is a one-size-fits-all method for all scenarios. I have used a different version here and there depending on the needs (one was CTE-based, and another used XML).

As I was not trying to present a general-purpose technique, I must agree with you. This method is simple and (to me) elegant - but only if it fits your requirements.

For me an important point is to protect the user and the data, and that lack of validation certainly does nothing to teach the user about his/her mistake nor to protect the data. The user could quite literally go forever without realizing that there's no data for a particular state they're querying, or that they've misspelled an abbreviation.

There's no such thing as "partially right". Either it's right or it's not. Not catching bad inputs is the wrong thing to do.

Don't be angry with me, Jeff. Perhaps I wrote too hastily, or maybe with the bias of my particular environment. I work on several DSS databases. My users are very careful (jealous? distrustful? the pains of translation...) about the data they require. The data, moreover, are stable, so I assume that if they misspell something and do not get data, they'll be the first to realize it. Thus my main concern is that their errors do not bring my procedure down (as duplicated values would).

After this kind of apology, you'll see it is not difficult to implement a mechanism to detect the wrong values, as the following (ad-hoc) code shows:

-- Let's say we have these clearly misspelled parameters ----
declare @param varchar(200)
set @param = 'WA,CA,Not_A-Valid_Param,TX,XX'

-- Here is the reference table of valid values ----
declare @states table (
    state char(2)
)

-- Only an excerpt ----
insert @states values ( 'WA' )
insert @states values ( 'CA' )
insert @states values ( 'TX' )

-- Let's add delimiters at the beginning and end of the string ----
set @param = ',' + @param + ','

-- Now the piece of code to get the values ----
-- I added the orderby column to make the algorithm
-- sensitive to the input order (if it matters) ----
declare @param_table table (
    state char(2),
    orderby int
)

-- Keep only the valid values actually present in the input ----
insert @param_table
select state, charindex( ',' + state + ',', @param ) as orderby
from @states
where charindex( ',' + state + ',', @param ) > 0

-- Ad-hoc method to get the incorrect values: ----
-- suppress the valid ones from the input string ----
select @param = replace( @param, state, '' )
from @param_table
order by orderby

-- For the sake of comparison, I create what would be the valid string ----
declare @s varchar(800)
set @s = ''

select @s = @s + ',' + state
from @param_table
order by orderby

-- Get rid of repeated commas ----
while ( charindex( ',,', @param ) > 0 )
    set @param = replace( @param, ',,', ',' )

-- Good parameters to the right of the Father, the bad to the left ---
-- (Excuse me for the extra commas - quick and lazy programming after all) ----
select @s As GOOD_PARAM, @param AS BAD_PARAM
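For comparison, the same good/bad separation can be done declaratively with any splitter; a minimal sketch reusing udf_GetStringToTable from earlier in the thread and the @states table defined above (it must run in the same batch for the table variable to be in scope):

-- Sketch: classify each passed-in value by joining the split list
-- against the reference table of valid values.
select p.ColumnID as value,
       case when s.state is null then 'BAD' else 'GOOD' end as verdict
from dbo.udf_GetStringToTable('WA,CA,Not_A-Valid_Param,TX,XX', ',') as p
     left join @states as s
            on s.state = p.ColumnID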

Finally, I disagree with you about things being simply right or wrong. In my opinion, circumstances determine how good a solution is; what fits in some cases does not fit in others.

Nevertheless, your opinion is valuable to me, even if a little rough for my taste. Sorry if I offended you, anyway.

  • Jeff Moden (8/10/2010)


    dbuendiab (8/10/2010)


Performance, while sometimes a real question, is not the point here:

You know, it's really odd that you say that. It's really odd that ANYONE would say that. :blink: Check the forums on this site; they're full of posts by people asking how to improve the performance of their code. Considering how many people say that performance is "not a point here", I'm thinking some of them are having performance problems now. 😉

    Jeff:

At the outset I have to tell you that it's a bit intimidating (and exciting) for me to trade posts with you, when so far I had only admired and taken account of your writings.

I am a novice writer, but I have enough experience as a SQL developer... and I have never resigned myself to writing inefficient code.

One of the keys of my article was to highlight that SQL has unsuspected ways to escape procedural programming, which is sometimes hard to see when you start out.

But to the point: performance is important to me, but I tend to think it's worth working on when needed.
