CHARINDEX Not Finding String

  • EDIT: Removed previous quote as I quoted the wrong one...

    DBAgal (10/3/2011)


    jared-709193 (10/3/2011)


    Also, SQL Server will return the error you are getting if even 1 row returns an invalid argument for the substring. So, if one of your strings does not contain '</t2>' and you try to subtract from 0, that will give an invalid length parameter.

    Try this:

    SELECT CHARINDEX('</t2>', data)

    FROM TableName

    WHERE CHARINDEX('</t2>', data) < 4

    If any data is returned, that is your first problem.

    Thanks,

    Jared

    Rows are returned for the query, but I verified that their is indeed a </t2> tag. Hmmm...

    The presence of the tag is not the issue then, it is that when you subtract 4 or CHARINDEX('<t2>',data) from it, it returns an invalid parameter for the substring. Try writing a CASE statement to look for CHARINDEX('</t2>',data) < 4 and handle that separately.

    Thanks,

    Jared

    Jared
    CE - Microsoft

  • Lowell (10/3/2011)


    it's pretty well documented that SQL will keep data for a TEXT datatype in a varchar(8000) as long as it's less than 8000 chars, otherwise it's stored a different way,and then cannot be accessed directly...you have to use textpointers and that makes it a lot harder to fiddle with;

    so some rows are shorter than 8000 chars and can be searched via substrings, and the longer rows cannot.

    add a DATALENGTH(Data) to your results, and see which how large the results are when you are successful...i'm sure they are under 8K in characters

    converting to varchar(max) lets you get them all.

    Am I mistaken that all text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data will be put onto their own page irrespective of length?

    http://msdn.microsoft.com/en-us/library/ms190969.aspx

    Thanks,

    Jared

    Jared
    CE - Microsoft

  • DROP TABLE #Tablename

    CREATE TABLE #Tablename (ID INT, [Data] text)

    INSERT INTO #Tablename (ID, [Data])

    SELECT 1, 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>'

    union all

    SELECT 2, 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry<t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>'

    -- this will return all of the string following '<t2>' if no closing '</t2>' is found

    -- 8000 assumes VARCHAR(8000) but appears to work with text datatype.

    --In practice you might want to use a sensible figure to capture the string you want without too much trailing rubbish

    SELECT id,

    x.Startpos, -- for show

    x.Endpos, -- for show

    [Length] = ISNULL(NULLIF(x.Endpos,0),8000)-x.Startpos, -- for show

    SUBSTRING(Data, x.Startpos, ISNULL(NULLIF(x.Endpos,0),8000)-x.Startpos)

    FROM #Tablename

    CROSS APPLY (SELECT startpos = 4+CHARINDEX('<t2>', Data,1), endpos = CHARINDEX('</t2>', Data,1)) x


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • PEOPLE!

    If this query:

    SELECT CHARINDEX('</t2>', data)

    FROM TableName

    WHERE CHARINDEX('</t2>', data) < 4

    returns ANY data, (which it has already stated it does) the substring will fail every time on the data set because of the -4. Either put in a case statement, or if this should not be possible in your data, you must identify these rows and fix it.

    This is not about data type or length.

    Jared

    Jared
    CE - Microsoft

  • jared-709193 (10/3/2011)


    PEOPLE!

    If this query:

    SELECT CHARINDEX('</t2>', data)

    FROM TableName

    WHERE CHARINDEX('</t2>', data) < 4

    returns ANY data, (which it has already stated it does) the substring will fail every time on the data set because of the -4. Either put in a case statement, or if this should not be possible in your data, you must identify these rows and fix it.

    This is not about data type or length.

    Jared

    The substring doesn't fail in my query even if the start and end markers don't exist in the string - however, rows where there are no marers should still be excluded:

    DROP TABLE #Tablename

    CREATE TABLE #Tablename (ID INT, [Data] text)

    INSERT INTO #Tablename (ID, [Data])

    SELECT 1, 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>'

    union all

    SELECT 2, 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry<t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>'

    union all

    SELECT 3,'<t2>'

    UNION ALL

    SELECT 4,'No marker'

    -- this will return all of the string following '<t2>' if no closing '</t2>' is found

    -- 8000 assumes VARCHAR(8000) but appears to work with text datatype.

    --In practice you might want to use a sensible figure to capture the string you want without too much trailing rubbish

    SELECT id,

    x.Startpos, -- for show

    x.Endpos, -- for show

    [Length] = ISNULL(NULLIF(x.Endpos,0),8000)-x.Startpos, -- for show

    SUBSTRING(Data, x.Startpos, ISNULL(NULLIF(x.Endpos,0),8000)-x.Startpos)

    FROM #Tablename

    CROSS APPLY (SELECT startpos = 4+CHARINDEX('<t2>', Data,1), endpos = CHARINDEX('</t2>', Data,1)) x

    WHERE x.Startpos > 4


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (10/3/2011)


    jared-709193 (10/3/2011)


    PEOPLE!

    If this query:

    SELECT CHARINDEX('</t2>', data)

    FROM TableName

    WHERE CHARINDEX('</t2>', data) < 4

    returns ANY data, (which it has already stated it does) the substring will fail every time on the data set because of the -4. Either put in a case statement, or if this should not be possible in your data, you must identify these rows and fix it.

    This is not about data type or length.

    Jared

    The substring doesn't fail in my query even if the start and end markers don't exist in the string - however, rows where there are no marers should still be excluded:

    DROP TABLE #Tablename

    CREATE TABLE #Tablename (ID INT, [Data] text)

    INSERT INTO #Tablename (ID, [Data])

    SELECT 1, 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>'

    union all

    SELECT 2, 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry<t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>'

    union all

    SELECT 3,'<t2>'

    UNION ALL

    SELECT 4,'No marker'

    -- this will return all of the string following '<t2>' if no closing '</t2>' is found

    -- 8000 assumes VARCHAR(8000) but appears to work with text datatype.

    --In practice you might want to use a sensible figure to capture the string you want without too much trailing rubbish

    SELECT id,

    x.Startpos, -- for show

    x.Endpos, -- for show

    [Length] = ISNULL(NULLIF(x.Endpos,0),8000)-x.Startpos, -- for show

    SUBSTRING(Data, x.Startpos, ISNULL(NULLIF(x.Endpos,0),8000)-x.Startpos)

    FROM #Tablename

    CROSS APPLY (SELECT startpos = 4+CHARINDEX('<t2>', Data,1), endpos = CHARINDEX('</t2>', Data,1)) x

    WHERE x.Startpos > 4

    Yes, and depending on the requirements and data, these 2 queries may be more efficient or not:

    SELECT id,

    CASE

    WHEN CHARINDEX('</t2>', data) < 4 THEN ''

    ELSE SUBSTRING(data, CHARINDEX('<t2>', data)+4, (CHARINDEX('</t2>', data)- CHARINDEX('<t2>', data))-4)

    END

    FROM #Tablename

    SELECT id,SUBSTRING(data, CHARINDEX('<t2>', data)+4, (CHARINDEX('</t2>', data)- CHARINDEX('<t2>', data))-4)

    FROM #Tablename

    WHERE CHARINDEX('</t2>', data) >= 4

    My point is that this thread (including myself) was ignoring the information that was given to us in the form of an error. It was known that an invalid parameter was being passed to the substring function. The first step we should have taken was to determine if it was the data. A good start would have been to break up the substring first into its values:

    SELECT id,

    CHARINDEX('<t2>', Data)+4, CHARINDEX('</t2>', Data)- CHARINDEX('<t2>', Data))-4

    FROM Tablename

    Now with this query we can quickly scan the results for any values that would make our substring fail. If the data set is too large, then we could limit it to what we know about substrings.

    SELECT id,

    CHARINDEX('<t2>', Data)+4, CHARINDEX('</t2>', Data)- CHARINDEX('<t2>', Data))-4

    FROM Tablename

    WHERE CHARINDEX('</t2>', Data)- CHARINDEX('<t2>', Data))-4 <=0

    Once we have a better understanding of the data, we can come up with many different ways to query this data. However, by assuming it was a length, data type, or version issue we skipped the first step we should have done. After determining how to handle the issue above, we may still encounter an issue with length. If we handled the length issue first, though, we would still be stumped as to why it was not working.

    Thanks,

    Jared

    Jared
    CE - Microsoft

  • Use the 3rd parameter of CHARINDEX to ensure that the </t2> you are finding is AFTER the <t2> that you find.

    e.g.

    CHARINDEX('</t2>',stringtosearch,CHARINDEX('<t2>',stringtosearch))

    --

    JimFive

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply