Retrieve data between 2 '.'


Davin21
How about this?

SELECT *
FROM #testenvironment
WHERE PARSENAME(CAST(yourdata AS NVARCHAR(MAX)),3) IS NULL


Luis Cazares
I was so close, thanks to the QotD that made me remember the PARSENAME function.

My code

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE PARSENAME( CAST(yourData AS NVARCHAR(MAX)),3) IS NULL
AND PARSENAME(CAST(yourData AS NVARCHAR(MAX)), 2) IS NOT NULL



Results using Cadavre's last test environment (without the XML). I deleted the "DBCC execution completed" comments.

Duration for CHARINDEX = 00:00:18:363
Duration for LEN = 00:00:09:347
Duration for PARSENAME = 00:00:06:870
Duration for COLLATE = 00:00:06:667



Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Luis Cazares
Davin21 (10/25/2012)
How about this?

SELECT *
FROM #testenvironment
WHERE PARSENAME(CAST(yourdata AS NVARCHAR(MAX)),3) IS NULL



I had the same idea, but I added an extra condition because your query will also return rows with no period (or dot) and rows with more than three.
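
To illustrate why the extra condition matters, here is a small sketch using made-up sample values (they are not taken from the test data). PARSENAME numbers the parts from the right and, as far as I know, returns NULL for a part that does not exist, and for every part once the string has more than four parts:

```sql
-- Hypothetical sample values, one per case discussed above.
-- Requires SQL Server 2008 or later for the VALUES table constructor.
SELECT v.yourData,
       PARSENAME(v.yourData, 3) AS part3,
       PARSENAME(v.yourData, 2) AS part2,
       CASE WHEN PARSENAME(v.yourData, 3) IS NULL       -- fewer than three parts...
             AND PARSENAME(v.yourData, 2) IS NOT NULL   -- ...but at least two
            THEN 'match' ELSE 'no match'
       END AS exactlyOneDot
FROM (VALUES (N'abc'),          -- no dot: part 3 and part 2 are both NULL
             (N'abc.def'),      -- one dot: part 3 is NULL, part 2 is 'abc'
             (N'abc.def.ghi'),  -- two dots: part 3 is 'abc'
             (N'a.b.c.d.e')     -- more than three dots: every part is NULL
     ) AS v(yourData);
-- Expected: only N'abc.def' is flagged as a match.
```

Checking part 3 alone would also flag the first and last rows. Keep in mind that PARSENAME was written for object names, so it treats square brackets and double quotes as identifier delimiters and only handles parts of up to 128 characters; data outside those limits may behave differently.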


Cadavre
Luis Cazares (10/25/2012)
I was so close, thanks to the QotD that made me remember the PARSENAME function.

My code

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE PARSENAME( CAST(yourData AS NVARCHAR(MAX)),3) IS NULL
AND PARSENAME(CAST(yourData AS NVARCHAR(MAX)), 2) IS NOT NULL



Results using Cadavre's last test environment (without the XML). I deleted the "DBCC execution completed" comments.

Duration for CHARINDEX = 00:00:18:363
Duration for LEN = 00:00:09:347
Duration for PARSENAME = 00:00:06:870
Duration for COLLATE = 00:00:06:667


Very good, that was sort of the idea I was trying to implement with the XML splitter. Didn't think of PARSENAME (foolishly!).

Also, we can get rid of the "DBCC execution completed" messages by adding "WITH NO_INFOMSGS" to the DBCC commands in the script, like so:
SET NOCOUNT ON;
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment;
END;

--1,000,000 Random rows of data
SELECT IDENTITY(INT,1,1) AS ID, CAST(LEFT(yourData, LEN(yourData) - 1) AS NTEXT) AS yourData
INTO #testEnvironment
FROM (SELECT TOP 1000000
REPLICATE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(NEWID(),'-','')
,'0',''),'9',''),'8',''),'7',''),'6',''),'5',''),'4',''),'3',''),'2',''),'1','') + '.',
(ABS(CHECKSUM(NEWID())) % 2) + 2)
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
)a(yourData);

DECLARE @HOLDER INT, @Duration CHAR(12), @StartTime DATETIME;

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM (SELECT ID, yourData, MAX(rn)
FROM (SELECT ID, yourData, split.Part.value('text()[1]', 'VARCHAR(MAX)'), ROW_NUMBER() OVER(PARTITION BY yourData ORDER BY (SELECT NULL))
FROM (SELECT CAST('<p>' + REPLACE(CAST(yourData AS VARCHAR(MAX)) COLLATE Latin1_General_BIN2,'.','</p><p>') + '</p>' AS XML),
CAST(yourData AS VARCHAR(MAX)), ID
FROM #testEnvironment) innerQ(xmlField, yourData, ID)
CROSS APPLY innerQ.xmlField.nodes('p') split(Part)
) a(ID, yourData,splitData,rn)
GROUP BY ID, yourData
) a(ID, yourData, rn)
WHERE rn = 2;

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for ugly xml split = %s',0,1,@Duration) WITH NOWAIT;

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE CHARINDEX('.', yourData, CHARINDEX('.', yourData) + 1) = 0;

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for CHARINDEX = %s',0,1,@Duration) WITH NOWAIT;

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE LEN(CAST(yourData AS NVARCHAR(MAX)))-1 = LEN(REPLACE(CAST(yourData AS NVARCHAR(MAX)) COLLATE Latin1_General_BIN2,'.',''));

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for LEN = %s',0,1,@Duration) WITH NOWAIT;

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE CHARINDEX('.', yourData COLLATE Latin1_General_BIN2, CHARINDEX('.', yourData COLLATE Latin1_General_BIN2) + 1) = 0;

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for CHARINDEX with COLLATE = %s',0,1,@Duration) WITH NOWAIT;

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE PARSENAME(CAST(yourData AS NVARCHAR(MAX)),3) IS NULL
AND PARSENAME(CAST(yourData AS NVARCHAR(MAX)), 2) IS NOT NULL;

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for PARSENAME = %s',0,1,@Duration) WITH NOWAIT;



Duration for ugly xml split = 00:00:47:197
Duration for CHARINDEX = 00:00:07:593
Duration for LEN = 00:00:08:163
Duration for CHARINDEX with COLLATE = 00:00:04:787
Duration for PARSENAME = 00:00:05:193


Luis Cazares (10/25/2012)
Davin21 (10/25/2012)
How about this?

SELECT *
FROM #testenvironment
WHERE PARSENAME(CAST(yourdata AS NVARCHAR(MAX)),3) IS NULL



I had the same idea, but I added an extra condition because your query will also return rows with no period (or dot) and rows with more than three.


I've got to agree with Luis: it's a good idea, but it could return incorrect results.


Forever trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


Craig Wilkinson - Software Engineer
LinkedIn
mickyT
Hi

I thought I would add in a LIKE query to see how that compared. It wasn't as bad as I thought it would be.

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE yourData like '%.%' and yourData not like '%.%.%';

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for LIKE = %s',0,1,@Duration) WITH NOWAIT;

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE yourData COLLATE Latin1_General_BIN2 like '%.%' and yourData COLLATE Latin1_General_BIN2 not like '%.%.%';

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for LIKE with COLLATE = %s',0,1,@Duration) WITH NOWAIT;



And got the following:

Duration for CHARINDEX = 00:00:09:103
Duration for LEN = 00:00:10:790
Duration for CHARINDEX with COLLATE = 00:00:06:507
Duration for PARSENAME = 00:00:07:717
Duration for LIKE = 00:00:10:703
Duration for LIKE with COLLATE = 00:00:06:517

CapnHector
So I went for SQL CLR just to see if I could get it to work (I've been on a learning kick, so I'm going with it), and here are my results.

Duration for CHARINDEX = 00:00:21:963
Duration for LEN = 00:00:16:217
Duration for CHARINDEX with COLLATE = 00:00:10:260
Duration for PARSENAME = 00:00:11:043
Duration for LIKE = 00:00:11:293
Duration for LIKE with Collate = 00:00:10:707
Duration for SQLCLR = 00:00:14:427


What I added to the run:

DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

DECLARE @HOLDER INT, @Duration CHAR(12), @StartTime DATETIME;


SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE dbo.RegexCLR(ISNULL(yourData,'')) = 1

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for SQLCLR = %s',0,1,@Duration) WITH NOWAIT;




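And the code for the SQLCLR function:

using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    // Returns true when the input holds exactly one '.' with at least one other character on each side.
    [Microsoft.SqlServer.Server.SqlFunction()]
    public static bool RegexCLR(string input)
    {
        return Regex.IsMatch(input, @"^[^\.]+\.[^\.]+$");
    }
};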
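
For anyone who wants to repeat this test, the function has to be registered in the database before dbo.RegexCLR can be called. The post does not show that step, so the following is only a sketch: the assembly name RegexFunctions and the DLL path are hypothetical, and the NVARCHAR(MAX) parameter type is an assumption about how the function was declared.

```sql
-- Enable CLR integration on the instance (requires appropriate server permissions).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Hypothetical assembly name and file path; point this at your own build output.
CREATE ASSEMBLY RegexFunctions
FROM 'C:\clr\RegexFunctions.dll'
WITH PERMISSION_SET = SAFE;
GO

-- Map the T-SQL function to the static C# method (assembly.class.method).
-- CREATE FUNCTION must be the first statement in its batch, hence the GO above.
CREATE FUNCTION dbo.RegexCLR (@input NVARCHAR(MAX))
RETURNS BIT
AS EXTERNAL NAME RegexFunctions.UserDefinedFunctions.RegexCLR;
GO
```

Newer SQL Server versions with CLR strict security enabled may additionally require the assembly to be signed or explicitly trusted.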



Of course, when we change the NTEXT to NVARCHAR(MAX), I get the following:

Duration for CHARINDEX = 00:00:06:030
Duration for LEN = 00:00:03:253
Duration for CHARINDEX with COLLATE = 00:00:00:943
Duration for PARSENAME = 00:00:01:147
Duration for LIKE = 00:00:07:253
Duration for LIKE with Collate = 00:00:01:353
Duration for SQLCLR = 00:00:04:040


They are all really close over 1 million records.


For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2
Davio
Try the PARSENAME function. It's not what it was made for, but it works quite well for what you need and is much faster than CHARINDEX and LEN.


Duration for CHARINDEX = 00:00:00:933
Duration for Len = 00:00:00:607
Duration for PARSENAME = 00:00:00:127



SET NOCOUNT ON;
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment;
END;

--1,000,000 Random rows of data
SELECT IDENTITY(INT,1,1) AS ID, CAST(LEFT(yourData, LEN(yourData) - 1) AS varchar(50)) AS yourData
INTO #testEnvironment
FROM (SELECT TOP 1000000
REPLICATE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(NEWID(),'-','')
,'0',''),'9',''),'8',''),'7',''),'6',''),'5',''),'4',''),'3',''),'2',''),'1','') + '.',
(ABS(CHECKSUM(NEWID())) % 2) + 2)
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
)a(yourData);


--select * from #testEnvironment where PARSENAME(yourData,3) is null --00:00:48:00

DECLARE @HOLDER INT, @Duration CHAR(12), @StartTime DATETIME;

SELECT @StartTime = GETDATE();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE CHARINDEX('.', yourData, CHARINDEX('.', yourData) + 1) = 0;

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for CHARINDEX = %s',0,1,@Duration) WITH NOWAIT;

SELECT @StartTime = SYSDATETIME();

SELECT @HOLDER = ID
FROM #testEnvironment
WHERE LEN(CAST(yourData AS NVARCHAR(MAX)))-1 = LEN(REPLACE(CAST(yourData AS NVARCHAR(MAX)) COLLATE Latin1_General_BIN2,'.',''));

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for Len = %s',0,1,@Duration) WITH NOWAIT;

SELECT @StartTime = SYSDATETIME();
select @HOLDER = ID from #testEnvironment where PARSENAME(yourData,3) is null

SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for PARSENAME = %s',0,1,@Duration) WITH NOWAIT;



Luis Cazares
jdfletchr (10/26/2012)
Try the PARSENAME function. It's not what it was made for, but it works quite well for what you need and is much faster than CHARINDEX and LEN.



Did you read the rest of the thread, including the tests and the comments on a solution identical to yours?
Without an extra condition, your PARSENAME solution will return incorrect results.
By the way, another problem is that we're dealing with an ntext column.


Eugene Elutin

...
they are all really close over 1 million records
...


If you want your CLR with Regex to perform well, you need to declare your Regex object as static and use the Compiled option for the pattern. Try changing your CLR to this:

using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    // Built once when the class loads and reused for every call.
    static readonly Regex _regex = new Regex(@"^[^\.]+\.[^\.]+$", RegexOptions.Compiled);

    [Microsoft.SqlServer.Server.SqlFunction()]
    public static bool RegexCLR(string input)
    {
        // Use the cached instance; the static Regex.IsMatch overload needs a pattern argument.
        return _regex.IsMatch(input);
    }
};




_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help
CapnHector
Eugene Elutin (10/26/2012)

...
they are all really close over 1 million records
...


If you want your CLR with Regex to perform well, you need to declare your Regex object as static and use the Compiled option for the pattern. Try changing your CLR to this:

using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    // Built once when the class loads and reused for every call.
    static readonly Regex _regex = new Regex(@"^[^\.]+\.[^\.]+$", RegexOptions.Compiled);

    [Microsoft.SqlServer.Server.SqlFunction()]
    public static bool RegexCLR(string input)
    {
        // Use the cached instance; the static Regex.IsMatch overload needs a pattern argument.
        return _regex.IsMatch(input);
    }
};

Thanks for the tip on the regex. I'm new to C# but chose to learn that language specifically for CLRs. After the code change and rerunning the tests, this is what I got with NTEXT:

Duration for CHARINDEX = 00:00:14:957
Duration for LEN = 00:00:15:723
Duration for CHARINDEX with COLLATE = 00:00:09:790
Duration for PARSENAME = 00:00:10:717
Duration for LIKE = 00:00:11:087
Duration for LIKE with COLLATE = 00:00:10:683
Duration for SQLCLR = 00:00:11:427


And now for NVARCHAR(MAX):

Duration for CHARINDEX = 00:00:06:253
Duration for LEN = 00:00:03:200
Duration for CHARINDEX with COLLATE = 00:00:01:033
Duration for PARSENAME = 00:00:01:327
Duration for LIKE = 00:00:07:237
Duration for LIKE with COLLATE = 00:00:01:457
Duration for SQLCLR = 00:00:01:953



I'm actually surprised that just changing the data type can cut the execution times by up to a factor of 10. I never really saw the direct impact of data types like this before.
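
For anyone repeating the data type comparison, here is a minimal sketch of the set-up with the column created as NVARCHAR(MAX) instead of NTEXT. It assumes the only change is the CAST target in Cadavre's script; the posts above do not spell out exactly how the switch was made.

```sql
SET NOCOUNT ON;
IF OBJECT_ID('tempdb..#testEnvironment') IS NOT NULL
BEGIN
    DROP TABLE #testEnvironment;
END;

-- 1,000,000 random rows, each holding one or two dots.
-- Same generator as the NTEXT version; only the CAST target differs (assumed change).
SELECT IDENTITY(INT,1,1) AS ID,
       CAST(LEFT(yourData, LEN(yourData) - 1) AS NVARCHAR(MAX)) AS yourData
INTO #testEnvironment
FROM (SELECT TOP 1000000
             REPLICATE(
                 REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                 REPLACE(NEWID(),'-','')
                 ,'0',''),'9',''),'8',''),'7',''),'6',''),'5',''),'4',''),'3',''),'2',''),'1','') + '.',
                 (ABS(CHECKSUM(NEWID())) % 2) + 2)
      FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
     ) a(yourData);
```

With the column already typed as NVARCHAR(MAX), the CAST(yourData AS NVARCHAR(MAX)) calls in the test queries no longer change the type, so the rest of the test script can be run unchanged against this table.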

