April 20, 2010 at 9:58 am
I am currently doing string manipulations with T-sql. In the where statement of my code I have the following substring function:
WHERE
SUBSTRING(LineRead,2,4) IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
This produces the output that I want albeit it takes a while. I have been reading up about using EXISITS instead of IN. I just wanted to know if there is another way of rewriting this where clause.
April 20, 2010 at 10:15 am
eseosaoregie (4/20/2010)
I am currently doing string manipulations with T-sql. In the where statement of my code I have the following substring function:
WHERE
SUBSTRING(LineRead,2,4) IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
This produces the output that I want albeit it takes a while. I have been reading up about using EXISITS instead of IN. I just wanted to know if there is another way of rewriting this where clause.
The reason it takes a while is that it can't use an index if one exists. Can you add a computed column to the table with the substring values and index it?
April 21, 2010 at 12:12 am
If i add the computed colum presumably the index to create is a non clustered index?
April 21, 2010 at 7:47 am
eseosaoregie (4/21/2010)
If i add the computed colum presumably the index to create is a non clustered index?
yes it should be non clus. index
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 21, 2010 at 8:38 am
I created the computed column and the index on that column. The computed column is as follows
--Add Computed Column
ALTER TABLE MyTable
ADD comp_column AS substring(LineRead,2,4) varchar
--Create Column
CREATE INDEX IX_COMPColumn ON MyTable(comp_column)
However when I run the query it takes the same time. How can I alter the where clause syntax such that it can utilize the index on the computed column?
April 21, 2010 at 8:53 am
eseosaoregie (4/21/2010)
I created the computed column and the index on that column. The computed column is as follows
--Add Computed Column
ALTER TABLE MyTable
ADD comp_column AS substring(LineRead,2,4) varchar
--Create Column
CREATE INDEX IX_COMPColumn ON MyTable(comp_column)
However when I run the query it takes the same time. How can I alter the where clause syntax such that it can utilize the index on the computed column?
it should not contain any function like "substring" , it will force the sql optimizer NOT TO USE indexes so here
your query WHERE derived_cloumn like 'text%'
Can you also post exec plan along with table and indexes. ?
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 21, 2010 at 9:04 am
try:
comp_column IN ('mon ','tue ','wed ','thu ','fri ','sat ','sun ')
[font="Arial Narrow"]bc[/font]
April 21, 2010 at 9:34 am
my tables are:
---index
CREATE
INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
WITH
DROP_EXISTING
ON [PRIMARY]
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[Importeddate] [datetime] NULL CONSTRAINT [DF__stg_LogFi__Impor__5441852A] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)
April 21, 2010 at 10:03 am
It looks like it still needs to scan the whole table for this:
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//'
April 21, 2010 at 4:26 pm
eseosaoregie (4/21/2010)
my tables are:
---index
CREATE
INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
WITH
DROP_EXISTING
ON [PRIMARY]
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[Importeddate] [datetime] NULL CONSTRAINT [DF__stg_LogFi__Impor__5441852A] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)
Now, if you'd provide a little test data for that, you'd have an answer by now. See the first link in my signature line below for how to do that properly. Thanks.
As a side bar, it looks like you need a very simple split function to hammer all of this out... again, correctly posted test data would go a long way in helping resolve this in the most effecient manner possible.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 21, 2010 at 11:34 pm
I have now added sample data along with all the code for tables and indexes etc. Hopefully this will make things clearer
---tables
CREATE TABLE [stg_LogFiles] (
[LineRead] [varchar] (100) NULL ,
[Importeddate] [datetime] DEFAULT (getdate()),
[Firstfour] AS (substring([LineRead],2,4))
) ON [PRIMARY]
GO
CREATE TABLE [fct_Logs] (
[LogDate] [datetime] NULL ,
[Application] [int] NULL ,
[Database] [int] NULL ,
[User] [int] NULL ,
[Retrieval] [int] NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[dimApplication] (
[Application] [varchar] (5) NULL ,
[ApplicationID] [int] IDENTITY (1, 1) NOT NULL ,
[Database] [varchar] (6) NULL ,
[DatabaseID] [int] NULL
) ON [PRIMARY]
--user table
CREATE TABLE [dbo].[dimUser] (
[UserID] [int] IDENTITY (1, 1) NOT NULL ,
[UserName] [nvarchar] (10) NULL ,
[Country] [nvarchar] (20) NULL ,
[CountryID] [int] NULL
) ON [PRIMARY]
GO
--index for user table
CREATE INDEX [IX_USER] ON [dbo].[dimUser]([UserName]) ON [PRIMARY]
GO
--index for application table
CREATE INDEX [IX_Application] ON [dbo].[dimApplication]([Application]) ON [PRIMARY]
GO
--index for computed column on stg_LogFiles table
CREATE INDEX [IX_FirstFour] ON stg_LogFiles ([Firstfour])
ON [PRIMARY]
--data for application table
INSERT INTO [dimApplication]
SELECT 'GMR08','1','OLTCHK','1' UNION ALL
SELECT 'GMR09','2','OLTCHK','1' UNION ALL
SELECT 'GMR10','3','OLTCHK','1'
--data for user table
INSERT INTO [dimUser]
SELECT '1','KRRHEJ','Venezuela','43' UNION ALL
SELECT '2','SGTAYL','Venezuela','43' UNION ALL
SELECT '3','PHGANI','Venezuela','43' UNION ALL
SELECT '4','ROBAIR','Venezuela','43' UNION ALL
SELECT '5','silven','Venezuela','43' UNION ALL
SELECT '6','COAGUJ','Venezuela','43' UNION ALL
SELECT '7','USFITT','Venezuela','43' UNION ALL
SELECT '8','admin','Netherlands','21'
--data for stg_LogFiles table
INSERT INTO [dbo].[stg_LogFiles] (LineRead)
SELECT '[Mon Jan 05 13:51:05 2009]Local/GMR09///Info(1013205)' UNION ALL
SELECT '[Mon Jan 05 13:51:18 2009]Local/GMR09///Info(1013210)' UNION ALL
SELECT '[Mon Jan 05 13:51:30 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:52:59 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:53:19 2009]Local/GMR09/OTLCHK/admin/Info(1021000)' UNION ALL
SELECT '[Mon Jan 05 13:53:28 2009]Local/GMR09///Info(1013210)' UNION ALL
SELECT '[Mon Jan 05 13:53:40 2009]Local/GMR09///Info(1013210)'
---the query
INSERT INTO [ESSBASE_TEST_DW].[dbo].[fct_Logs]([LogDate], [Application], [Database], [User], [Retrieval])
(SELECT
CONVERT(datetime,SUBSTRING(B.[Date],5,7)+ RIGHT(B.[Date],4) + SUBSTRING(B.[Date],11,9),108),
A.ApplicationID,
A.DatabaseID,
U.UserID,
1
FROM
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--CHARINDEX('[',LineRead) = 1
--AND
SUBSTRING(LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)
,2) <> '//' -- ONLY SELECT records where user id is present
AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
LEFT JOIN dbo.dimApplication A ON A.Application = B.Application
LEFT JOIN dbo.dimUser U ON U.UserName = B.[User]
)
April 21, 2010 at 11:40 pm
Not all people ( including me ) can understand exec paln iin any format (here xlsm format) . i found it all impossible to read it. so always try to post it in XMl/graphical format.
Moreover, do yoo have indexes oon "application" and "users" column, which are being used in LEFT JOINs in query ??
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 21, 2010 at 11:44 pm
there are indexes on the Application and User Tables. My previous post has all the structures and sampl data which i should have posted initially.
April 22, 2010 at 12:05 am
(SELECT
SUBSTRING(LineRead,2,24) AS [Date]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)-1 - CHARINDEX('/',LineRead)) AS [Application]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)-1
- CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)) AS [Database]
,SUBSTRING(
LineRead,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1) +1,
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1)+1)+1)- 1 -
CHARINDEX('/',LineRead,CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) +1)) AS [User]
FROM dbo.stg_LogFiles
WHERE
--SUBSTRING(LineRead, CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) ,2) <> '//' AND
(Firstfour = 'mon ' or Firstfour = 'tue ' or Firstfour = 'wed ' or Firstfour = 'thu ' or Firstfour = 'fri ' or Firstfour = 'sun ' or Firstfour = 'sat ')
)B
AND
WHERE
--SUBSTRING(LineRead, CHARINDEX('/',LineRead,CHARINDEX('/',LineRead)+ 1) ,2) <> '//'
both sql are causing table scan ,
1) i tried it with clustered index replacing your indexes , found some improvement
2) "WHERE SUBSTRING" should also be avoided to force optimzer to select SEEK operation
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
April 22, 2010 at 12:17 am
How did you replace my indexes with a clustered index? Were they added to the apllication and user table?
Viewing 15 posts - 1 through 15 (of 34 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy