May 19, 2017 at 3:44 pm
In a t-sql 2012 database, I am suppose to count the number of times each 'canned' message is used for
Elementary students in the last school year and the current school year.
Right now I have the following sql that kind of works:
USE TEST
SELECT GS.Comments,Count(*) AS [Counts]
FROM dbo.Enrol Enrol
JOIN dbo.Student Student
ON Student.StudentID = Enrol.StudentID
JOIN dbo.GS GS
ON GS.StudentID = Student.Studentid
AND (GS.Comments IS NOT NULL)
AND (GS.Comments <> '')
where Enrol.grade in ('KG','01','02','03','04','05','06')
and Enrol.endYear between 2016 and 2017
Group by GS.Comments
order by Counts desc,GS.Comments asc
The problem is the GS.Comments field is defined as varchar(1200). There can be one message in the field and/or there can be lots of messages in this field. Each message ends with a period and there is a space between each message.
An example of multiple messages in the one GS.Comments field would look like the following:
The student is trying hard and needs to make their time more efficiently. This student is good at math. This student turns in their assignments on time. This student seems to enjoy school.
An example of when one messages is in the one GS.Comments field would look like the following:
This student seems to enjoy school.
Thus would showe me the t-sql 2012 logic that I can use when the GS.Comments field contains multiple messages and/or just one message so that I can count the number of times each unique message has been used?
May 19, 2017 at 4:19 pm
What do you consider "unique messages"? Do you mean something like "common sentences"? I would probably use Jeff's splitter function and write the results to at least a temporary table (better a permanent indexed table), and then this is trivial. Read Jeff Moden's article on DelimitedSplit8K.
You should know this by now, because everyone says it, but some sample data would make everyone's life a lot easier. Doesn't have to be real, just representative. CREATE TABLE script(s), INSERT script(s)?
May 19, 2017 at 4:56 pm
First for some sample data
USE tempdb
GO
IF OBJECT_ID('tempdb.dbo.cannedPhrases') IS NOT NULL DROP TABLE dbo.CannedPhrases;
IF OBJECT_ID('tempdb.dbo.GS') IS NOT NULL DROP TABLE dbo.GS;
GO
DECLARE
@c1 varchar(100) = 'This student seems to enjoy school',
@c2 varchar(100) = 'This student has so much potential',
@c3 varchar(100) = 'He takes after his mother';
SELECT Canned = x INTO dbo.CannedPhrases FROM (VALUES (@c1), (@c2), (@c3)) x(x);
SELECT CommentId = ROW_NUMBER() OVER (ORDER BY (SELECT 1)), Comment = x
INTO dbo.GS
FROM
(
SELECT REPLICATE('Blah blah blah', 10)+@c1+REPLICATE('Blah blah blah', 5) UNION ALL
SELECT @c2+REPLICATE('Blah blah blah', 20)+@c2+REPLICATE('Blah blah ', 5) UNION ALL
SELECT REPLICATE('Blah blah blah', 12)+@c3+REPLICATE('Blah blah blah', 5)+@c1 UNION ALL
SELECT REPLICATE('Blah blah blah', 100) UNION ALL
SELECT REPLICATE('Blah blah blah', 12)+@c1+'blah...'+@c2
) x(x);
Note that I stuffed 7 canned phrases in there. Next you need to get yourself a copy of NGrams8K. Here's an example of how to use NGrams8K to identify the "canned phrases and where they live in the string".
SELECT CommentId, Phrase, position
FROM
(
SELECT canned, phraseLen = LEN(canned)
FROM dbo.CannedPhrases
) AS Canned(Phrase, PhraseLen)
CROSS JOIN dbo.GS
CROSS APPLY dbo.ngrams8K(GS.Comment, PhraseLen)
WHERE token = Phrase;
This returns:
CommentId Phrase position
---------- -------------------------------------- --------------------
1 This student seems to enjoy school 141
3 This student seems to enjoy school 264
5 This student seems to enjoy school 169
2 This student has so much potential 1
2 This student has so much potential 315
5 This student has so much potential 210
3 He takes after his mother 169
And to get your counts:
SELECT Comment, Counts = COUNT(*)
FROM
(
SELECT canned, phraseLen = LEN(canned)
FROM dbo.CannedPhrases
) AS Canned(Phrase, PhraseLen)
CROSS JOIN dbo.GS
CROSS APPLY dbo.ngrams8K(GS.Comment, PhraseLen)
WHERE token = Phrase
GROUP BY Comment;
--GROUP BY ALL Comment;
-- Note: use the depreciated GROUP BY ALL to get comments with a 0 count
It's that easy.
-- Itzik Ben-Gan 2001
May 19, 2017 at 11:05 pm
wendy elizabeth - Friday, May 19, 2017 3:44 PMIn a t-sql 2012 database, I am suppose to count the number of times each 'canned' message is used for
Elementary students in the last school year and the current school year.Right now I have the following sql that kind of works:
USE TEST
SELECT GS.Comments,Count(*) AS [Counts]
FROM dbo.Enrol Enrol
JOIN dbo.Student Student
ON Student.StudentID = Enrol.StudentID
JOIN dbo.GS GS
ON GS.StudentID = Student.Studentid
AND (GS.Comments IS NOT NULL)
AND (GS.Comments <> '')
where Enrol.grade in ('KG','01','02','03','04','05','06')
and Enrol.endYear between 2016 and 2017
Group by GS.Comments
order by Counts desc,GS.Comments ascThe problem is the GS.Comments field is defined as varchar(1200). There can be one message in the field and/or there can be lots of messages in this field. Each message ends with a period and there is a space between each message.
An example of multiple messages in the one GS.Comments field would look like the following:
The student is trying hard and needs to make their time more efficiently. This student is good at math. This student turns in their assignments on time. This student seems to enjoy school.
An example of when one messages is in the one GS.Comments field would look like the following:This student seems to enjoy school.
Thus would showe me the t-sql 2012 logic that I can use when the GS.Comments field contains multiple messages and/or just one message so that I can count the number of times each unique message has been used?
Do you have a table where the "canned" comments are stored?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 20, 2017 at 8:22 pm
Yes there is a table where the canned statements are stored at. The problem is the unique identifier stored in the canned table are not passed along to the table where the statements are actually entered. One of my goals is to only find the statements that exist in the canned table and the table where the canned messages re entered. When the match occurs, I am suppose to count the number of occurences of where the canned messages are actually used.
May 20, 2017 at 8:29 pm
May 20, 2017 at 9:15 pm
Got Google?
May 21, 2017 at 11:22 am
wendy elizabeth - Saturday, May 20, 2017 8:22 PMYes there is a table where the canned statements are stored at. The problem is the unique identifier stored in the canned table are not passed along to the table where the statements are actually entered. One of my goals is to only find the statements that exist in the canned table and the table where the canned messages re entered. When the match occurs, I am suppose to count the number of occurences of where the canned messages are actually used.
Understood and expected. Can you post the CREATE TABLE for that canned messages table? I believe I can show you a simple way to do this.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 22, 2017 at 8:56 am
Thanks for showing me the sql:
Here is the table with the canned 'comments' listed below:
CREATE TABLE [dbo].[GradingCommentTemplate](
[commentID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[schoolID] [int] NULL,
[comment] [varchar](200) NULL,
[varchar](6) NULL,
CONSTRAINT [PK_GradingCommentTemplate] PRIMARY KEY NONCLUSTERED
(
[commentID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 97) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[GradingCommentTemplate] CHECK CONSTRAINT [FK_GradingCommentTemplate_School]
ALTER TABLE [dbo].[GradingCommentTemplate] WITH NOCHECK ADD CONSTRAINT [FK_GradingCommentTemplate_School] FOREIGN KEY([schoolID])
REFERENCES [dbo].[School] ([schoolID])
ON DELETE CASCADE
GO
Notes:
1. When I run the job, I will select the elementary schools by school id.
2. It is a possibility that different elementary schools have the exact same message. For this request, I do not care if they have the same comments.
3. The value for 'code' in the GradingCommentTemplate is not used anywhere in the database.
4. The identity key of CommentID is only used in this table and does not refer to any other database tables. There are lots
of tables in this database that have their commentID column with their own unique values.
Here is a copy of the school table that is referenced above as a foreign key relationship:
CREATE TABLE [dbo].[School](
[schoolID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[districtID] [int] NOT NULL,
[ncesSchoolID] [varchar](9) NULL,
[number] [varchar](7) NOT NULL,
[name] [varchar](50) NOT NULL,
[type] [varchar](5) NULL,
[comments] [varchar](255) NULL,
[address] [varchar](50) NULL,
[city] [varchar](25) NULL,
[state] [varchar](2) NULL,
[zip] [varchar](10) NULL,
[phone] [varchar](25) NULL
) ON [PRIMARY]
GO
May 22, 2017 at 1:39 pm
wendy elizabeth - Saturday, May 20, 2017 8:29 PMThe following additional items:
1. Can you explain --GROUP BY ALL Comment;
-- Note: use the depreciated GROUP BY ALL to get comments with a 0 count
to me? What am I trying to accomplish here/
2. All the values are stored in a canned table. Can you show me how to modify the sql listed above where the value is obtained from a canned table and not stuffed into a variables and the union all statement is used?
Sorry for the later reply - busy few days. I re-read your requirement and, if I understand it correctly, you are looking to count the times that phrase is used. My original solution counted how many time they appear in a comment.
Ignore my GROUP BY ALL comment; it was a quick little side-note that won't really help you solve this. Regarding the values being stored in a canned table. The code that I posted was intended to emulate what you described. Let me re-post a simplified version.
IF OBJECT_ID('dbo.cannedPhrases') IS NOT NULL DROP TABLE dbo.CannedPhrases;
IF OBJECT_ID('dbo.comments') IS NOT NULL DROP TABLE dbo.comments;
GO
DECLARE
@c1 varchar(100) = 'This student seems to enjoy school',
@c2 varchar(100) = 'This student has so much potential',
@c3 varchar(100) = 'He takes after his mother';
SELECT Canned = x
INTO dbo.CannedPhrases
FROM (VALUES (@c1), (@c2), (@c3)) x(x);
SELECT CommentId = ROW_NUMBER() OVER (ORDER BY (SELECT 1)), Comment = x
INTO dbo.comments
FROM
(
SELECT 'Blah blah blah. '+@c1+'. Blah blah blah.' UNION ALL
SELECT @c2+'. Blah blah blah. '+@c2+'. Blah blah blah.' UNION ALL
SELECT 'Blah blah blah. '+@c3+'. Blah blah blah. '+@c1 UNION ALL
SELECT 'Blah blah blah.' UNION ALL
SELECT 'Blah blah blah. '+@c1+'. blah. '+@c2
) x(x);
Now we have these two tables.
Notice that a canned phrase appears once in commentId 1, twice in commentid 2, 3 and 5. The phrase "He takes after his mother" appears once, the other two appear three times.
Here's two solutions, one which uses NGrams8K and the other that uses DelimitedSplit8K_LEAD.
-- "Splitter" solution
SELECT canned, occurances = COUNT(*)
FROM dbo.comments c
CROSS APPLY dbo.delimitedsplit8K_lead(REPLACE(Comment, '. ', char(0)), char(0)) s
JOIN dbo.CannedPhrases cp ON cp.Canned = s.Item
GROUP BY Canned;
-- NGrams8K solution
SELECT canned, occurances = COUNT(*)
FROM
(
SELECT canned, phraseLen = LEN(canned)
FROM dbo.CannedPhrases
) AS Canned(canned, PhraseLen)
CROSS JOIN dbo.comments
CROSS APPLY dbo.ngrams8K(Comment, PhraseLen)
WHERE token = canned
GROUP BY canned;
Each solution produces:
I think this is what you're looking for; or will at least get you closer.
-- Itzik Ben-Gan 2001
May 22, 2017 at 3:58 pm
Where can I access NGrams8K or DelimitedSplit8K in the sql server 2012 database?
I want to use either NGrams8K or DelimitedSplit8K as you suggested but I cannot find them in the databases I work with.
I do not see these items in the 'Custom' databases that I am working with.
I do not see these objects in the master, model, msdb, or tempdb databases probably since I do not have access for everything.
I see NGrams8K is an inline tabled valued function. Thus I was looking at table valued functions and I cannot find NGrams8K thnere.
DelimitedSplit8K is some code in the link that you pointed me to. Thus is this code that I need to setup from the link you gave to me?
Thus can you tell me how I can access NGrams8K or DelimitedSplit8K inthie sql server 2012 database that I have aqccess to?
May 22, 2017 at 4:08 pm
wendy elizabeth - Monday, May 22, 2017 3:58 PMWhere can I access NGrams8K or DelimitedSplit8K in the sql server 2012 database?
I want to use either NGrams8K or DelimitedSplit8K as you suggested but I cannot find them in the databases I work with.
I do not see these items in the 'Custom' databases that I am working with.
I do not see these objects in the master, model, msdb, or tempdb databases probably since I do not have access for everything.I see NGrams8K is an inline tabled valued function. Thus I was looking at table valued functions and I cannot find NGrams8K thnere.
DelimitedSplit8K is some code in the link that you pointed me to. Thus is this code that I need to setup from the link you gave to me?
Thus can you tell me how I can access NGrams8K or DelimitedSplit8K inthie sql server 2012 database that I have aqccess to?
go to the articles that were underlined
Here's two solutions, one which uses NGrams8K and the other that uses DelimitedSplit8K_LEAD.
for each article....I strongly suggest you read them...and then download the code from the "resources" at the bottom of the articles
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
May 22, 2017 at 9:51 pm
Jeff Moden - Sunday, May 21, 2017 11:22 AMwendy elizabeth - Saturday, May 20, 2017 8:22 PMYes there is a table where the canned statements are stored at. The problem is the unique identifier stored in the canned table are not passed along to the table where the statements are actually entered. One of my goals is to only find the statements that exist in the canned table and the table where the canned messages re entered. When the match occurs, I am suppose to count the number of occurences of where the canned messages are actually used.Understood and expected. Can you post the CREATE TABLE for that canned messages table? I believe I can show you a simple way to do this.
Looks like a couple of folks beat me to it. You posting that table helped them very well, though.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply