Group and Count from a table

  • I have a following table and I need to group by each topic and count them.

    SNo Category

    1 ------- XML/HTML

    2 ------- ASP.NET

    3 ------- C#

    4 ------- ASP.NET/C#

    5 ------- C#/XML

    6 ------- HTML/ASP.NET

    7 ------- SQL

    8 ------- SQL/HTML

    9 ------- SQL/XML

    10 ------- XML

    11 ------- C#

    12 ------- ASP.NET

    13 ------- SQL

    14 ------- XML

    15 ------- SQL

    Expected Result:

    Category Count

    C# -------- 4

    XML -------- 5

    HTML ------- 3

    ASP.NET --- 4

    SQL ------- 5

  • With Jeff Moden help ( dbo.DelimitedSplit8K string split function - http://www.sqlservercentral.com/articles/Tally+Table/72993/ )

    you can do just this:

    declare @mytable table (SNo int, Category varchar(100))

    insert @mytable

    select 1, 'XML/HTML'

    union select 2, 'ASP.NET'

    union select 3, 'C#'

    union select 4, 'ASP.NET/C#'

    union select 5, 'C#/XML'

    union select 6, 'HTML/ASP.NET'

    union select 7, 'SQL'

    union select 8, 'SQL/HTML'

    union select 9, 'SQL/XML'

    union select 10, 'XML'

    union select 11, 'C#'

    union select 12, 'ASP.NET'

    union select 13, 'SQL'

    union select 14, 'XML'

    union select 15, 'SQL'

    select d.Item as Category, COUNT(*) AS Count

    from @mytable t

    cross apply dbo.DelimitedSplit8K (Category, '/') d

    group by d.Item

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Thank you SSCrazy for your quick reply.

    Now, I got the following error:

    "Category" is not a recognized table hints option. If it is intended as a parameter to a table-valued function or to the CHANGETABLE function, ensure that your database compatibility mode is set to 90."

  • Well, Eugene beat me to it and with something better than what I put together.

    --Next time provide some DDL like so....

    --1) Lets setup that test data

    IF OBJECT_ID('tempdb..#classes') IS NOT NULL

    DROP TABLE #classes;

    CREATE TABLE #classes (SNo int primary key, category varchar(20) NOT NULL);

    INSERT INTO #classes

    SELECT1,'XML/HTML' UNION ALL

    SELECT2,'ASP.NET' UNION ALL

    SELECT3,'C#' UNION ALL

    SELECT4,'ASP.NET/C#' UNION ALL

    SELECT5,'C#/XML' UNION ALL

    SELECT6,'HTML/ASP.NET' UNION ALL

    SELECT7,'SQL' UNION ALL

    SELECT8,'SQL/HTML' UNION ALL

    SELECT9,'SQL/XML' UNION ALL

    SELECT10,'XML' UNION ALL

    SELECT11,'C#' UNION ALL

    SELECT12,'ASP.NET' UNION ALL

    SELECT13,'SQL' UNION ALL

    SELECT14,'XML' UNION ALL

    SELECT15,'SQL'

    -- The query:

    ;WITH cat1 AS

    (SELECT category FROM #classes WHERE CHARINDEX('/',category)=0),

    cat2 AS

    (SELECT category FROM #classes WHERE CHARINDEX('/',category)<>0),

    catall AS

    (SELECT category FROM cat1

    UNION ALL

    SELECT SUBSTRING(Category,1,CHARINDEX('/',Category)-1) FROM cat2

    UNION ALL

    SELECT SUBSTRING(Category,CHARINDEX('/',Category)+1,LEN(category)) FROM cat2)

    SELECT category, count(category) AS [count]

    FROM catall

    GROUP BY category

    -- be nice to your dba

    DROP TABLE #classes

    GO

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • romah (3/14/2013)


    Thank you SSCrazy for your quick reply.

    Now, I got the following error:

    "Category" is not a recognized table hints option. If it is intended as a parameter to a table-valued function or to the CHANGETABLE function, ensure that your database compatibility mode is set to 90."

    Can you post what you're running to get this error?

    Which Sql Server version are you using?

    What is Category in your case?

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Actually I copied your code and also a function

    declare @mytable table (SNo int, Category varchar(100))

    insert @mytable

    select 1, 'XML/HTML'

    union select 2, 'ASP.NET'

    union select 3, 'C#'

    union select 4, 'ASP.NET/C#'

    union select 5, 'C#/XML'

    union select 6, 'HTML/ASP.NET'

    union select 7, 'SQL'

    union select 8, 'SQL/HTML'

    union select 9, 'SQL/XML'

    union select 10, 'XML'

    union select 11, 'C#'

    union select 12, 'ASP.NET'

    union select 13, 'SQL'

    union select 14, 'XML'

    union select 15, 'SQL'

    select d.Item as Category, COUNT(*) AS Count

    from @mytable t

    cross apply dbo.fnSplit(Category, '/') d

    group by d.Item

    When I tried with your code, I got following error:

    "Category" is not a recognized table hints option. If it is intended as a parameter to a table-valued function or to the CHANGETABLE function, ensure that your database compatibility mode is set to 90.

    Also tried to use my table TestCategory

    select d.Item as Category, COUNT(*) AS Count

    from TestCategory t

    cross apply dbo.fnSplit(Category, '/') d

    group by d.Item

    When I did like this, I got syntax error 'Category'

    My table is very simple as I already provided.

    SNo---Category

    1 ------- XML/HTML

    2 ------- ASP.NET

    3 ------- C#

    4 ------- ASP.NET/C#

    5 ------- C#/XML

    6 ------- HTML/ASP.NET

    7 ------- SQL

    8 ------- SQL/HTML

    9 ------- SQL/XML

    10 ------- XML

    11 ------- C#

    12 ------- ASP.NET

    13 ------- SQL

    14 ------- XML

    15 ------- SQL

  • Awesome Alan's code works perfect !!!

    I changed "#classes" to my table name "TestCategory" and it gave the result that I expected.

    Thank you Alan. and also thank you Eugene Elutin.

    How to accept that post as answer?

  • Note, dbo.DelimitedSplit8K is the fastest way to split sting in T-SQL (it only will marginally loose to CLR implementation). Also, what will happen if your Category will include 3 or more skills: HTML/ASP.NET/C#?

    You will need to amend Alan's solution in order to deal with this (making it slower). Using DelimitedSplit8K splitter, you don't need to worry about this and scalability.

    Try to make it working. I guess you have some other object in your database called Category. Try to put table alias at front of Category inside of function:

    select d.Item as [Category], COUNT(*) AS Count

    from @mytable t

    cross apply dbo.fnSplit(t.[Category], '/') d

    group by d.Item

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • I tried and I got another error for both @mytable and TestCategory

    Error:

    Msg 102, Level 15, State 1, Line 4

    Incorrect syntax near '.'.

    And I tried with select after cross apply. Again got the same error.

    select D.Item as [Category], COUNT(*) AS Count

    from dbo.TestCategory as T

    cross apply

    (

    select * from dbo.fnSplit(T.[Category],'/')

    ) as D

    group by D.Item

    I test the fnSplit() function and it's returning correct values

    select * from dbo.fnSplit('asp.net/c#/xml/html','/')

    Returns

    Item Number---Item

    1----------------asp.net

    2----------------c#

    3----------------xml

    4----------------html

  • I found in other forums that "cross apply" only works if the compatibility_level is greater than 80.

    When I checked the compatibility_level of our database, it's 80.

    SELECT compatibility_level

    FROM sys.databases WHERE name = 'database_name'

    I think Eugene's solution works in greater compatibility-level but I don't have permission to change it.

    Thanks once again Eugene !:-)

  • romah (3/14/2013)


    I found in other forums that "cross apply" only works if the compatibility_level is greater than 80.

    When I checked the compatibility_level of our database, it's 80.

    SELECT compatibility_level

    FROM sys.databases WHERE name = 'database_name'

    I think Eugene's solution works in greater compatibility-level but I don't have permission to change it.

    Thanks once again Eugene !:-)

    NP. I am, however, concerned that Eugene is correct about the future. Please post your fnSplit function (ironically named) and lets see if we can make this code a bit more bullte proof for the future scalability..

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes, Eugene's solution worked with compatibility_level=100 when I tested in my local machine.

    select D.Item as [Category], COUNT(*) AS Count

    from dbo.TestCategory as T

    cross apply

    (

    select * from dbo.fnSplit(T.[Category],'/')

    ) as D

    group by D.Item

    As Eugene's earlier reply, I copied the DelimitedSplit8K() function from Jeff Moden's (your) article http://www.sqlservercentral.com/articles/Tally+Table/72993/ and renamed as fnSplit() in my database. 🙂

    ALTER FUNCTION dbo.fnSplit

    --===== Define I/O parameters

    (@pString VARCHAR(8000), @pDelimiter CHAR(1))

    --WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    --===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...

    -- enough to cover VARCHAR(8000)

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    ),

    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

    SELECT 1 UNION ALL

    SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter

    ),

    cteLen(N1,L1) AS(--==== Return start and length (for use in substring)

    SELECT s.N1,

    ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)

    FROM cteStart s

    )

    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),

    Item = SUBSTRING(@pString, l.N1, l.L1)

    FROM cteLen l

    ;

    Once again Thank you all for helping me 🙂

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply