Is a SQL View the fastest way to manage multiple count fields, or does SQL 2014 have something better ?

  • I need to do lots of filtered counts, similar to the counting done on Ebay or Amazon. eg. Cars(34500), Car Colors - Blue(3000), Red(2000), Silver(5000) ..... etc.

    I have my standard count query:

    select count(*), item from table

    group by item

    I also added a non-clustered index based on item, which on average produces the results from one million records in 150 ms.

    Each different item count adds another approx 150 ms to the total.

    So far at five different item counts the query time is approx 750ms to 780ms.

    I need to add more counts but want to keep the query as fast as possible, is creating a SQL Indexed View for each item the only faster solution?

    http://www.sqlbadpractices.com/

    I found the above thread regarding SQL Indexed View performance on two million records, and it seems to offer a substantial performance increase.

    Is that typically how companies like Ebay and Amazon do it?

    Thanks

  • Try something like this:

    select COUNT(*) AS Cars,

    SUM(CASE WHEN Color = 'Blue' THEN 1 ELSE 0 END) AS Cars_Blue,

    SUM(CASE WHEN Color = 'Red' THEN 1 ELSE 0 END) AS Cars_Red

    from table

    -- Gianluca Sartori

  • Thanks

  • Sadly it takes 'CPU time = 811 ms, elapsed time = 817 ms' for ONE query with nine items.

    It is also a pain to type in the items names eg. States.

  • could you perhaps provide some sample table and data scripts to better understand?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • something to play around with

    SELECT TOP 1000000

    TranID = IDENTITY(INT, 1, 1),

    TypeID = 1 + CAST(Abs(Checksum(Newid()) % 999 ) AS INT),

    variant = CHAR(Abs(Checksum(Newid())) % 4 + 65)

    INTO TransData

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    CREATE NONCLUSTERED INDEX [NIC+jls] ON [dbo].[TransData] (

    [TypeID] ASC, [variant] ASC)

    GO

    --SELECT TypeID, variant, COUNT(variant) AS varcnt

    --FROM TransData

    --where typeid = 103

    --GROUP BY TypeID, variant

    SET STATISTICS TIME ON

    SELECT typeID, COUNT(*) as cnt

    , SUM(CASE WHEN variant = 'A'THEN 1 ELSE 0 END) AS varA

    , SUM(CASE WHEN variant = 'B'THEN 1 ELSE 0 END) AS varB

    , SUM(CASE WHEN variant = 'C'THEN 1 ELSE 0END) AS varC

    , SUM(CASE WHEN variant = 'D'THEN 1 ELSE 0END) AS varD

    FROM TransData

    WHERE typeid = 103

    GROUP BY typeid

    ORDER BY typeid

    SET STATISTICS TIME OFF

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Thanks for the code sample.

    I will try to get some data, and table samples tomorrow.

  • In my opinion it is not optimal to hard code the data hierarchy in this way. If an attribute is added, it would mean revisiting the code!

    The window functions can handle this very efficiently, especially when there is an POC index in place. For further details on the subject, I strongly recommend Itzik Ben-Gan's book Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions (Developer Reference).

    I did put together a little sample, first some data and then some code 😎

    USE tempdb;

    GO

    IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES

    WHERE TABLE_NAME = N'TBL_COUNT_SAMPLE'

    AND TABLE_SCHEMA = N'dbo')

    DROP TABLE dbo.TBL_COUNT_SAMPLE;

    CREATE TABLE dbo.TBL_COUNT_SAMPLE

    (

    SAMPLE_ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED NOT NULL

    ,SAMPLE_NAME NVARCHAR(128) NOT NULL

    ,SAMPLE_CATEGORY NVARCHAR(128) NOT NULL

    ,SAMPLE_TYPE NVARCHAR(128) NOT NULL

    ,SAMPLE_SUBTYPE NVARCHAR(128) NOT NULL

    ,SAMPLE_COLOUR NVARCHAR(64) NOT NULL

    );

    INSERT INTO dbo.TBL_COUNT_SAMPLE

    (

    SAMPLE_NAME

    ,SAMPLE_CATEGORY

    ,SAMPLE_TYPE

    ,SAMPLE_SUBTYPE

    ,SAMPLE_COLOUR

    )

    SELECT

    ITEM.ITEM_NAME

    ,CATEGORY.SAMPLE_CATEGORY

    ,MTYPE.SAMPLE_TYPE

    ,STYPE.SAMPLE_SUBTYPE

    ,COLOUR.SAMPLE_COLOUR

    FROM (VALUES (N'Transportation'),(N'Recreation'),(N'Sport')) AS CATEGORY(SAMPLE_CATEGORY)

    CROSS JOIN (VALUES (N'Car'),(N'Bicycle'),(N'Airplane'),(N'Motorcycle')) AS MTYPE(SAMPLE_TYPE)

    CROSS JOIN (VALUES (N'Pedal'),(N'Petrol'),(N'Electric'),(N'HDV'),(N'Diesel')) AS STYPE(SAMPLE_SUBTYPE)

    CROSS JOIN (VALUES (N'Red'),(N'Green'),(N'Blue'),(N'Yellow'),(N'Gray'),(N'Black'),(N'White')) AS COLOUR(SAMPLE_COLOUR)

    CROSS JOIN (VALUES ('Abbott-Baynes Sailplanes') ,('ABC Motors') ,('ADC Aircraft') ,('Air Navigation and Engineering Company')

    ,('Airco') ,('The Airscrew Company') ,('Airship Industries') ,('Airspeed Ltd.') ,('AJEP') ,('Armstrong Whitworth Aircraft')

    ,('Arrow Aircraft Ltd.') ,('Auster') ,('Austin Motor Company') ,('Aviation Traders') ,('Avro') ,('Avro International Aerospace')

    ,('Beagle Aircraft') ,('William Beardmore and Company') ,('Blackburn Aircraft') ,('Boulton & Paul Ltd') ,('Boulton Paul Aircraft')

    ,('Bristol Aeroplane Company') ,('British Aerial Transport') ,('British Aerospace') ,('British Aircraft Company')

    ,('British Aircraft Corporation') ,('British Aircraft Manufacturing') ,('British and Colonial Aeroplane Company')

    ,('Central Aircraft Company') ,('Chilton Aircraft') ,('Chrislea Aircraft') ,('Cierva Autogiro Company')

    ,('Civilian Aircraft Company') ,('Clayton & Shuttleworth') ,('Comper Aircraft Company') ,('Cunliffe-Owen Aircraft')

    ,('Dart Aircraft') ,('De Havilland') ,('Desoutter Aircraft Company') ,('The Dunstable Sailplane Company') ,('E. D. Abbott Ltd')

    ,('Edgar Percival Aircraft') ,('Elliotts of Newbury') ,('English Electric') ,('Fairey Aviation Company') ,('Fane Aircraft Company')

    ,('Folland Aircraft') ,('Foster, Wikner Aircraft') ,('Garland Aircraft Company') ,('General Aircraft Limited')

    ,('Gloster Aircraft Company') ,('Gosport Aircraft Company') ,('Grahame-White') ,('Handley Page') ,('Hants and Sussex Aviation')

    ,('Hawker Aircraft') ,('Hawker Siddeley') ,('Heston Aircraft Company') ,('Hewlett & Blondeau') ,('Hordern-Richmond')

    ,('Hunting Aircraft') ,('John O. Isaacs') ,('Lakes Flying Company') ,('Lang Propellers') ,('Luton Aircraft') ,('M. B. Arpin & Co.')

    ,('Mann Egerton') ,('Martinsyde') ,('Miles Aircraft') ,('Moss Brothers Aircraft') ,('Napier & Son') ,('Nash & Thomson')

    ,('National Aircraft Factory') ,('National Aircraft Factory No. 2') ,('Nieuport & General Aircraft')

    ,('Norman Thompson Flight Company') ,('Parnall') ,('Parnall & Sons') ,('Phoenix Dynamo Manufacturing Company')

    ,('Port Victoria Marine Experimental Aircraft Depot') ,('Reid and Sigrist') ,('Rollason Aircraft and Engines')

    ,('Royal Aircraft Establishment') ,('Saunders-Roe') ,('Scottish Aviation') ,('Seaplane Experimental Station') ,('Siddeley-Deasy')

    ,('Sopwith Aviation Company') ,('Spartan Aircraft Ltd') ,('Supermarine') ,('Vickers Limited') ,('Vickers-Armstrongs')

    ,('Westland Aircraft') ,('Westland Helicopters') ,('J. Samuel White')

    ) AS ITEM(ITEM_NAME);

    and then a query returning the number of types in each category

    USE tempdb;

    GO

    /* CTE Category - Type */

    ;WITH CATEGORY_TYPE AS

    (

    SELECT

    X.SAMPLE_CATEGORY

    ,X.SAMPLE_TYPE

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ORDER BY

    (SELECT NULL)

    ) AS CAT_TYPE_RID

    ,CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    FROM dbo.TBL_COUNT_SAMPLE CS

    ) AS X WHERE X.CAT_TYPE_RID = 1

    )

    /* Show count of Types in each Category */

    SELECT

    X.SAMPLE_CATEGORY

    ,X.CAT_TYPE_COUNT

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ORDER BY

    (SELECT NULL)

    ) AS CAT_TYPE_RID

    ,COUNT(CS.SAMPLE_TYPE) OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ORDER BY

    (SELECT NULL)

    ) AS CAT_TYPE_COUNT

    ,CS.SAMPLE_CATEGORY

    FROM CATEGORY_TYPE CS

    ) AS X WHERE X.CAT_TYPE_RID = 1;

    Results

    SAMPLE_CATEGORY CAT_TYPE_COUNT

    ----------------- --------------

    Recreation 4

    Sport 4

    Transportation 4

    Statistics (IO and TIME) without an POC Index

    [font="Courier New"]Table 'Worktable'. Scan count 6, logical reads 43, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TBL_COUNT_SAMPLE'. Scan count 1, logical reads 589, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 62 ms, elapsed time = 48 ms.[/font]

    Add POC Index

    USE [tempdb]

    GO

    CREATE NONCLUSTERED INDEX [NCLIDX_POC_COUNT_SAMPLE] ON [dbo].[TBL_COUNT_SAMPLE]

    (

    [SAMPLE_CATEGORY] ASC,

    [SAMPLE_TYPE] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    Statistics (IO and TIME) with an POC Index

    [font="Courier New"]Table 'Worktable'. Scan count 6, logical reads 42, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TBL_COUNT_SAMPLE'. Scan count 1, logical reads 249, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 17 ms.[/font]

  • Eirikur,

    Thanks for your suggestion and code sample.

    Heavy duty stuff, my brain hurts just looking at it 🙂

    Is that the same method used by Ebay and Amazon?

    BTW I did create SQL Indexed Views for:

    Employment Type: Full Time, Part Time, etc...

    Education: Bachelors, Masters, etc...

    States: VA, NY, etc....

    Using the query below.

    "Select count(*) from Table Group By State"

    Each one takes about 2 ms to run after the Index View creation.

    My main query also takes about 20ms to get each group of 10 records per page out of approx 750k records.

    When i run the query and search for all 'Full-Time' jobs, the execution planner suggests adding a clustered index for each of the following: Employment Type, Education and States.

    After adding the above clustered indexes, when i search for all 'Full-Time; jobs, it takes approx 0.88 seconds, to return 600k records.

    I am very surprised by this as each of the queries run on their own takes a few ms, so i thought it would take about one tenth of a second not 0.88 seconds when they were all added together.

    I looked at the execution planner but it is so complex i really don't know where the issues are and how to fix them.

    Thanks

    Ian

  • [font="Arial Black"]WHY[/font] are you returning 600K rows to begin with???:blink: And [font="Arial Black"]WHERE [/font]are you returning them to? :unsure:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My bad, the total count is 600k, records returned is only 10.

  • I have attached the plan, it seems like the clustered indexes would be faster if they were sql indexed views ?

  • Eirikur,

    I ran your sample query on my data and it took 284 ms vs my 0 ms using a SQL Indexed View.

    I will research the book for some windowing function ideas.

    Thanks

    Ian

  • Here are two more queries, somewhat self explanatory, first one is

    CATEGORY-TYPE-NAME-COLOUR-COUNT(COLOUR)

    and the second is

    CATEGORY-TYPE-NAME-COLOUR-SUBTYPE-COUNT(SUBTYPE)

    With a small set of 2000000 records, these are returning in 0.1-0.2 sec on my mediocre laptop.

    😎

    USE tempdb;

    GO

    DECLARE @CATEGORY NVARCHAR(128) = N'Sport';

    DECLARE @TYPE NVARCHAR(128) = N'Airplane';

    DECLARE @NAME NVARCHAR(128) = N'Sopwith Aviation Company';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    /* */

    SELECT

    X.SAMPLE_CATEGORY

    ,X.SAMPLE_TYPE

    ,X.SAMPLE_NAME

    ,X.SAMPLE_COLOUR

    ,X.COLOUR_COUNT

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ,CS.SAMPLE_NAME

    ,CS.SAMPLE_COLOUR

    ORDER BY

    (SELECT NULL)

    ) AS MAINETYPE_RID

    ,COUNT(CS.SAMPLE_SUBTYPE) OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ,CS.SAMPLE_NAME

    ,CS.SAMPLE_COLOUR

    ORDER BY

    (SELECT NULL)

    ) AS COLOUR_COUNT

    ,CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ,CS.SAMPLE_NAME

    ,CS.SAMPLE_COLOUR

    FROM dbo.TBL_COUNT_SAMPLE CS

    WHERE CS.SAMPLE_CATEGORY = @CATEGORY

    AND CS.SAMPLE_TYPE = @TYPE

    AND CS.SAMPLE_NAME = @NAME

    ) AS X WHERE X.MAINETYPE_RID = 1;

    USE tempdb;

    GO

    DECLARE @CATEGORY NVARCHAR(128) = N'Sport';

    DECLARE @TYPE NVARCHAR(128) = N'Airplane';

    DECLARE @NAME NVARCHAR(128) = N'Sopwith Aviation Company';

    SET STATISTICS IO ON;

    SET STATISTICS TIME ON;

    /* */

    SELECT

    X.SAMPLE_CATEGORY

    ,X.SAMPLE_TYPE

    ,X.SAMPLE_NAME

    ,X.SAMPLE_COLOUR

    ,X.SAMPLE_SUBTYPE

    ,X.COLOURSTYPE_COUNT

    FROM

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ,CS.SAMPLE_NAME

    ,CS.SAMPLE_COLOUR

    ,CS.SAMPLE_SUBTYPE

    ORDER BY

    (SELECT NULL)

    ) AS MAIN_RID

    ,CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ,CS.SAMPLE_NAME

    ,CS.SAMPLE_COLOUR

    ,CS.SAMPLE_SUBTYPE

    ,COUNT(CS.SAMPLE_SUBTYPE) OVER

    (

    PARTITION BY

    CS.SAMPLE_CATEGORY

    ,CS.SAMPLE_TYPE

    ,CS.SAMPLE_NAME

    ,CS.SAMPLE_COLOUR

    ,CS.SAMPLE_SUBTYPE

    ORDER BY

    (SELECT NULL)

    ) AS COLOURSTYPE_COUNT

    FROM dbo.TBL_COUNT_SAMPLE CS

    WHERE CS.SAMPLE_CATEGORY = @CATEGORY

    AND CS.SAMPLE_TYPE = @TYPE

    AND CS.SAMPLE_NAME = @NAME

    ) AS X WHERE X.MAIN_RID = 1;

  • isuckatsql (5/10/2014)


    I ran your sample query on my data and it took 284 ms vs my 0 ms using a SQL Indexed View.

    Then you don't have to worry:w00t:

    😎

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply