TSQL - Aggregate Bitwise

  • Greetings all - I have a query I need to write and was wondering if anyone could give me a suggestion on how to accomplish this. I understand
    that it can be accomplished by writing a CLR aggregate function but given the circumstances, but I don't think I can implement this on the customers
    server. I could also most likely also use a CURSOR I guess but was hoping for a cleaner method.

    Given this Data:

    Date            BitValue
    -------------    --------
    '01/01/1017'    1
    '01/01/1017'    3
    '01/02/1017'    4
    '01/02/1017'    6
    '01/03/1017'    2
    '01/03/1017'    1

    I would like to use a bitwise logical OR and group by day, so that the results look like this: 

    01/01/17     3
    01/02/17     6
    01/03/17     3

    Note that I  don't want sums,  but true logical ORs where "1 | 1 = 1" and "1 | 3 = 3" and the bit mask in reality is so large it uses a BIGINT to hold the value.

    Any suggestions or help would be appreciated, thanks for you time.

    Todd

  • What is the full logic here. I was trying to fathom from your figures, and thought that you took the max value, which works for 01 and 02, but not for 03. How do you get these numbers? Why does 1 & 3 = 3, 4 & 6 = 6, but 2 & 1 = 3?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thom A - Friday, January 20, 2017 12:22 PM

    What is the full logic here. I was trying to fathom from your figures, and thought that you took the max value, which works for 01 and 02, but not for 03. How do you get these numbers? Why does 1 & 3 = 3, 4 & 6 = 6, but 2 & 1 = 3?

    We are not looking at the values in relation to size but which bits are set. The general rules for is as far as bits themselves go are:
    1 or 1 = 1
    0 or 0 = 0
    1 or 0 = 1

    Todd

  • Todd Young-359443 - Friday, January 20, 2017 12:28 PM

    We are not looking at the values in relation to size but which bits are set. The general rules for is as far as bits themselves go are:
    1 or 1 = 1
    0 or 0 = 0
    1 or 0 = 1

    Todd

    A case statement? For example:
    CREATE TABLE #BIT (ID INT IDENTITY(1,1),
           BitValue1 BIT,
           BitValue2 BIT);
    GO

    INSERT INTO #BIT
    VALUES (1,1),
           (1,0),
           (0,1),
           (0,0);
    GO

    SELECT *,
           CASE WHEN BitValue1 = 1 OR BitValue2 = 1 THEN 1 ELSE 0 END AS OrValue,
           CASE WHEN BitValue1 = 0 AND BitValue2 = 0 THEN 0
                WHEN BitValue1 = 1 AND BitValue2 = 1 THEN 0
                ELSE 1 END AS XorValue
    FROM #BIT;

    DROP TABLE #BIT;
    GO

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • I got this working with the Quirky update. To learn more about how to make this work read the following article: http://www.sqlservercentral.com/articles/T-SQL/68467/
    Be sure to understand how it works before implementing this solution. A mistake could create wrong data.

    --Start of Sample data
    CREATE TABLE #Sample(
      Date  date,
      BitValue  int
    );
    INSERT INTO #Sample
    VALUES
      ('20170101', 1),
      ('20170101', 3),
      ('20170201', 4),
      ('20170201', 6),
      ('20170301', 2),
      ('20170301', 1),
      ('20170301', 128);
    --End of Sample Data

    --Pass table to a clustered table
    SELECT *
    INTO #Aggregation
    FROM #Sample

    CREATE CLUSTERED INDEX CI_Aggregation ON #Aggregation(Date);

    --Quirky Update
    DECLARE @Date date, @BitValue int;

    UPDATE a WITH(TABLOCKX)
    SET @BitValue = BitValue = BitValue | CASE WHEN @Date = Date THEN @BitValue ELSE BitValue END,
      @Date = Date
    FROM #Aggregation a
    OPTION( MAXDOP 1);

    --Get results
    SELECT Date, MAX(BitValue) BitValue
    FROM #Aggregation
    GROUP BY Date;

    --Remove clustered table
    DROP TABLE #Aggregation

    GO
    --Clean sample data
    DROP TABLE #Sample

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • maybe you could use a version of the quirky update often used for running totals, using a temp table with an extra aggVal column?

    CREATE TABLE #bits (itemDate DATE, itemVal BIGINT, aggVal BIGINT);
    CREATE CLUSTERED INDEX ix_bits ON #bits(itemDate);

    INSERT INTO #bits
        (itemDate, itemVal, aggVal)
    VALUES
        ('01/01/2017',1,0),('01/01/2017',3,0),('01/02/2017',4,0),('01/02/2017',6,0),('01/03/2017',2,0),('01/03/2017',1,0);

    -- quirky update
    DECLARE @agg bigint = 0, @prev_date date = '1900-01-01';
    UPDATE #bits SET
        @agg = aggVal = CASE WHEN itemDate = @prev_date THEN @agg | itemVal ELSE itemVal END,
        @prev_date = itemDate
      OPTION (MAXDOP 1);

    SELECT itemDate, MAX(aggVal) FROM #bits
      GROUP BY itemDate
      ORDER BY itemDate;

    edit: d'oh!  Luis beat me too it!

  • Thank you Luis and Chris. I think this should work and I appreciate your assistance. This site is full of knowledgeable
    and helpful people !!

    If anyone is interested, I posted this article a few years ago on how to use a bit mask to grab any of week combination
    from a date column.

    http://www.sqlservercentral.com/scripts/Date+Manipulation/71363/

  • The problem with using the quirky update is that it's, well, quirky.  It's not guaranteed to work.  Here is a method that uses guaranteed methods and performs only slightly slower on this sample data.

    ;
    WITH CTETally AS
    (
        SELECT n
        FROM (VALUES(1), (1), (1), (1), (1), (1)) n(n)
    )
    , bits AS
    (
        SELECT TOP(31) POWER(CAST(2 AS BIGINT), ROW_NUMBER() OVER(ORDER BY @@VERSION) - 1) AS bitmask
        FROM CTETally A
        CROSS JOIN CTETally AS b
    )
    , bitmasks AS
    (
        SELECT s.[Date], s.BitValue & b.bitmask AS BitValues
        FROM #Sample s
        INNER JOIN bits b
            ON s.BitValue & b.bitmask = b.bitmask
        GROUP BY s.[Date], s.BitValue & b.bitmask
    )
    SELECT [Date], SUM(BitValues)
    FROM bitmasks
    GROUP BY [Date]
    ORDER BY [Date]

    I haven't tested this on a large dataset.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Todd Young-359443 - Friday, January 20, 2017 12:17 PM

    My suggestion is not to use SQL this way. We never intended for it to be replacement for assembly language; it is a high level data oriented language based on abstractions. Your immediately engaged in high-end versus low-end, one's complement twos complement, and other hardware considerations. You will have code that cannot be optimized, will not port and completely defeats the purpose of having a high level language.

    https://www.simple-talk.com/sql/t-sql-programming/bit-of-a-problem/

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • drew.allen - Monday, January 23, 2017 12:58 PM

    The problem with using the quirky update is that it's, well, quirky.  It's not guaranteed to work.  Here is a method that uses guaranteed methods and performs only slightly slower on this sample data.

    Drew

    That's a nice option. It might be faster if you reduce the number of rows from the bits CTE depending on the actual needs or simply keeping them as constants. I wonder if this would be simpler for the engine and that would make it faster.


    WITH bits AS
    (
      SELECT bitmask
      FROM (VALUES( 1),( 2),( 4),( 8),
          ( 16),( 32),( 64),(128)) bits(bitmask)
    )
    SELECT s.[Date], SUM( DISTINCT b.bitmask) AS BitValues
    FROM #Sample s
    INNER JOIN bits b ON s.BitValue & b.bitmask = b.bitmask
    GROUP BY s.[Date]
    ORDER BY s.Date;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares - Tuesday, January 24, 2017 9:14 AM

    drew.allen - Monday, January 23, 2017 12:58 PM

    The problem with using the quirky update is that it's, well, quirky.  It's not guaranteed to work.  Here is a method that uses guaranteed methods and performs only slightly slower on this sample data.

    Drew

    That's a nice option. It might be faster if you reduce the number of rows from the bits CTE depending on the actual needs or simply keeping them as constants. I wonder if this would be simpler for the engine and that would make it faster.


    WITH bits AS
    (
      SELECT bitmask
      FROM (VALUES( 1),( 2),( 4),( 8),
          ( 16),( 32),( 64),(128)) bits(bitmask)
    )
    SELECT s.[Date], SUM( DISTINCT b.bitmask) AS BitValues
    FROM #Sample s
    INNER JOIN bits b ON s.BitValue & b.bitmask = b.bitmask
    GROUP BY s.[Date]
    ORDER BY s.Date;

    Yes, I wasn't sure whether the OP was using INT or BIGINT.  The biggest cost, though, was the DISTINCT.  I also didn't realize that you could use DISTINCT with SUM, so that would probably save costs right there.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Luis Cazares - Tuesday, January 24, 2017 9:14 AM

    drew.allen - Monday, January 23, 2017 12:58 PM

    The problem with using the quirky update is that it's, well, quirky.  It's not guaranteed to work.  Here is a method that uses guaranteed methods and performs only slightly slower on this sample data.

    Drew

    That's a nice option. It might be faster if you reduce the number of rows from the bits CTE depending on the actual needs or simply keeping them as constants. I wonder if this would be simpler for the engine and that would make it faster.


    WITH bits AS
    (
      SELECT bitmask
      FROM (VALUES( 1),( 2),( 4),( 8),
          ( 16),( 32),( 64),(128)) bits(bitmask)
    )
    SELECT s.[Date], SUM( DISTINCT b.bitmask) AS BitValues
    FROM #Sample s
    INNER JOIN bits b ON s.BitValue & b.bitmask = b.bitmask
    GROUP BY s.[Date]
    ORDER BY s.Date;

    Nice

    Luis Cazares - Tuesday, January 24, 2017 9:14 AM

    drew.allen - Monday, January 23, 2017 12:58 PM

    The problem with using the quirky update is that it's, well, quirky.  It's not guaranteed to work.  Here is a method that uses guaranteed methods and performs only slightly slower on this sample data.

    Drew

    That's a nice option. It might be faster if you reduce the number of rows from the bits CTE depending on the actual needs or simply keeping them as constants. I wonder if this would be simpler for the engine and that would make it faster.


    WITH bits AS
    (
      SELECT bitmask
      FROM (VALUES( 1),( 2),( 4),( 8),
          ( 16),( 32),( 64),(128)) bits(bitmask)
    )
    SELECT s.[Date], SUM( DISTINCT b.bitmask) AS BitValues
    FROM #Sample s
    INNER JOIN bits b ON s.BitValue & b.bitmask = b.bitmask
    GROUP BY s.[Date]
    ORDER BY s.Date;

    Very nice solution Luis. Since I already have a table with the bit-masks, I was able to condense the entire thing
    into a single query with no CTE tables.

    Thanks !!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply