SQLServerCentral Article

Using Bitwise Operators to Boost Performance

,

Overview

Bitwise operators can be challenging to manage at first. However, with practice and patience, and under the right conditions, these operators can provide remarkable performance improvements in production environments. This article will compare two methods of accomplishing the same output, one with a normalized model and the other with bitwise operators. In essence, bitwise operators allow you to denormalize a portion of your database (whether you actually denormalize or not is not important).

Generally speaking, bitwise operators can provide great benefits in OLTP databases that either:

  • Suffer from too much normalization (yes, it’s possible…)
  • Run a lot of lengthy reports during business hours
  • Suffer from lots of database locks on a highly used table (with few records)
  • Need to have access to external data (in relatively small amounts)
  • Need to speed up triggers

The usual issue with bitwise operators is that they only operate with the type int (actually, they work with the bigint type as well), limiting their usefulness for large referenced tables (usually, bitwise operators can be used when the referenced table has 63 records or less). When dealing with a greater number of records, different options may be available, but potentially at an additional performance cost. These techniques include:

  • Using multiple fields to stores the referenced values
  • Using prime numbers stored as real numbers instead of binary values (this gives up to 130 records to play with instead of 63). This method requires using a formula rather than the bitwise operators
  • Using custom-developed algorithms (parsers) as either functions or extended stored procedures and store the values as string and not int

For the remainder of this article, we will use an integer field to store our binary values. The usefulness of this method is still high considering the many applications that can leverage this technique and the performance improvements that can result.

Introduction

The example we will use is that of a Users and Errors table as such (the scripts to create the tables and data can be found at the end of this article):

Users Table

Field

Type

Constraints

UserId

Int

Primary Key, Clustered, Not Null

UserName

Varchar(50)

Not Null

UserErrorFlag

Int

Not Null, Default (0)

Errors Table

Field

Type

Constraints

ErrorId

Int

Primary Key, Clustered, Not Null

ErrorDescr

Varchar(50)

Not Null

ErrorFlag

Int

Not Null

The Users table links to an Errors table that contains the list of possible errors that can occur during on a registration page. The many-to-many relationship between the Users table and the Errors table is resolved through the UserErrors table (shown below). The purpose of adding the UserErrorFlag field in the Users table is to store a mask that can help us identify which Errors a user has without reading the UserErrors table. The Errors table contains a field (ErrorFlag) that represents a power of two (1, 2, 4, 8, 16…).


The Errors table contains the following records:


Errors Table

ErrorId         

ErrorDescr

ErrorFlag

Comments

Binary Value

1

Invalid Email

1

This is 2^0

0001

2

Invalid Name

2

2^1

0010

3

Invalid Phone Number

4

2^2

0100

4

Unknown Error

8

2^3

1000

To represent a user that has both ErrorId 1 and 3, the UserErrorFlag field should contain the value 5 (sum of the error flags 1 and 4 in the Errors table). We find 5 by using the following table:

Errors Table

ErrorId         

ErrorFlag

Binary Value

Selected Binary Value

1

1

0001

0001

2

2

0010

0000

3

4

0100

0100

4

8

1000

0000

TOTAL

0101

By selecting ErrorId 1 and 3, we need to add the binary values for the corresponding flags, which is 0101 (or 5).

Normalized Example

The Users table contains 10,000 users and our Errors table contains only four records. The purpose of our example is to return the list of users that contain a specific error number (for instance ErrorId=4). We are creating a stored procedure with an input parameter for the ErrorId we need to retrieve. The procedure (called SP_Example1) should return the list of user names (unique names) that have at least one error (any error) and should indicate by a ‘Yes’ if one of the errors is equal to 4. This field would then be used by a report to display a special icon.

EXEC SP_Example1  4

The core of this stored procedure is shown below.

SELECT

            Users.UserId,
            UserName,
            ShowIcon = ISNULL((SELECT 'Yes' FROM UserErrors B WHERE B.UserId = Users.UserId AND B.ErrorId = @iError), 'No')

FROM Users LEFT JOIN UserErrors ON Users.UserId = UserErrors.UserId

 This statement returns 5,000 rows. The following is a sample of the output of SP_Example1:

UserId

UserName

ShowIcon

1

User 1

Yes

3

User 3

No

5

User 5

Yes

7

User 7

Yes

9

User 9

No

11

User 11

Yes

 

 

What is interesting to see is the query plan that SQL Server uses. As you can see, it requires a join, a Table Scan and an Index Seek on the UserErrors table. The key is to note the complexity of the query plan compared to the one using a bitwise operator (more on that later).

Our first test will be to run this procedure 100 times to enhance our visibility on performance differences. To do this, we call the following procedure:

EXEC Test1

Bitwise Example

Similarly, we create a second stored procedure (SP_Example2) that returns the same records, this time using the mask as a parameter. So instead of passing the value 4, we send 8 and execute the stored procedure as such:

EXEC SP_Example2  8

Here is the core of the stored procedure. This time, there is no need to join.

SELECT
            UserId,
            UserName,
            ShowIcon = CASE WHEN UserErrorFlag & @iFlag > 0 THEN 'Yes' ELSE 'No' End
FROM
            Users

As you can see, the bitwise operator used is & (AND). For example, to know if 5 contains 1 (0101 contains 0001), we simply execute:

SELECT CASE WHEN 5 & 1 > 0 THEN 'Yes' ELSE 'No' END         -- Result: Yes, since 0101 AND 0001 = 0001 = 1 > 0

However, 2 is not contained in 5 (0010 is not contained in 0101):

SELECT CASE WHEN 5 & 2 > 0 THEN 'Yes' ELSE 'No' END         -- Result: No, since 0101 AND 0010 = 0000 = 0

To use this stored procedure, we need to know the value of the mask to use rather than the ErrorId. This is usually not too hard since key values are stored on front-end applications. Instead of (or in addition to) storing the primary key, the application stores the mask. Since we no longer need to join on the UserErrors table to retrieve the ErrorId value, we now have a very simple T-SQL. The query plan reflects this:

Similarly, we now run this stored procedure 100 times by running the following command:

EXEC Test2

Results

The following table shows CPU cycles consumed by each test (Test1 and Test2). As you can see, the bitwise calculation clearly reduces the time it takes to obtain the records (CPU in green). The first plateau is obtained as a result of running Test1 and is about 30 seconds in length. In contrast, Test2 takes about 17 seconds to complete under the same conditions: a 40% improvement. In addition, the number of locks requested by SQL Server is also diminished drastically: from 180 down to 5 (Locks Requested by Second in Red). Using the (NOLOCK) hint would of course eliminate all locks requested by the stored procedures.

There is one more advantage to using bitwise operators: the capacity to return multiple conditions at once. For instance, if we wanted to return the same information but this time where the ErrorId is 1 and 4 we would execute the following query:

EXEC SP_Example2  9         -- Flags: 8 + 1

It turns out that this feature is extremely convenient when creating reports, since many reports require combining information. This is achieved at virtually no extra CPU cost.

The results shown above can vary greatly depending on the type of statement you are executing and the number of records in each table. The query selected for these examples reflects what a simple report would require. Also, imagine what this technique can do if you are using bitwise operators for more than one table at the same time (examples for such tables include Errors, State, Age Bracket, Access Rights and many more). You would probably need a binary field for each table, but the performance gains can be drastic.

Conclusion

Although a little complex to master, bitwise operators can serve a great purpose (and even fix some performance problems) when dealing with specific queries that can cause database slowdowns. Reports may be able to benefit greatly from this technique, especially if the reports run on an OLTP database. Using bitwise operators can give you a serious performance boost and can potentially reduce your database locks dramatically. However, implementing bitwise operators in your production environments may require great planning since front-end applications may also have to be changed to accommodate the additional mask values.

Scripts

/*Create an empty database first, and then run the following script. If you need to store more than 31 records in the reference table (Errors), use a BigInt type everywhere you store the binary values, and use the CAST function where appropriate.*/CREATE TABLE [Errors] (
[ErrorId] [int] NOT NULL ,
[ErrorDescr] [varchar] (50) NOT NULL ,
[ErrorFlag] [int] NOT NULL ,
CONSTRAINT [PK_Errors] PRIMARY KEY  CLUSTERED 
([ErrorId] )  ON [PRIMARY]) ON [PRIMARY]
GO
 
CREATE TABLE [Users] (
[UserId] [int] NOT NULL ,
[UserName] [varchar] (50) NOT NULL ,
[UserErrorFlag] [int] NOT NULL CONSTRAINT [DF_Users_UserErrorFlag] DEFAULT (0),
CONSTRAINT [PK_Users] PRIMARY KEY  NONCLUSTERED
([UserId])  ON [PRIMARY]) ON [PRIMARY]
GO
 
CREATE TABLE [UserErrors] (
[UserId] [int] NOT NULL FOREIGN KEY REFERENCES Users(UserId) ,
[ErrorId] [int] NOT NULL FOREIGN KEY REFERENCES Errors(ErrorId) ,
CONSTRAINT [PK_UserErrors] PRIMARY KEY  CLUSTERED
([UserId],[ErrorId] )  ON [PRIMARY] 
) ON [PRIMARY]
GO
 -- Create some Errors
INSERT INTO Errors VALUES (1, 'Invalid Email',1)
INSERT INTO Errors VALUES (2, 'Invalid Name',2)
INSERT INTO Errors VALUES (3, 'Invalid Phone Number',4)
INSERT INTO Errors VALUES (4, 'Internal Error',8)
 GO
-- Create user names and additional records
Declare @i int
Declare @Name varchar(10)
Declare @iErr int

Set @i = 10000 
WHILE (@i>0)
BEGIN
SET @Name = 'User ' + CAST(@i AS varchar(5))
Set @iErr = 15 -- This is the flag for all errors
If @i / 3. = @i / 3 Set @iErr = 5 -- This sets the flag for errorId 4 and 1
-- If @i is a multiple of 2, do not add an error
If @i / 2. = @i / 2 Set @iErr = 0  
INSERT INTO Users VALUES (@i, @Name, @iErr)
 If @iErr > 0
INSERT INTO UserErrors
                                SELECT @i, ErrorId
                                FROM Errors 
                                WHERE @iErr & ErrorFlag > 0
 
SET @i = @i - 1
END
GO
 
SET QUOTED_IDENTIFIER ON 
GO
SET ANSI_NULLS ON 
GO

-- SELECT UserId, 'Yes' FROM UserErrors B WHERE B.ErrorId = 4

CREATE PROCEDURE SP_Example1(@iError int)
AS
SELECTUsers.UserId, UserName, 
ShowIcon = ISNULL((SELECT 'Yes' FROM UserErrors B WHERE B.UserId = Users.UserId AND B.ErrorId = @iError), 'No')
FROMUsers LEFT JOIN UserErrors ON Users.UserId = UserErrors.UserId WHERE
UserErrors.UserId IS NOT NULL 
GROUP BYUsers.UserId, Users.UserName
ORDER BY Users.UserId

GO
SET QUOTED_IDENTIFIER OFF 
GO
SET ANSI_NULLS ON 
GO

SET QUOTED_IDENTIFIER ON 
GO
SET ANSI_NULLS ON 
GO

CREATE PROCEDURE SP_Example2(@iFlag int)
AS
SELECTUserId, UserName,
ShowIcon = CASE WHEN UserErrorFlag & @iFlag > 0 THEN 'Yes' ELSE 'No' End 
FROMUsersWHERE UserErrorFlag > 0
 
GO
SET QUOTED_IDENTIFIER OFF 
GO
SET ANSI_NULLS ON 
GO
 

CREATE PROCEDURE Test1 AS
Declare @i int
Set @i = 100

WHILE @i>0
BEGIN
EXEC SP_Example1 4
Set @i = @i-1
END
GO

CREATE PROCEDURE Test2 AS
Declare @i int
Set @i = 100
WHILE @i>0
BEGIN
EXEC SP_Example2 8
Set @i = @i-1
END
GO

Rate

5 (3)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (3)

You rated this post out of 5. Change rating