SQLServerCentral Article

Azure Stretch Database, Part I: Getting Going

,

Whenever you see Microsoft demonstrate any of its nifty new technologies, you would think they are so easy that a lightly-trained hamster could fire them all up in five minutes. It never turns out this way in real life. Microsoft purposefully builds demos to give the easiest path to a showy result. Then you get into the real world, and often find yourself plodding through “instructions” that appear as if Bletchley Park cryptologists wrote them as some sort of cruel joke. As you plod through roadblock after roadblock, you invent whole new classes of profanity. You desperately hope to devise a solution that works and is usable in production.

I just lived through this with Azure Stretch Database. It was actually not as bad as other “nifty new” Microsoft inventions, but please be aware of some things. Here’s a start to finish tour of Azure Table Stretching in two parts. In Part One, we deal with the essentials of setting up Stretch Database. Part Two deals with real-world challenges for the DBA to operate with Azure Stretch Database.

Forget the Wizard!

The Enable Database for Stretching wizard looks as if it holds out some real promise to make your life easier. However, the Wizard has some real drawbacks.

First, the Wizard doesn’t generate scripts. So, while it can accomplish the result, it cannot give you a script that would enable you to follow any cogent process for deploying it into an enterprise. It’s a black box. If you want to follow any kind of reasonable deployment process, you need scripts. These scripts could also show you how the process works and reduce the learning curve. How this could be omitted is quite beyond me. The wizard is essentially a toy for the junior DBA to experiment without actually learning how to implement the technology. Please use T-SQL as shown below. You won’t regret it.

Second, the Wizard will tell you (after a long delay) that table X is not suitable for stretching, and give a somewhat useful message that explains why. The usual reason is that the table uses a check or default constraint; both are not allowed in stretched tables. Another reason is that the table is the parent of (referenced by) a foreign key relationship. There are more constraints that I don’t list here but which you can find through Internet research. However, you can get all of the same data more rapidly from T-SQL calls.

Setting Up Stretching

There are 8 steps, each explained below.

Enable Remote Data Archive

You first need to enable stretching. This is a simple configuration option. Like this:

sp_configure 'remote data archive',1
go
reconfigure
go

The remote data archive configuration isn't even an advanced option, so you can simply run this if you have the permissions.

Create an Azure SQL Database Server

Now, create an Azure SQL Database to accomplish this; it is the easiest way to do so. This is best done in the Azure portal.

Create Azure SQL Database Create SQL Database in Azure.

Create or open a Master Key

You must create a master key (or open an existing one) to stretch the database. Why? This is needed to encrypt the credential that stretching uses to access the remote data in the Azure SQL Stretch database. Create or open a master key on each stretch-enabled database.

-- If you don't already have a master key then do this:
CREATE MASTER KEY ENCRYPTION BY PASSWORD= 'Nice-Long-complex-password-kept-safely-in-your-vault';
-- If you have an existing master key, then open it like this:
OPEN MASTER KEY DECRYPTION BY PASSWORD= 'Nice-Long-complex-password-kept-safely-in-your-vault';

Create a Database Scoped Credential

With the master key in place and open, you create the scoped credential. This holds the credential that will access the remote Azure SQL database server. Create a credential with the user name and password that you created when you set up the Azure SQL Database server.

-- In the same query window where you created or opened the master key:
CREATE DATABASE SCOPED CREDENTIAL StretchArticlePassword
WITH 
  IDENTITY = 'IDGivenInAzure'
  ,SECRET = 'Complex-secure-password-given-to-azure-and-stored-in-the-vault';
GO

Enable the Database for Stretching

Now you enable the database for stretching. This will create the Stretch database on the Azure SQL Database server, and will link the scoped credential to the stretching.

use [master]
GO
ALTER DATABASE [Stretchy]
    SET REMOTE_DATA_ARCHIVE = ON  
        (  
            SERVER = 'StretchLikeARubberBand.database.windows.net' ,  
            CREDENTIAL = StretchArticlePassword  
        ) ;  
GO

Determine the Stretching Criteria and Establish Predicate Functions

Next, you need to create stretch predicates. Without them, all rows of a stretched table will end up remote in Azure. Table Stretching uses filtering functions to determine which rows go to the cloud and which stay local. If the function returns a non-empty result set based on the passed-in column(s) value(s), then the row is sent to the cloud.

Great! We can test ORDER_DATE for GETDATE()-90 and send all of those rows to the cloud!  Hold up there Ricky Bobby!  The function can’t have a non-deterministic value in its where clause. In English, your threshold date must be hard coded like this:

CREATE FUNCTION dbo.fn_stretchDateTime20190610(@column1 datetime) 
RETURNS TABLE 
WITH SCHEMABINDING  
AS  
    RETURN  SELECT 1 AS is_eligible 
        WHERE @column1 < Convert(datetime,'06/10/2019',101)
;

How to implement a sliding threshold

A hard-coded threshold date. Good Grief!  That’s about as useful as a refrigerator in Antarctica. Now, to make it worse, the stretch predicate function is schema-bound and cannot be altered with a new date once attached to the stretch for one or more tables. So what to do?  Are we stuck or can we implement a sliding threshold to move rows to the cloud as they age?  We can. Achieve the needed effect like this:

Create a new function with the new threshold date.

BIG CAVEAT!  You can’t attach a new function to the stretch if it would bring rows back from the cloud. So you can’t go from archiving  rows that are 30 days old and older to 60 days old and older without un-stretching the table and re-stretching it (which incurs data egress charges).

Now going the other way, such as  60 to 30 days, is fine. You also cannot add or remove columns from a multi-column test. I can feel the tornado-force wind from everyone’s combined facepalms.

Attach the function to the table like this:

ALTER TABLE [StretchedTable] SET 
( 
REMOTE_DATA_ARCHIVE = ON 
        (  
                FILTER_PREDICATE = dbo.fn_stretchDateTime20190611(TestThisDate),  
                MIGRATION_STATE = OUTBOUND  
        ) 
);
?

Yes, this looks identical to the syntax to add a table to Stretch Database. It is.

Last, drop the old function.

Automating stretch predicates

You can also create a job that will generate the filter functions and then assign them. The script below does this:

declare @sSQL nvarchar(max) = ''
        ,@NewThreshold15 datetime = convert(date,dateadd(dd,-15,getdate()))
        ,@NewThreshold15AsInt int = -1
        ,@NewFnName nvarchar(255) = ''
select
        @NewThreshold15AsInt = cast(convert(nvarchar(8),@NewThreshold15,112) as int)
        ,@NewFnName = 'dbo.fnStretchDateTime' + convert(nvarchar(10),@NewThreshold15AsInt);
select @sSQL = 'CREATE FUNCTION ' + @NewFnName + '(@column1 datetime) 
RETURNS TABLE 
WITH SCHEMABINDING  
AS  
    RETURN  SELECT 1 AS is_eligible 
        WHERE @column1 < Convert(datetime,''' + convert(nvarchar(15),@NewThreshold15,101) + ''',101)
;'
print @sSQL;
Exec(@sSQL);
select @sSQL = 'ALTER TABLE [cStretch].[StretchyTester] SET
(
        REMOTE_DATA_ARCHIVE = ON
        ( 
                FILTER_PREDICATE = ' + @NewFnName + '(ThresholdDate), 
                MIGRATION_STATE = OUTBOUND 
        )
);'
print @sSQL;
Exec(@sSQL);

Schedule the job to run daily, weekly or whatever. With a little ingenuity, you can build a script that can read a list of stretched tables and replace the filter functions. It could even automatically drop those older filter functions no longer in use, and even build different types of filter functions for different thresholds.

Working Around the Constraints

There are also constraints that the Wizard does not tell you. The biggest one? You cannot UPDATE any row that has been stretched to Azure. The stretched data are absolutely read-only from the client side. There is a process that allows you to delete rows from the Remote Data Archive, using the REMOTE_ONLY hint. There are also workarounds for updating rows directly on the Azure SQL Database that holds the cold rows, but you must exercise care. In short, stretching is intended to be a painless archival solution for cold data.

Defaults

The easiest constraints to work around are defaults. You can easily do that with an INSTEAD OF INSERT trigger. Here is an example table:

CREATE TABLE StretchDemo1
(
        [ID] INT IDENTITY(1,1) NOT NULL
        ,[DefaultToZero] int not null default ((0))
        ,[SomeText] nvarchar(100) not null
        ,CONSTRAINT PL_StretchDemo1_ID PRIMARY KEY CLUSTERED (ID)
);

Drop the default from the table and replace it with a trigger for INSTEAD OF INSERT. It would look this:

ALTER TABLE dbo.StretchDemo1
        DROP CONSTRAINT DF__StretchDe__Defau__3C34F16F -- or whatever the default constraint name is
GO
CREATE TRIGGER tr_StretchDemo1_Defaults
ON dbo.StretchDemo1
INSTEAD OF INSERT
AS
BEGIN
        INSERT INTO [dbo].[StretchDemo1]
        (
                [DefaultToZero]
                ,[SomeText]
        )
        SELECT
                (case when [DefaultToZero] is not null then [DefaultToZero] else 0 end) as [DefaultToZero]
                ,[SomeText]
        from [inserted]
END;
GO

Then when you insert a row like this:

INSERT INTO [dbo].[StretchDemo1]
           ([DefaultToZero]
           ,[SomeText])
VALUES(NULL,'use The Trigger for defaults!');
GO
select * from [dbo].StretchDemo1;

…you will see that the default value(s) will have been applied. Now these constraints are no longer a problem for stretching the table. Check constraints can be similarly implemented. If your application will break if the constraint(s) are eliminated, then you have this option available to you to keep the plates spinning.

Foreign Keys: Why is this a problem?

Foreign Key constraints are much more difficult. I hate the idea of having unenforced FK constraints in a database. Take the potential situation of an Order Master table and an Order Line table. Microsoft lets us stretch the Order Line table, but if we want to stretch the Order Master table, we have to drop the explicit FK relationship. Again, we could emulate the constraint via triggers. This is an inconvenient and hard-to-maintain solution that will be a deal-breaker for many shops. Even worse is to remove the FK relationship and expect the inserting/updating/deleting application(s) to enforce the relationship. Consider this carefully. Microsoft needs to update Table Stretching to handle the relationship.

I understand why the restriction is present; it would require some enforcement of the FK relationship on both the local and cloud sides of the stretched tables. I can also understand that there is a concern about two scenarios:

  1. What to do if one or more of the child rows are remote but the parent row is local?
  2. The reverse - What to do if the Parent is remote but one or more child rows are local?

It seems to me that Microsoft could solve this issue by giving three choices. First would be to error out, second would be to move the parent and all children to remote, or finally to move parent and all children local. Another idea would be to set a stretch for parent that includes all children for every FK relationship as a single “stretch with cascade” predicate. This could and should have been thought out prior to releasing the Stretch Database feature. I expect that Microsoft will resolve this issue rapidly.

Coming up Next

In the conclusion, we'll deal with the things that a DBA will encounter in real life: Metrics, Altering Stretched Tables, Backup, Restore and lower-environment refreshes of stretched databases.

Rate

5 (2)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (2)

You rated this post out of 5. Change rating