Querying Temporal Tables

,

Temporal tables a.k.a Table Versioning was introduced in SQL Server 2016 and is an easy, convenient way to track changes to data. A good introduction to temporal tables can be found here. One of the key advantages of versioning tables is easy of querying – or getting a ‘single pane of glass’ look of how data in table looked at any given point in time. But it is important though to understand how to query this data, so that you get what you need in this regard.

There are 5 clauses we can use to query versioned data. More details on how to use them can be found here. These are mostly easy to use, with some nuances attached. It is important to know those to put them to optimal use.

  • AS OF <date_time>
  • FROM <start_date_time> TO <end_date_time>
  • BETWEEN <start_date_time> AND <end_date_time>
  • CONTAINED IN (<start_date_time> , <end_date_time>)
  • ALL

To get started am going to set up a table for versioning – similar to what I did in this article.

CREATE TABLE [dbo].[Region]
(RegionID INT IDENTITY(1,1) NOT NULL
     CONSTRAINT PK_Region PRIMARY KEY CLUSTERED,
 RegionDescription VARCHAR(100) NULL,
 StartDateTime datetime2 generated always as row start NOT NULL,
 EndDateTime datetime2 generated always as row end NOT NULL,
 PERIOD FOR SYSTEM_TIME (StartDateTime, EndDateTime))
 WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Region_History));
INSERT INTO [dbo].[Region]
(RegionDescription)
values
('Northeast')
,('Southwest')
,('West')
,('Southeast')
,('Midwest');

Below script creates some history in the table that we can use to be queried upon and understand how querying works. This script may take up to 10 minutes to run as it is creating history records with multiple timestamps.

DECLARE @counter INT
SELECT @COUNTER = 100
WHILE @counter > 0
 BEGIN
  UPDATE [dbo].[Region]
   SET RegionDescription = 'NorthEast'
  WHERE RegionDescription = 'Northeast'
  WAITFOR DELAY '00:00:01'
  UPDATE [dbo].[Region]
    SET RegionDescription = 'Southwest '
   WHERE RegionDescription = 'Southwest'
  WAITFOR DELAY '00:00:01'
  UPDATE [dbo].[Region]
    SET RegionDescription = 'Southeast '
  WHERE RegionDescription = 'Southeast'
  WAITFOR DELAY '00:00:01'
  UPDATE [dbo].[Region]
   SET RegionDescription = 'Midwest '
  WHERE RegionDescription = 'Midwest'
  WAITFOR DELAY '00:00:01'
  UPDATE [dbo].[Region]
   SET RegionDescription = 'MidWest'
  WHERE RegionDescription = 'Midwest '
  WAITFOR DELAY '00:00:01'
  UPDATE [dbo].[Region]
   SET RegionDescription = 'SouthWest'
  WHERE RegionDescription = 'Southwest '
  WAITFOR DELAY '00:00:01'
  UPDATE [dbo].[Region]
   SET RegionDescription = 'SouthEast'
  WHERE RegionDescription = 'Southeast '
  SELECT @counter = @counter - 1
 END

Querying Temporal tables using AS OF

This clause is supposed to show how the table looks as of a certain date and time. So, I want to see how the table looked at 2019-04-21 18:00:00, before I did any changes.

SELECT *
 FROM [dbo].[Region]  
 FOR SYSTEM_TIME AS OF '2019-04-21 18:00:00.000'

There are no rows returned. Let us look at execution plan for this query and see what exactly it is doing under the hood.

If we right click and get the predicate for first table – it looks like this

[WideWorldImporters].[dbo].[Region].[StartDateTime]<='2019-04-21 18:00:00.0000000' AND [WideWorldImporters].[dbo].[Region].[EndDateTime]>'2019-04-21 18:00:00.0000000'

The history table’s predicate looks like below

[WideWorldImporters].[dbo].[Region_History].[StartDateTime]<='2019-04-21 18:00:00.0000000' AND [WideWorldImporters].[dbo].[Region_History].[EndDateTime]>'2019-04-21 18:00:00.0000000'

Looking at predicate in the concatenation operator shows us below

[Union1005] = ([WideWorldImporters].[dbo].[Region].RegionID, [WideWorldImporters].[dbo].[Region_History].RegionID), [Union1006] = ([WideWorldImporters].[dbo].[Region].RegionDescription, [WideWorldImporters].[dbo].[Region_History].RegionDescription), [Union1007] = ([WideWorldImporters].[dbo].[Region].StartDateTime, [WideWorldImporters].[dbo].[Region_History].StartDateTime), [Union1008] = ([WideWorldImporters].[dbo].[Region].EndDateTime, [WideWorldImporters].[dbo].[Region_History].EndDateTime)

What SQL Server is doing is combining data with start time less than or equal to passed date from main table with with history table for start date less than datetime passed for this key.

There is only one record per primary key returned, the one that is latest before the said datetime.

Records deleted before the datetime passed and after versioning started will be included, because we are asking for how the table looked at a given point in time, and that includes deleted data.

It is really important, knowing this, to pass in the right dates for queries. To do this get minimum and maximum date for versioning, as below.

SELECT min(startdatetime), max(enddatetime)
 from Region_history

Querying with any date that is equal or more than the start date gets data.

SELECT *
 FROM [dbo].[Region]  
 FOR SYSTEM_TIME AS OF '2019-06-13 22:50:08.5109251'

SQL Server does not care if you supply any date that is greater than the end date, it presumes the data is what it is now (when you query) and returns the same data. In this case the date is some random future date.

SELECT *
 FROM [dbo].[Region]  
 FOR SYSTEM_TIME AS OF '2019-06-16'

It is possible to run same query on a view. If all tables in the view are system versioned, SQL Server applies the same logic on every table involved – combining data with start time less than or equal to passed date from main table with history table for start date less than datetime passed for this key. It is important therefore that all tables involved should have been versioned around the same time for this to work.

Let us consider this view that is part of WideWorldImporters database.

CREATE VIEW [Website].[Customers]
AS
SELECT s.CustomerID,
       s.CustomerName,
       sc.CustomerCategoryName,
       pp.FullName AS PrimaryContact,
       ap.FullName AS AlternateContact,
       s.PhoneNumber,
       s.FaxNumber,
       bg.BuyingGroupName,
       s.WebsiteURL,
       dm.DeliveryMethodName AS DeliveryMethod,
       c.CityName AS CityName,
       s.DeliveryLocation AS DeliveryLocation,
       s.DeliveryRun,
       s.RunPosition
FROM Sales.Customers AS s
 LEFT OUTER JOIN Sales.CustomerCategories AS sc
  ON s.CustomerCategoryID = sc.CustomerCategoryID
 LEFT OUTER JOIN [Application].People AS pp
  ON s.PrimaryContactPersonID = pp.PersonID
 LEFT OUTER JOIN [Application].People AS ap
  ON s.AlternateContactPersonID = ap.PersonID
 LEFT OUTER JOIN Sales.BuyingGroups AS bg
  ON s.BuyingGroupID = bg.BuyingGroupID
 LEFT OUTER JOIN [Application].DeliveryMethods AS dm
  ON s.DeliveryMethodID = dm.DeliveryMethodID
 LEFT OUTER JOIN [Application].Cities AS c
  ON s.DeliveryCityID = c.CityID
GO

All the tables involved here happen to be versioned.

SELECT MIN(validfrom) AS customer_archive_version_date
 FROM sales.customers_archive
SELECT MIN(validfrom) AS customercategories_version_date
 FROM sales.customercategories_archive
SELECT MIN(validfrom) AS people_archive_version_date
 FROM application.people_archive
SELECT MIN(validfrom) AS buyinggroups_archive_version_date
 FROM sales.buyinggroups_archive
SELECT MIN(validfrom) AS deliverymethods_archive_version_date
 FROM application.deliverymethods_archive
SELECT MIN(validfrom) AS cities_archive_version_date
 FROM application.cities_archive

We get this

Other than buying groups there is a minimum date for all the rest. This doesn’t mean buying groups is not versioned, it only means there were no changes to it after it was versioned.

So when we try to query this view like below

SELECT *
 FROM [Website].[Customers] 
 FOR SYSTEM_TIME AS OF '2014-01-01 16:00:00.0000000'

626 rows are returned.

The execution plan and predicate show as below

[WideWorldImporters].tablename.[ValidFrom] as [dm].[ValidFrom]<= '2014-01-01 16:00:00.0000000'AND [WideWorldImporters].tablename.[ValidFrom] as [dm].[ValidTo]> '2014-01-01 16:00:00.0000000'

This is applied to every table in the query.

If a table not versioned is part of the view, there is no such predicate applied. It is treated as a regular join.

Querying Temporal Tables Using FROM <> TO clause

Below is an example of a query that uses a date range instead of 'as of' a specific date.

SELECT *
 FROM [dbo].[Region]
 FOR SYSTEM_TIME FROM '2019-06-29 21:20:01.0643764' TO '2019-06-29 22:00'
 ORDER BY regionid

The results here, particularly the end date on the first row - is a bit puzzling. It is not really about what was changed between the two dates.

Looking at the predicate on the table Region from the execution plan -

[WideWorldImporters].[dbo].[Region].[StartDateTime]<'2019-06-29 22:00:00.0000000' AND [WideWorldImporters].[dbo].[Region].[EndDateTime]>'2019-06-29 21:20:01.0643764' AND [WideWorldImporters].[dbo].[Region].[StartDateTime]<>[WideWorldImporters].[dbo].[Region].[EndDateTime]

This means rows that had an end date that was after to date(and still current) and start date that was before from date.  It is about data that was changed after start date and is valid beyond the end date.

Querying Temporal tables Using the BETWEEN Clause

SELECT *
 FROM dbo.Region
 FOR SYSTEM_TIME BETWEEN  '2019-06-13' AND '2019-06-14'

This Returns 705 rows.

Upon looking at execution plan –

The predicates on each of the table scans are as below.

[ReportData].[dbo].[Region].[StartDateTime]<='2019-06-14 00:00:00.0000000' AND [ReportData].[dbo].[Region].[EndDateTime]>'2019-06-13 00:00:00.0000000' AND [ReportData].[dbo].[Region].[StartDateTime]<>[ReportData].[dbo].[Region].[EndDateTime]

[ReportData].[dbo].[Region_History].[StartDateTime]<='2019-06-14 00:00:00.0000000' AND [ReportData].[dbo].[Region_History].[EndDateTime]>'2019-06-13 00:00:00.0000000' AND [ReportData].[dbo].[Region_History].[StartDateTime]<>[ReportData].[dbo].[Region_History].[EndDateTime]

The between clause, therefore, returns data that was changed after or equal to start date and is valid beyond the end date.

Querying Temporal tables using IN clause

SELECT *
 FROM dbo.Region
 FOR SYSTEM_TIME CONTAINED IN ( '2019-06-13' , '2019-06-14')

Looking at execution plan -

The predicates for the two table scans are as below –

[ReportData].[dbo].[Region].[StartDateTime]>='2019-06-13 00:00:00.0000000' AND [ReportData].[dbo].[Region].[EndDateTime]<='2019-06-14 00:00:00.0000000' AND [ReportData].[dbo].[Region].[StartDateTime]<>[ReportData].[dbo].[Region].[EndDateTime]

[ReportData].[dbo].[Region_History].[StartDateTime]>='2019-06-13 00:00:00.0000000' AND [ReportData].[dbo].[Region_History].[EndDateTime]<='2019-06-14 00:00:00.0000000' AND [ReportData].[dbo].[Region_History].[StartDateTime]<>[ReportData].[dbo].[Region_History].[EndDateTime].

Data pulled by IN clause is what was changed after or equal to start date and is valid beyond  or equal to the end date.

Querying Temporal tables using ALL clause

SELECT *
 FROM dbo.Region
 FOR SYSTEM_TIME ALL

This returns rows with all changes on all data from date it was versioned.

The predicates on two tables show as below

[ReportData].[dbo].[Region].[StartDateTime]<>[ReportData].[dbo].[Region].[EndDateTime]

[ReportData].[dbo].[Region_History].[StartDateTime]<>[ReportData].[dbo].[Region_History].[EndDateTime]

Both tables are being scanned to pull all this data. This can be an expensive query. It may be a better idea to query the history table alone for changes, if all changes are needed and current data is not necessary.

The history table that stores versioned data can grow really large over time – since there is no filter to what it records.

Summary

It is important to understand how versioning works to put it to optimum use. Understanding how the date filter gets applied for each clause, which table is versioned and which isn't when using a view and when it is more appropriate to just query on history table instead of using any temporal table related clauses are some of the things we looked at and will come in handy in this regard.

Rate

5 (2)

Share

Share

Rate

5 (2)