SQLServerCentral Article

Consuming JSON Formatted API Data in 2016

,

There are new OPENJSON and FOR JSON commands in SQL Server 2016. Let's see if we can use OPENJSON to grab publically available data from the thousands of JSON formatted API's across the internet.  Let's create the sample Blockchain database with the btc.Blocks table.

USE [master]
GO
CREATE DATABASE [Blockchain]
GO
ALTER DATABASE [Blockchain] SET COMPATIBILITY_LEVEL = 130
GO
EXEC sys.sp_db_vardecimal_storage_format N'Blockchain', N'ON'
GO
USE [Blockchain]
GO
CREATE SCHEMA [btc]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [btc].[Blocks](
[BlockID] [int] NOT NULL,
[BlockVersion] [int] NULL,
[BlockHash] [nvarchar](32) NULL,
[PreviousBlockHash] [nvarchar](32) NULL,
[BlockTimestamp] [datetime] NULL,
PRIMARY KEY CLUSTERED 
(
[BlockID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
USE [master]
GO
ALTER DATABASE [Blockchain] SET  READ_WRITE 
GO

In order to reach the API, we need to enable Ole Automation to open up the sp_OA system stored procedures

From there we will create two functions.  The first function will allow you to enter the block height of BlockChain.info's API.  To see the data in your browser, you can click on this link: https://blockchain.info/block-height/0?format=json.  The block height that will pull is block 0, Satoshi's genesis block.  You can change the 0 in the URL to pull any block up to the current block height (a little over 400,000 blocks).

Unfortunately BlockChain.info stores the original UNIX timestamp in the blockXXX.dat data files from the chain.  In order to convert this to SQL Server datetime, the script below will add and a second function (btc.fn_UNIXConvertToDateTime) for the conversion.

-- Enable Ole Automation
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'Ole Automation Procedures', 1;
GO
RECONFIGURE;
GO
sp_configure 'show advanced options', 0;
GO
RECONFIGURE;
GO
-- This function will use the CLR system stored procedures to make a request to the API url and bring data back as an NVARCHAR(4000) variable
CREATE FUNCTION [btc].[fn_GetBlockFromAPI]
(@BlockHeight INT)
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @urlVARCHAR(255);
    DECLARE @objINT;
    DECLARE @hrINT;
    DECLARE @msgVARCHAR(8000);
    DECLARE @jsonNVARCHAR(4000);--Cannot use MAX with CLR stored procedures
    SET @url = 'https://blockchain.info/block-height/' + CAST(@BlockHeight AS VARCHAR(255)) + '?format=json';
    EXEC @hr = sp_OACreate 'WinHttp.WinHttpRequest.5.1', @obj OUT  -- Creates an instance of an OLE object
    IF @hr <> 0 BEGIN SET @Msg = 'sp_OACreate WinHttp.WinHttpRequest.5.1 failed. Cannot connect to Blockchain.info' GOTO Error END
    EXEC @hr = sp_OAMethod @obj, 'Open', NULL, 'GET', @Url, false  -- Calls a method of an OLE object.
    IF @hr <> 0 BEGIN SET @msg = 'sp_OAMethod Open failed. Cannot connect to Blockchain.info' GOTO Error END
    EXEC @hr = sp_OAMethod @obj, 'SetRequestHeader', NULL, 'Content-Type', 'multipart/form-data'
    IF @hr <> 0 BEGIN SET @msg = 'sp_OAMethod SetRequestHeader failed. Cannot connect to Blockchain.info' GOTO Error END
    EXEC @hr = sp_OAMethod @obj, SEND, NULL, ''
    IF @hr <> 0 BEGIN SET @msg = 'sp_OAMethod Send failed. Cannot connect to Blockchain.info' GOTO Error END
    EXEC @hr = sp_OAGetProperty @Obj, 'ResponseText', @json OUTPUT
    IF @hr <> 0 EXEC sp_OAGetErrorInfo @Obj
    EXEC @hr = sp_OADestroy @obj
    RETURN @json
    Error:
    EXEC @hr = sp_OADestroy @obj
    RETURN @msg
END
GO
-- API is holding UNIX time, so we will need to convert to SQL Server datetime format with this function
CREATE FUNCTION [btc].[fn_UNIXConvertToDateTime] (@DateTime BIGINT)
RETURNS DATETIME
AS
BEGIN
    DECLARE @LocalTimeOffset BIGINT
           ,@AdjustedLocalDatetime BIGINT;
    SET @LocalTimeOffset = DATEDIFF(second,GETDATE(),GETUTCDATE())
    SET @AdjustedLocalDatetime = @Datetime - @LocalTimeOffset
    RETURN (SELECT DATEADD(SECOND,@AdjustedLocalDatetime, CAST('1970-01-01 00:00:00' AS DATETIME)))
END;

Now lets call the btc.fn_GetBlockFromAPI function to grab the JSON data and insert into the @Block temp table.  From there we will insert the @Block data into the btc.Blocks table IF (SELECT MainChain FROM @BlcokTemp) = 1.  These are main chain blocks (not orphaned) and we will also convert the nasty UNIX timestamp.

DECLARE @BlockHeight INT;
SET @BlockHeight = 0-- Height of block on the chain you want data for
DECLARE @Block VARCHAR(MAX);
SET @Block = (SELECT [btc].[fn_GetBlockFromAPI](@BlockHeight));
DECLARE @BlockTemp TABLE(
[BlockID]INTNOT NULL,
[BlockVersion]INTNOT NULL,
[BlockHash]VARCHAR(32)NOT NULL,
[PreviousBlockHash] VARCHAR(32)NOT NULL,
[BlockTimestamp]VARCHAR(10)NOT NULL,
[MainChain]BITNOT NULL
);
INSERT INTO @BlockTemp ([BlockID], [BlockVersion], [BlockHash], [PreviousBlockHash], [BlockTimestamp], [MainChain]) 
SELECT * 
FROM OPENJSON (@Block, '$.blocks')
WITH ( 
        BlockIDINT'$.height',
BlockVersionINT'$.ver',
        BlockHashVARCHAR(32)'$.hash',
PreviousBlockHashVARCHAR(32)'$.prev_block',
BlockTimestampVARCHAR(10)'$.time',
MainChainBIT'$.main_chain'
)
-- Now only INSERT the data from the temp table if MainChain = 1 (True) and also convert the UNIX timestamp
IF (SELECT MainChain FROM @BlockTemp) = 1
BEGIN
INSERT INTO btc.Blocks ([BlockID], [BlockVersion], [BlockHash], [PreviousBlockHash], [BlockTimestamp])
SELECT [BlockID], [BlockVersion], [BlockHash], [PreviousBlockHash], btc.fn_UNIXConvertToDateTime([BlockTimestamp])
FROM @BlockTemp
END

Result set from the INSERT statement:

Result Set

Of course you can add a loop to pull rows sequentially, but some API providers are sensitive about their load.  If the API provider is hit too hard they may ban your IP, unless you space out your API calls or sign up for and API Code to bypass limits.

Disadvantages:

  • One major restriction to be aware of is the 4000 character limit of sp_OACreate.  If you need to pull JSON data from an API that may have over 4000 characters, the result set will return NULL.  However, this process can be highly effective for small amounts of data you need to consume frequently if you know the JSON data source will never exceed 4000 characters.
  • Must turn on Ole Automation, be sure to check your security needs and settings. 

Advantages:

  • Bypasses Python, Java, PHP, Ruby, etc libraries to reach the JSON target (and the devs that go with them)
  • Can be implemented and managed by the DBA or SQL Developer natively within SQL Server 2016

Now that you know how to pull data from an API natively, you can search for publicly available data from the world largest API directory.  This tool should be great for data warehousing or reporting needs to compare public data with your private date to see how they benchmark.

Rate

5 (11)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (11)

You rated this post out of 5. Change rating