SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Azure Data Lake Store Gen2

Big news!  The next generation of Azure Data Lake Store (ADLS) has arrived.  See the official announcement.

In short, ADLS Gen2 is the combination of the current ADLS (now called Gen1) and Blob storage.  Gen2 is built on Blob storage.  By GA, ADLS Gen2 will have all the features of both, which means it will have features such as limitless storage capacity, support all Blob tiers (Hot, Cool, and Archive), the new lifecycle management feature, Azure Active Directory integration, hierarchical file system, and read-access geo-redundant storage.

A Gen2 capability is what is called “multi-modal” which means customers can use either Blob object store APIs or the new Gen2 file system APIs.  The key here is that both blob and file system semantics are now supported over the same data.

For existing customers of Gen1, once Gen2 is GA, no new features will be added to Gen1.  Customers can stay on Gen1 if they don’t need any new capabilities or can move to Gen2 where they can leverage all the goodness of the combined capabilities.  They can upgrade when they chose to do so.

Existing customers of Blob storage can continue to use Blob storage to save a bit of money (storage costs will be the same between Blob and Gen2 but transaction costs will be a bit higher for Gen2 due to the overhead of namespaces).  By GA, existing Blob storage accounts will just need to “enable Gen2” to get all the features of Gen2.  Before GA, they will need to copy their data from Blob storage to Gen2.

New customers should go with Gen2 unless the simplicity of an object store is all that is needed – for example,  storing images, storing backup data, website hosting, etc where the apps really don’t benefit from a file system namespace and the customer wants to save a bit of money on transaction costs.
?
Note that Blob storage and ADLS Gen1 will continue to exist and that Gen2 pricing will be roughly half of Gen1.

It was announced yesterday (June 27th) and be available for a limited public preview (customers will have to sign up).

Because ADLS Gen2 is part of blob storage, it is a “ring 0” service and will at GA be available in all regions.  The limited public preview program kicks off with two regions in the US with new regions added throughout the preview window.

For those using the current Blob SDK’s: Initially the SDK’s are different and some code changes will be required.  Microsoft is looking at whether they can reduce the need for code changes.  For customers using the WASB or ADLS driver, it will be as simple as switching to the new Gen2 driver and changing configs.

Check out the Azure Data Lake Storage Gen2 overview video for more info as well as A closer look at Azure Data Lake Storage Gen2 and finally check out the Gen2 documentation.

James Serra's Blog

James is a big data and data warehousing technology specialist at Microsoft. He is a thought leader in the use and application of Big Data technologies, including MPP solutions involving hybrid technologies of relational data, Hadoop, and private and public cloud. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 30 years of IT experience. James is a popular blogger (JamesSerra.com) and speaker, having presented at dozens of PASS events including the PASS Business Analytics conference and the PASS Summit. He is the author of the book “Reporting with Microsoft SQL Server 2012”. He received a Bachelor of Science degree in Computer Engineering from the University of Nevada-Las Vegas.

Comments

Leave a comment on the original post [www.jamesserra.com, opens in a new window]

Loading comments...