SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

What is Azure Data Factory?

The Azure Data Factory is a service designed to allow developers to integrate disparate data sources.  It is a platform somewhat like SSIS in the cloud to manage the data you have both on-prem and in the cloud.

It provides access to on-premises data in SQL Server and cloud data in Azure Storage (Blob and Tables) and Azure SQL Database.  Access to on-premises data is provided through a data management gateway that connects to on-premises SQL Server databases.

It is not a drag-and-drop interface like SSIS.  Instead, data processing is enabled initially through Hive, Pig and custom C# activities.  Such activities can be used to clean data, mask data fields, and transform data in a wide variety of complex ways.

You will author your activities, combine them into a pipeline, set an execution schedule and you’re done.  Data Factory also provides an up-to-the moment monitoring dashboard, which means you can deploy your data pipelines and immediately begin to view them as part of your monitoring dashboard.

Within the Azure Preview Portal, you get a visual layout of all of your pipelines and data inputs and outputs.  You can see all the relationships and dependencies of your data pipelines across all of your sources so you always know where data is coming from and where it is going.  You get a historical accounting of job execution, data production status, and system health in a single monitoring dashboard.

Data Factory provides customers with a central place to manage their processing of web log analytics, click stream analysis, social sentiment, sensor data analysis, geo-location analysis, etc.

Microsoft views Data Factory as a key tool for customers who are looking to have a hybrid story with SQL Server or who currently use Azure HDInsight, Azure SQL Database, Azure Blobs, and Power BI for Office 365.

In short, developers can use Data Factory to transform semi-structured, unstructured and structured data from on-premises and cloud sources into trusted information.


More info:

Microsoft takes wraps off preview of its Azure Data Factory service

Data Factory Public Preview – build and manage information production pipelines

Video Azure Data Factory Overview

Introduction to Azure Data Factory Service

The Ins and Outs of Azure Data Factory – Orchestration and Management of Diverse Data

Azure Data Factory Update – New Data Stores

James Serra's Blog

James is a big data and data warehousing technology specialist at Microsoft. He is a thought leader in the use and application of Big Data technologies, including MPP solutions involving hybrid technologies of relational data, Hadoop, and private and public cloud. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 30 years of IT experience. James is a popular blogger (JamesSerra.com) and speaker, having presented at dozens of PASS events including the PASS Business Analytics conference and the PASS Summit. He is the author of the book “Reporting with Microsoft SQL Server 2012”. He received a Bachelor of Science degree in Computer Engineering from the University of Nevada-Las Vegas.


Leave a comment on the original post [www.jamesserra.com, opens in a new window]

Loading comments...