Tuesday 3 April 2018

Ingest, prepare, and transform using Azure Databricks and Data Factory

Today’s business managers depend heavily on reliable data integration systems that run complex ETL/ELT workflows (extract, transform, load and extract, load, transform). These workflows let businesses ingest data in various forms and shapes from different on-premises and cloud data sources, transform and shape that data, and gain actionable insights from it to make important business decisions.

With the general availability of Azure Databricks comes support for doing ETL/ELT with Azure Data Factory. This integration allows you to operationalize ETL/ELT workflows (including analytics workloads in Azure Databricks) using data factory pipelines that do the following:

1. Ingest data at scale using 70+ on-prem/cloud data sources

2. Prepare and transform (clean, sort, merge, join, etc.) the ingested data in Azure Databricks as a Notebook activity step in data factory pipelines

3. Monitor and manage your end-to-end (E2E) workflow

Take a look at a sample data factory pipeline that ingests data from Amazon S3 into Azure Blob storage, processes the ingested data using a Notebook running in Azure Databricks, and moves the processed data into Azure SQL Data Warehouse.
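As a rough illustration of that three-step flow, the sketch below builds a pipeline definition as a Python dictionary and prints it as JSON. The activity types `Copy` and `DatabricksNotebook` and the `dependsOn` structure follow the Data Factory v2 JSON format; every name here (pipeline, activities, linked service, notebook path) is hypothetical, the source and sink type names should be checked against the copy activity documentation, and dataset references are left out for brevity.

```python
import json


def activity(name, kind, type_properties, linked_service=None, depends_on=None):
    """Build one activity entry in the general shape of Data Factory pipeline JSON."""
    entry = {"name": name, "type": kind, "typeProperties": type_properties}
    if linked_service is not None:
        entry["linkedServiceName"] = {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        }
    if depends_on is not None:
        # Run this step only after the named step has succeeded.
        entry["dependsOn"] = [
            {"activity": depends_on, "dependencyConditions": ["Succeeded"]}
        ]
    return entry


# All names below are placeholders, not taken from the sample pipeline itself.
pipeline = {
    "name": "IngestPrepareTransform",
    "properties": {
        "activities": [
            # Step 1: ingest from Amazon S3 into Azure Blob storage.
            activity(
                "CopyS3ToBlob",
                "Copy",
                {"source": {"type": "FileSystemSource"}, "sink": {"type": "BlobSink"}},
            ),
            # Step 2: prepare and transform in an Azure Databricks notebook.
            activity(
                "TransformInDatabricks",
                "DatabricksNotebook",
                {"notebookPath": "/Shared/prepare-and-transform"},
                linked_service="AzureDatabricksLinkedService",
                depends_on="CopyS3ToBlob",
            ),
            # Step 3: load the processed data into Azure SQL Data Warehouse.
            activity(
                "CopyBlobToSqlDw",
                "Copy",
                {"source": {"type": "BlobSource"}, "sink": {"type": "SqlDWSink"}},
                depends_on="TransformInDatabricks",
            ),
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```

Chaining the steps through `dependsOn` is what makes the notebook run only after ingestion succeeds, and the final copy only after the transformation succeeds.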

You can parameterize the entire workflow (folder name, file name, etc.) using rich expression support and operationalize it by defining a trigger in data factory.
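To make the parameterization and trigger idea concrete, here is a minimal sketch, again as Python dictionaries mirroring Data Factory JSON. It assumes two hypothetical pipeline parameters, `folderName` and `fileName`, passes their combination to the notebook through `baseParameters` as an expression, and defines a daily schedule trigger that supplies values for them. The pipeline name, trigger name, start time, and parameter values are all illustrative.

```python
import json


def expression(value):
    """Wrap a string as a Data Factory expression object."""
    return {"value": value, "type": "Expression"}


# Hypothetical parameters declared on the pipeline.
pipeline_parameters = {
    "folderName": {"type": "String"},
    "fileName": {"type": "String"},
}

# Passed to the Notebook activity; evaluated by Data Factory at run time.
notebook_base_parameters = {
    "inputpath": expression(
        "@concat(pipeline().parameters.folderName, '/', "
        "pipeline().parameters.fileName)"
    )
}

# A schedule trigger that runs the pipeline once a day with fixed values.
trigger = {
    "name": "DailyIngestTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2018-04-04T00:00:00Z",
                "timeZone": "UTC",
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": "IngestPrepareTransform",
                },
                "parameters": {"folderName": "input", "fileName": "data.csv"},
            }
        ],
    },
}

print(json.dumps(trigger, indent=2))
```

Inside the notebook, a value passed this way would typically be read with `dbutils.widgets.get("inputpath")`, which is only available in the Databricks runtime and so is not shown in the runnable sketch.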

Get started today!


We are excited for you to try the Azure Databricks and Azure Data Factory integration. Let us know your feedback.

Get started by clicking the Author & Monitor tile in your provisioned v2 data factory blade.

Click on the Transform data with Azure Databricks tutorial to learn, step by step, how to operationalize your ETL/ELT workloads, including analytics workloads in Azure Databricks, using Azure Data Factory.
