Azure Data Factory Essentials Training

Learn How to Build a Complete ETL Solution in Azure Data Factory & How to Integrate Pipelines with Azure Databricks

What you’ll learn

  • Introduction to Azure Data Factory. You will understand how it can be used to integrate many other technologies through an ever-growing list of connectors.
  • How to set up a Data Factory from scratch using the Azure Portal and PowerShell.
  • The activities and components that make up Data Factory, including Pipelines, Datasets, Triggers, Linked Services, and more.
  • How to ingest, transform, and integrate data code-free using Mapping Data Flows.
  • How to integrate Azure Data Factory and Databricks. We’ll cover how to authenticate and run a few notebooks from within ADF.
  • How to deploy Azure Data Factory using Azure DevOps for continuous integration and continuous deployment (CI/CD).
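
To give a feel for how the components listed above fit together, here is a minimal sketch in Python. It is not course material and not a deployable definition: the dictionaries only mirror the general shape of Data Factory's JSON resources, connection details are left out, and every resource name (`ls_s3`, `ds_sql_table`, `pl_copy_s3_to_sql`, and so on) as well as the helper `unresolved_references` is a hypothetical placeholder. The point is simply that a pipeline's activity refers to datasets by name, and each dataset refers to a linked service by name.

```python
import json

# Hypothetical names throughout; connection details (typeProperties) are omitted.
linked_services = {
    "ls_s3": {"properties": {"type": "AmazonS3"}},
    "ls_sql": {"properties": {"type": "AzureSqlDatabase"}},
}


def dataset(ds_type, linked_service):
    """Build an abridged dataset that points at a linked service by name."""
    return {
        "properties": {
            "type": ds_type,
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
        }
    }


datasets = {
    "ds_s3_parquet": dataset("Parquet", "ls_s3"),
    "ds_sql_table": dataset("AzureSqlTable", "ls_sql"),
}

# One pipeline with a single Copy activity: Parquet source, Azure SQL sink.
pipeline = {
    "name": "pl_copy_s3_to_sql",
    "properties": {
        "activities": [
            {
                "name": "CopyParquetToSql",
                "type": "Copy",
                "inputs": [{"referenceName": "ds_s3_parquet", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "ds_sql_table", "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "ParquetSource"},
                    "sink": {"type": "AzureSqlSink"},
                },
            }
        ]
    },
}


def unresolved_references(pipeline, datasets, linked_services):
    """Return names the pipeline depends on that are not defined anywhere."""
    missing = []
    for activity in pipeline["properties"]["activities"]:
        for ref in activity.get("inputs", []) + activity.get("outputs", []):
            name = ref["referenceName"]
            if name not in datasets:
                missing.append(name)
                continue
            ls = datasets[name]["properties"]["linkedServiceName"]["referenceName"]
            if ls not in linked_services:
                missing.append(ls)
    return missing


if __name__ == "__main__":
    print(json.dumps(pipeline, indent=2))
    print("unresolved:", unresolved_references(pipeline, datasets, linked_services))
```

Running the script prints the pipeline as JSON and then lists any dataset or linked service name that the pipeline refers to but that was never defined, which is a quick way to see the dependency chain between the three component types.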

Course Content

  • Modules Introduction –> 5 lectures • 4min.
  • Getting started –> 4 lectures • 19min.
  • Azure Data Factory Components –> 12 lectures • 55min.
  • Ingesting and Transforming Data –> 7 lectures • 49min.
  • Mapping Data Flows –> 15 lectures • 31min.
  • Integrating Azure Data Factory with Databricks –> 5 lectures • 29min.
  • Continuous Integration and Continuous Delivery (CI/CD) for Azure Data Factory –> 7 lectures • 28min.
  • Wrap-up –> 1 lecture • 1min.

Requirements

  • No previous experience with Data Factory is required; we will work our way through step by step.

TL;DR.

This course introduces Azure Data Factory and shows how it can help with the batch processing of data. Through hands-on activities, quizzes, and a project, students will learn how Data Factory can integrate many other technologies to build a complete ETL solution, including a CI/CD pipeline in Azure DevOps. The course also covers some of the Data Factory topics required for exam DP-203: Data Engineering on Microsoft Azure.

Learn by Doing

Together, you and I are going to learn everything you need to know about using Microsoft Azure Data Factory. This course will prepare you with hands-on learning activities, videos, and quizzes to help you gain knowledge and practical experience as we go along.

At the end of this course, students will have the opportunity to submit a project that will help them understand how ADF works, what its components are, and how to integrate ADF with Databricks.

Student key takeaways:

  • The student should understand how ADF orchestrates the features of other technologies to transform or analyze data.
  • The student should be able to explain and use the components that make up ADF.
  • The student should be able to integrate two or more technologies using ADF.
  • The student should be able to confidently create moderately complex, data-driven pipelines.
  • The student should be able to develop a CI/CD pipeline in Azure DevOps to deploy Data Factory pipelines.

Data Factory Essentials Training – Outline

  1. Introduction
  2. Modules introduction
    1. Getting Started
    2. Understand Azure Data Factory Components
    3. Ingesting and Transforming Data with Azure Data Factory
    4. Integrate Azure Data Factory with Databricks
    5. Continuous Integration and Continuous Delivery (CI/CD) for Azure Data Factory
  3. Getting started
    1. Sign up for your Azure free account
    2. Setting up a Budget
    3. How to set up Azure Data Factory
      1. Azure Portal
      2. PowerShell
  4. Azure Data Factory Components
    1. Linked Services
    2. Pipelines
    3. Datasets
    4. Data Factory Activities
    5. Parameters
      1. Pipeline Parameters
      2. Activity Parameters
      3. Global Parameters
    6. Triggers
    7. Integration Runtimes (IR)
      1. Azure IR
      2. Self-hosted IR
      3. Linked Self-Hosted IR
      4. Azure-SSIS IR
    8. Quiz
  5. Ingesting and Transforming Data
    1. Ingesting Data using Copy Activity into Data Lake Store Gen2
      1. How to Copy Parquet Files from AWS S3 to Azure SQL Database
        1. Creating ADF Linked Service for Azure SQL Database
        2. How to Grant Permissions on Azure SQL DB to Data Factory Managed Identity
        3. Ingesting Parquet File from S3 into Azure SQL Database
      2. Copy Parquet Files from AWS S3 into Data Lake and Azure SQL Database (intro)
        1. Copy Parquet Files from AWS S3 into Data Lake and Azure SQL Database
      3. Monitoring ADF Pipeline Execution
    2. Transforming data with Mapping Data Flow
      1. Mapping Data Flow Walk-through
      2. Identify transformations in Mapping Data Flow
        1. Multiple Inputs/Outputs
        2. Schema Modifier
        3. Formatters
        4. Row Modifier
        5. Destination
      3. Adding source to a Mapping Data Flow
        1. Defining Source Type; Dataset vs Inline
        2. Defining Source Options
        3. Spinning Up Data Flow Spark Cluster
        4. Defining Data Source Input Type
        5. Defining Data Schema
        6. Optimizing Loads with Partitions
        7. Data Preview from Source Transformation
      4. How to add a Sink to a Mapping Data Flow
      5. How to Execute a Mapping Data Flow
    3. Quiz
  6. Integrate Azure Data Factory with Databricks
    1. Project Walk-through
    2. How to Create Azure Databricks and Import Notebooks
    3. How to Transfer Data Using Databricks and Data Factory
    4. Validating Data Transfer in Databricks and Data Factory
    5. How to Use ADF to Orchestrate Data Transformation Using a Databricks Notebook
    6. Quiz
  7. Continuous Integration and Continuous Delivery (CI/CD) for Azure Data Factory
    1. How to Create an Azure DevOps Organization and Project
    2. How to Create a Git Repository in Azure DevOps
    3. How to Link Data Factory to Azure DevOps Repository
    4. How to Version Azure Data Factory with Branches
      1. Data Factory Release Workflow
      2. Merging Data Factory Code to Collaboration Branch
    5. How to Create a CI/CD pipeline for Data Factory in Azure DevOps
      1. How to Create a CI/CD pipeline for Data Factory in Azure DevOps
      2. How to Execute a Release Pipeline in Azure DevOps for ADF
    6. Quiz