DP-203T00 Data Engineering on Microsoft Azure (UT-Microsoft-DP-203T00)

Course Description

In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

Course Outline

1 - Introduction to data engineering on Azure

What is data engineering
Important data engineering concepts
Data engineering in Microsoft Azure

2 - Introduction to Azure Data Lake Storage Gen2

Understand Azure Data Lake Storage Gen2
Enable Azure Data Lake Storage Gen2 in Azure Storage
Compare Azure Data Lake Store to Azure Blob storage
Understand the stages for processing big data
Use Azure Data Lake Storage Gen2 in data analytics workloads

3 - Introduction to Azure Synapse Analytics

What is Azure Synapse Analytics
How Azure Synapse Analytics works
When to use Azure Synapse Analytics

4 - Use Azure Synapse serverless SQL pool to query files in a data lake

Understand Azure Synapse serverless SQL pool capabilities and use cases
Query files using a serverless SQL pool
Create external database objects

5 - Use Azure Synapse serverless SQL pools to transform data in a data lake

Transform data files with the CREATE EXTERNAL TABLE AS SELECT statement
Encapsulate data transformations in a stored procedure
Include a data transformation stored procedure in a pipeline

6 - Create a lake database in Azure Synapse Analytics

Understand lake database concepts
Explore database templates
Create a lake database
Use a lake database

7 - Analyze data with Apache Spark in Azure Synapse Analytics

Get to know Apache Spark
Use Spark in Azure Synapse Analytics
Analyze data with Spark
Visualize data with Spark

8 - Transform data with Spark in Azure Synapse Analytics

Modify and save dataframes
Partition data files
Transform data with SQL

9 - Use Delta Lake in Azure Synapse Analytics

Understand Delta Lake
Create Delta Lake tables
Create catalog tables
Use Delta Lake with streaming data
Use Delta Lake in a SQL pool

10 - Analyze data in a relational data warehouse

Design a data warehouse schema
Create data warehouse tables
Load data warehouse tables
Query a data warehouse

11 - Load data into a relational data warehouse

Load staging tables
Load dimension tables
Load time dimension tables
Load slowly changing dimensions
Load fact tables
Perform post load optimization

12 - Build a data pipeline in Azure Synapse Analytics

Understand pipelines in Azure Synapse Analytics
Create a pipeline in Azure Synapse Studio
Define data flows
Run a pipeline

13 - Use Spark Notebooks in an Azure Synapse Pipeline

Understand Synapse Notebooks and Pipelines
Use a Synapse notebook activity in a pipeline
Use parameters in a notebook

14 - Plan hybrid transactional and analytical processing using Azure Synapse Analytics

Understand hybrid transactional and analytical processing patterns
Describe Azure Synapse Link

15 - Implement Azure Synapse Link with Azure Cosmos DB

Enable Cosmos DB account to use Azure Synapse Link
Create an analytical store enabled container
Create a linked service for Cosmos DB
Query Cosmos DB data with Spark
Query Cosmos DB with Synapse SQL

16 - Implement Azure Synapse Link for SQL

What is Azure Synapse Link for SQL?
Configure Azure Synapse Link for Azure SQL Database
Configure Azure Synapse Link for SQL Server 2022

17 - Get started with Azure Stream Analytics

Understand data streams
Understand event processing
Understand window functions

18 - Ingest streaming data using Azure Stream Analytics and Azure Synapse Analytics

Stream ingestion scenarios
Configure inputs and outputs
Define a query to select, filter, and aggregate data
Run a job to ingest data

19 - Visualize real-time data with Azure Stream Analytics and Power BI

Use a Power BI output in Azure Stream Analytics
Create a query for real-time visualization
Create real-time data visualizations in Power BI

20 - Introduction to Microsoft Purview

What is Microsoft Purview?
How Microsoft Purview works
When to use Microsoft Purview

21 - Integrate Microsoft Purview and Azure Synapse Analytics

Catalog Azure Synapse Analytics data assets in Microsoft Purview
Connect Microsoft Purview to an Azure Synapse Analytics workspace
Search a Purview catalog in Synapse Studio
Track data lineage in pipelines

22 - Explore Azure Databricks

Get started with Azure Databricks
Identify Azure Databricks workloads
Understand key concepts

23 - Use Apache Spark in Azure Databricks

Get to know Spark
Create a Spark cluster
Use Spark in notebooks
Use Spark to work with data files
Visualize data

24 - Run Azure Databricks Notebooks with Azure Data Factory

Understand Azure Databricks notebooks and pipelines
Create a linked service for Azure Databricks
Use a Notebook activity in a pipeline
Use parameters in a notebook

Course Prerequisites

Successful students start this course with knowledge of cloud computing and core data concepts, as well as professional experience with data solutions. The following courses cover this background:

AZ-900T00 Microsoft Azure Fundamentals
DP-900T00 Microsoft Azure Data Fundamentals

Course Information

Length: 4 days

Format: Lecture and Lab

Delivery Method: Virtual

Max. Capacity: 16

Contact Us

Do you have more questions? We're delighted to assist you!

1-877-797-2799

info@firefly.cloud

Who Should Attend

The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using the data platform technologies available on Microsoft Azure. The secondary audience includes data analysts and data scientists who work with analytical solutions built on Microsoft Azure.