Azure Data Factory (ADF) in Action

What is Azure Data Factory?

Azure Data Factory (ADF) is Azure's cloud-based data integration service. It supports ETL (extract, transform, load), ELT (extract, load, transform), and SSIS package execution. It is a standalone, pay-as-you-go service, so you pay only for what you run. With it you can move and transform data from a variety of sources into a variety of destinations.

When to use Azure Data Factory?

  • Copying data from one data source to another
  • Transforming data from different sources
  • Building code-free data migration pipelines
  • Starting from built-in templates
  • Running SSIS packages against Azure SQL
  • Scheduling SSIS packages against Azure SQL
  • Orchestrating work with pipelines

    And many more.

Getting Started

Before getting started with ADF, you need to create a Data Factory resource in Azure.

  1. Search for Data factories in the Azure portal (portal.azure.com).
  2. Create a data factory.
  3. Open Azure Data Factory Studio.

Main Features

  • Ingest
  • Orchestrate
  • Transform Data
  • SSIS

Ingest

Ingestion uses the Copy Data tool, which copies data from one or more sources into a destination. It loads the tables from the source so that you can choose which ones to copy to the destination.

  1. Create a source connection.
  2. Preview the datasets loaded from the source.
  3. Create a destination connection.
  4. Map source tables and columns to the destination.
  5. Run the copy and view its progress.
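
The Copy Data tool ends up producing a pipeline with a Copy activity. As a rough sketch of what steps 1 to 4 amount to, the Python below builds such a pipeline definition as a plain dictionary and prints it as JSON. The pipeline, dataset, and column names are hypothetical, and the source and sink types assume Azure SQL on both sides; a real definition depends on the connectors you pick.

```python
import json


def build_copy_pipeline(pipeline_name, source_dataset, sink_dataset, column_map):
    """Return a pipeline definition (as a dict) holding one Copy activity.

    All names are illustrative; the referenced datasets must already
    exist in the factory before the pipeline can run.
    """
    # Step 4: map each source column to a destination column.
    mappings = [
        {"source": {"name": src}, "sink": {"name": dst}}
        for src, dst in column_map.items()
    ]
    copy_activity = {
        "name": "CopySourceToDestination",
        "type": "Copy",
        # Steps 1 and 3: the datasets wrap the source and destination connections.
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # Assumption: both ends are Azure SQL; other connectors use other types.
            "source": {"type": "AzureSqlSource"},
            "sink": {"type": "AzureSqlSink"},
            "translator": {"type": "TabularTranslator", "mappings": mappings},
        },
    }
    return {"name": pipeline_name, "properties": {"activities": [copy_activity]}}


pipeline = build_copy_pipeline(
    "IngestCustomers",          # hypothetical pipeline name
    "SourceCustomersDataset",   # hypothetical dataset names
    "DestCustomersDataset",
    {"Id": "CustomerId", "Name": "CustomerName"},
)
print(json.dumps(pipeline, indent=2))
```

Keeping the definition as data makes it easy to review or store in source control before publishing it through ADF Studio.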

Orchestrate

Orchestration is another powerful feature of ADF. You build a pipeline from various types of activities and arrange them visually into a flow. Each activity has its own activity-specific configuration. Activities are grouped into the following categories:

  • Move & Transform
  • Azure Data Explorer
  • Azure Function
  • Custom Service
  • Databricks & Data Lake
  • HDInsight for big data
  • Iteration & conditionals
  • Machine Learning
  • Power Query

The common steps to orchestrate data are:

  1. Create and validate the data flow.
  2. Configure the datasets.
  3. Run and view progress.
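
To make the idea of chaining activities concrete, here is a small sketch in Python. It describes three activities with `dependsOn` entries, in the shape pipeline JSON uses, and then works out locally the order in which they could run. The activity names are hypothetical and the empty `typeProperties` are placeholders; ADF performs its own validation, so this is only an illustration of the dependency logic, not a replacement for it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def activity(name, activity_type, depends_on=()):
    """Describe one activity and the activities it waits for."""
    return {
        "name": name,
        "type": activity_type,
        "dependsOn": [
            {"activity": dep, "dependencyConditions": ["Succeeded"]}
            for dep in depends_on
        ],
        "typeProperties": {},  # placeholder for activity-specific configuration
    }


def execution_order(activities):
    """Return activity names in an order that respects every dependency.

    Raises ValueError for a dependency on an activity that is not in the
    pipeline, and graphlib.CycleError if the dependencies form a loop.
    """
    names = {a["name"] for a in activities}
    graph = {}
    for a in activities:
        deps = [d["activity"] for d in a["dependsOn"]]
        unknown = [d for d in deps if d not in names]
        if unknown:
            raise ValueError(f"{a['name']} depends on unknown activities: {unknown}")
        graph[a["name"]] = deps
    return list(TopologicalSorter(graph).static_order())


# Hypothetical pipeline: copy, then transform, then notify.
activities = [
    activity("NotifyDone", "WebActivity", depends_on=["TransformData"]),
    activity("TransformData", "ExecuteDataFlow", depends_on=["CopyRawData"]),
    activity("CopyRawData", "Copy"),
]
print(execution_order(activities))
```

The activities are listed out of order on purpose: the run order comes from the dependencies, not from where each activity sits in the list.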

Transform

Transformation is another strong feature of ADF. Data flows offer a wide range of transformations, including joins, unions, aggregations, pivots, and many more.

The process to transform data:

  1. Choose a dataset and create a connection to it.
  2. Create a data flow and configure its sources from the datasets.
  3. Add transformations to the data flow.
  4. Optionally, add a Power Query.
  5. Publish and view the progress of the transformation.
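
For readers new to these transformations, the following plain-Python sketch shows what a join followed by an aggregation does to the data. It does not use any ADF API; in ADF you would configure the same steps visually in a data flow. The sample rows and column names are made up for illustration.

```python
from collections import defaultdict

# Made-up sample data standing in for two source datasets.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
    {"order_id": 3, "customer_id": 10, "amount": 15.0},
]
customers = [
    {"customer_id": 10, "region": "East"},
    {"customer_id": 11, "region": "West"},
]


def inner_join(left, right, key):
    """Combine rows from both inputs that share the same key value."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def aggregate_sum(rows, group_by, value):
    """Sum one column per distinct value of another column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[value]
    return dict(totals)


joined = inner_join(orders, customers, "customer_id")
totals = aggregate_sum(joined, "region", "amount")
print(totals)  # {'East': 40.0, 'West': 40.0}
```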

SSIS

Azure SQL does not support SSIS natively. Instead, ADF runs SSIS packages on an Azure-SSIS integration runtime, with the SSIS catalog database (SSISDB) hosted in Azure SQL. Integration Services can extract and transform data from a wide variety of sources such as XML data files, flat files, and relational data sources, and then load the data into one or more destinations.

  1. Configure SSIS.
  2. Set up the integration runtime settings (create the SSISDB).
  3. Deploy, run, and monitor packages.
  4. Schedule the SSIS packages.
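
As an illustration of the run and schedule steps, the sketch below builds two definitions as Python dictionaries: a pipeline with an Execute SSIS Package activity, and a schedule trigger that starts it once a day. The package path, runtime name, pipeline name, and start time are all hypothetical, and only a minimal set of properties is shown; a real activity usually carries more settings, such as logging level and credentials.

```python
import json
from datetime import datetime, timezone


def build_ssis_pipeline(pipeline_name, package_path, runtime_name):
    """Pipeline with one activity that runs a package deployed to SSISDB."""
    ssis_activity = {
        "name": "RunSsisPackage",
        "type": "ExecuteSSISPackage",
        "typeProperties": {
            # The integration runtime created during setup.
            "connectVia": {
                "referenceName": runtime_name,
                "type": "IntegrationRuntimeReference",
            },
            "packageLocation": {"type": "SSISDB", "packagePath": package_path},
        },
    }
    return {"name": pipeline_name, "properties": {"activities": [ssis_activity]}}


def build_daily_trigger(trigger_name, pipeline_name, start):
    """Schedule trigger that starts the given pipeline once per day."""
    start_utc = start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "name": trigger_name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Day",
                    "interval": 1,
                    "startTime": start_utc,
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    }
                }
            ],
        },
    }


# Hypothetical names throughout.
ssis_pipeline = build_ssis_pipeline(
    "NightlySsisLoad", "MyFolder/MyProject/MyPackage.dtsx", "MyAzureSsisIR"
)
trigger = build_daily_trigger(
    "NightlyTrigger", "NightlySsisLoad", datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
)
print(json.dumps(ssis_pipeline, indent=2))
print(json.dumps(trigger, indent=2))
```

Separating the trigger from the pipeline mirrors how ADF treats them: the same pipeline can be run on demand, or attached to one or more schedules.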

Conclusion

ADF also has full support for version control: you can manage your ADF setup and environments through a Git repository.