How to Read Data from Databricks Unity Catalog with Azure Data Factory (Step-by-Step + Gotchas)
You’ve got governed data in Unity Catalog (UC) and a downstream system — like an on-prem SQL Server or Azure SQL Database — that needs it, reliably and securely. Here’s a clean, repeatable way to pull data from UC with Azure Data Factory (ADF), plus all the little details that usually trip people up.
Watch a demonstration of this tutorial on YouTube: https://www.youtube.com/watch?v=saVZdf6cQig&ab_channel=AfroInfoTech
Why use ADF for this?
In many projects, I default to a Databricks notebook: read from UC into a Spark DataFrame, then write out with the JDBC connector. That’s simple if your workspace can reach the target database (networking, firewall, credentials, etc.).
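For context, here's a minimal sketch of that notebook approach. The table name, SQL endpoint, and secret scope are all illustrative placeholders, not values from this tutorial:

```python
# Minimal sketch of the notebook approach (all names are illustrative).
# Runs inside a Databricks notebook, where `spark` and `dbutils` are provided.

# Governed read: Unity Catalog resolves catalog.schema.table and enforces ACLs.
df = spark.table("main.sales.orders")

# Hypothetical Azure SQL endpoint; this only works if the workspace
# can actually reach it (networking, firewall rules, credentials).
jdbc_url = (
    "jdbc:sqlserver://myserver.database.windows.net:1433;"
    "database=mydb;encrypt=true;loginTimeout=30;"
)

(
    df.write.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.orders")
    .option("user", dbutils.secrets.get("my-scope", "sql-user"))
    .option("password", dbutils.secrets.get("my-scope", "sql-password"))
    .mode("append")
    .save()
)
```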
But there are plenty of cases where ADF is the better tool:
- You already have a Self-Hosted Integration Runtime (SHIR) that can reach your on-prem SQL Server.
- You want no-code/low-code copy with built-in retries, monitoring, and alerting.
- Your security model prefers Managed Identity + RBAC instead of user PATs.
This guide shows you how to move data from Unity Catalog → (staging in ADLS) → SQL using the ADF Copy activity.
