Microsoft Fabric Overview
Microsoft Fabric Architecture: A Deep Technical Overview of OneLake, Delta Lake, Lakehouse, Warehouse, and Power BI Direct Lake
1. Introduction
Microsoft Fabric is a unified analytics platform built around a single architectural principle:
All tabular data lives in OneLake as Delta Lake tables, and multiple compute engines operate on the same files.
This document explains the underlying technologies, how each engine interacts with Delta Lake, and how compute is performed across Lakehouse, Warehouse, SQL Database, and Power BI Direct Lake.
2. OneLake: The Universal Storage Layer
OneLake is the foundational storage system for Microsoft Fabric. It is:
- A single, tenant-wide data lake
- Built on Azure Data Lake Storage Gen2
- Using Delta Lake as the universal table format
All Fabric workloads — SQL, Spark, Power BI, Data Factory, Real-Time Analytics — read and write data stored in OneLake.
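Because every workload addresses the same storage, a table can be located by a single path. The following is a minimal Python sketch of how such a path might be assembled. It assumes the ABFS-style layout `abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<ItemType>/Tables/<table>`; the function name `onelake_table_uri` and the sample workspace, item, and table names are hypothetical and used only for illustration.

```python
def onelake_table_uri(workspace: str, item: str, table: str,
                      item_type: str = "Lakehouse") -> str:
    """Build an ABFS-style URI for a table folder in OneLake.

    Assumption: tables live under '<item>.<ItemType>/Tables/<table>'.
    All argument values used below are made-up examples.
    """
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{item}.{item_type}/Tables/{table}"
    )


if __name__ == "__main__":
    # Hypothetical names; replace with your own workspace and item.
    print(onelake_table_uri("SalesWorkspace", "SalesLakehouse", "orders"))
```

Under this assumption, any engine that understands Delta Lake and can authenticate to OneLake could be pointed at the same folder.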
2.1 Storage Format
All structured data in OneLake is stored as:
- Parquet files (columnar storage)
- Delta Lake transaction logs (ACID metadata)
This combination provides:
- ACID transactions
- Time travel
- Schema enforcement
- Concurrent writes
- Efficient columnar analytics
3. Delta Lake: The Table Format Used Across Fabric
Delta Lake is an open table format consisting of:
3.1 Parquet Data Files
These contain the actual columnar data:
part-00000-...snappy.parquet
part-00001-...snappy.parquet
...
3.2 The _delta_log Folder
This folder contains:
- JSON transaction log files
- Periodic Parquet checkpoints
Example:
_delta_log/
  00000000000000000000.json
  00000000000000000001.json
  00000000000000000002.json
  00000000000000000003.checkpoint.parquet
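To make the naming convention concrete, here is a minimal Python sketch of how a reader might decide which log files to process: start from the newest checkpoint, then replay only the JSON commits written after it. The function name `plan_log_replay` and the sample listing are hypothetical. Real readers also consult the `_last_checkpoint` file and handle multi-part checkpoints, both of which this sketch ignores.

```python
import re

# Commit and checkpoint files are named with a 20-digit, zero-padded version.
COMMIT_RE = re.compile(r"^(\d{20})\.json$")
CHECKPOINT_RE = re.compile(r"^(\d{20})\.checkpoint\.parquet$")


def plan_log_replay(filenames):
    """Return (checkpoint_version_or_None, sorted commit versions to replay)."""
    commits, checkpoints = [], []
    for name in filenames:
        if (m := COMMIT_RE.match(name)):
            commits.append(int(m.group(1)))
        elif (m := CHECKPOINT_RE.match(name)):
            checkpoints.append(int(m.group(1)))
    start = max(checkpoints) if checkpoints else None
    # Without a checkpoint, every commit must be replayed from version 0.
    to_replay = sorted(v for v in commits if start is None or v > start)
    return start, to_replay


if __name__ == "__main__":
    # Hypothetical listing: commits 0..12 plus a checkpoint at version 10.
    listing = [f"{v:020d}.json" for v in range(13)]
    listing.append(f"{10:020d}.checkpoint.parquet")
    print(plan_log_replay(listing))
```

The point of the checkpoint is visible in the result: only the commits after version 10 need to be read as JSON.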
Each JSON file describes:
- Added Parquet files
- Removed Parquet files
- Schema changes
- Metadata updates
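The role of the added and removed file entries can be illustrated with a small Python sketch that replays commits in version order to compute which Parquet files are currently part of the table. It assumes each commit is newline-delimited JSON with `add` and `remove` actions that carry a `path` field; the function name `active_files` and the sample commits are hypothetical, and real actions carry more fields than are shown here.

```python
import json


def active_files(commits, as_of_version=None):
    """Replay commits and return the set of live data file paths.

    commits: dict mapping version number -> newline-delimited JSON text.
    as_of_version: if given, stop after this version (an earlier snapshot).
    """
    live = set()
    for version in sorted(commits):
        if as_of_version is not None and version > as_of_version:
            break
        for line in commits[version].splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
            # Other actions (schema, metadata) do not change the file set.
    return live


if __name__ == "__main__":
    # Hypothetical two-commit history: version 1 rewrites part-a into part-c.
    sample = {
        0: '{"add": {"path": "part-a.parquet"}}\n{"add": {"path": "part-b.parquet"}}',
        1: '{"remove": {"path": "part-a.parquet"}}\n{"add": {"path": "part-c.parquet"}}',
    }
    print(sorted(active_files(sample)))
    print(sorted(active_files(sample, as_of_version=0)))
```

Stopping the replay at an earlier version yields the file set of an older snapshot, which is how a log of this shape can support reading historical table versions.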
This log-based architecture enables:
- ACID transactions
- Time travel
- Optimistic concurrency
- Efficient metadata operations
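Optimistic concurrency in a log of this kind can be sketched as "claim the next version number, and retry if someone else got there first". The Python below is a toy in-memory model under that assumption, not the Delta Lake implementation: the `InMemoryLog` class and `commit_with_retry` function are hypothetical, and a real writer would also check whether the winning commit logically conflicts with its own changes before retrying.

```python
class InMemoryLog:
    """Toy stand-in for a transaction log folder with put-if-absent writes."""

    def __init__(self):
        self._commits = {}

    def latest_version(self):
        # -1 means the log is empty, so the first commit is version 0.
        return max(self._commits, default=-1)

    def try_commit(self, version, actions):
        """Record the commit only if this version has not been claimed."""
        if version in self._commits:
            return False
        self._commits[version] = actions
        return True


def commit_with_retry(log, actions, max_attempts=5):
    """Attempt to commit at the next version, retrying on a lost race."""
    for _ in range(max_attempts):
        version = log.latest_version() + 1
        if log.try_commit(version, actions):
            return version
        # Lost the race: re-read the log and try the next version.
    raise RuntimeError("could not commit after retries")


if __name__ == "__main__":
    log = InMemoryLog()
    log.try_commit(0, ["create table"])
    # Two writers both observe version 0 and both aim for version 1.
    print(log.try_commit(1, ["writer A"]))      # first writer wins
    print(log.try_commit(1, ["writer B"]))      # second writer is rejected
    print(commit_with_retry(log, ["writer B"])) # retry lands on a later version
```

No locks are held while a writer prepares its data files; the only contended step is the creation of the next log entry.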
4. Fabric Lakehouse
The Lakehouse is the Spark-first compute environment in Fabric. It is designed for:
- Data engineering
- ELT pipelines
- Machine learning
- Notebook-based development
- Direct file access
4.1 Storage
Lakehouse stores all tables as Delta Lake (Parquet + transaction logs) in OneLake.
4.2 Compute Engine
Lakehouse uses:
- Apache Spark
- Spark SQL
- Python, Scala, SQL notebooks
4.3 Characteristics
- Flexible schema
- Supports raw files, semi-structured data, and Delta tables
- Ideal for ingestion, transformation, and ML workloads
- Not optimized for BI star-schema workloads
5. Fabric Warehouse
The Warehouse is the SQL-first compute engine in Fabric. It is designed for:
- BI workloads
- Star schemas
- SQL-based transformations
- High-concurrency analytical queries
5.1 Storage
Warehouse also stores all tables as Delta Lake in OneLake.
5.2 Compute Engine
Warehouse uses a dedicated distributed SQL engine, separate from Spark.
Characteristics:
- T‑SQL surface area
- Columnar execution
- Materialized views
- Schema enforcement
- Optimized for BI workloads
5.3 Comparison to Snowflake
Fabric Warehouse is conceptually similar to Snowflake:
- Separation of compute and storage
- Distributed SQL engine
- Columnar processing
- ACID tables
The main difference is that Fabric stores tables in the open Delta Lake format, whereas Snowflake uses its own proprietary micro-partition format.
6. Fabric SQL Database (OLTP)
Fabric SQL Database is the OLTP engine in Fabric. It is:
- The same engine as Azure SQL Database
- Optimized for transactional workloads
- Not designed for large-scale analytics
6.1 OneLake Mirroring
Fabric SQL Database automatically mirrors its tables into OneLake as Delta Lake tables.
This enables:
- Spark access
- Warehouse SQL access
- Power BI Direct Lake access
6.2 Important Limitation
Mirroring is one-way:
- SQL Database → Delta Lake (automatic)
- Delta Lake → SQL Database (manual ETL required)
7. Power BI Direct Lake
Power BI Direct Lake is a storage mode that allows Power BI semantic models to read Delta Lake tables directly from OneLake without:
- Importing data
- Using DirectQuery
- Using the Warehouse SQL engine
- Using Spark
7.1 Compute Engine
Direct Lake uses the VertiPaq engine, the same in-memory columnar engine used by:
- Power BI Import mode
- SSAS Tabular
- Azure Analysis Services
7.2 How It Works
VertiPaq:
- Reads Parquet files directly
- Interprets Delta transaction logs
- Loads data into memory on demand
- Executes queries with columnar vectorized execution
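The "loads data into memory on demand" step can be pictured with a toy Python model: a column is fetched from storage the first time a query touches it and is served from memory afterwards. This is only an illustration of the idea, not how VertiPaq is implemented; the `LazyColumnStore` class, the `total` helper, and the sample table are all hypothetical.

```python
class LazyColumnStore:
    """Toy column cache: loads a column on first use, then reuses it."""

    def __init__(self, load_column):
        self._load_column = load_column  # callable: column name -> list of values
        self._cache = {}
        self.loads = []                  # records which columns were fetched

    def column(self, name):
        if name not in self._cache:
            self.loads.append(name)
            self._cache[name] = self._load_column(name)
        return self._cache[name]


def total(store, column_name):
    """A query that touches exactly one column."""
    return sum(store.column(column_name))


if __name__ == "__main__":
    # Hypothetical table standing in for Parquet column chunks in storage.
    table = {
        "amount": [10, 20, 30],
        "quantity": [1, 2, 3],
        "notes": ["a", "b", "c"],
    }
    store = LazyColumnStore(lambda name: table[name])
    print(total(store, "amount"))
    print(total(store, "amount"))  # second call is served from memory
    print(store.loads)             # only the queried column was loaded
```

Columns that no visual ever references are never brought into memory, which is one reason a columnar file format suits this access pattern.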
7.3 Characteristics
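- Near-import performance
- Near-real-time freshness: new Delta commits become visible without a full import refresh
- Zero data movement
- No SQL pushdown: queries run in VertiPaq rather than being translated to SQL
Direct Lake is effectively a third analytical compute engine in Fabric, alongside Spark and the Warehouse SQL engine.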
8. How All Engines Share the Same Data
All engines operate on the same Delta Lake tables stored in OneLake:
| Engine | Purpose | Reads Delta Lake? | Compute Type |
|---|---|---|---|
| Lakehouse (Spark) | Engineering, ML | Yes | Spark |
| Warehouse (SQL) | BI, star schemas | Yes | Distributed SQL |
| SQL Database (OLTP) | Transactions | Mirrors to Delta | Azure SQL engine |
| Power BI Direct Lake | BI visualization | Yes | VertiPaq |
This architecture eliminates data silos and enables multi-engine analytics on a single copy of data.
9. Summary
Microsoft Fabric is built on a unified architecture:
- OneLake is the universal storage layer
- Delta Lake is the universal table format
- Lakehouse uses Spark compute
- Warehouse uses a distributed SQL engine
- SQL Database mirrors OLTP tables into Delta Lake
- Power BI Direct Lake uses VertiPaq to read Delta files directly
The result is a platform where:
- Data is stored once
- Multiple engines compute on the same files
- No ETL is required between analytical engines
- BI workloads can achieve near-import performance on fresh data without imports
This architecture combines the strengths of:
- Databricks (Spark + Delta Lake)
- Snowflake (SQL warehouse)
- Azure SQL (OLTP)
- Power BI (semantic modeling + VertiPaq)
into a single integrated system.
Direct Lake is the VertiPaq engine reading Delta Lake files directly from OneLake.
Steps:
- Create a Power BI semantic model in the same workspace.
- Point it at your Warehouse or Lakehouse tables.
- Ensure the model is in Direct Lake mode (not Import, not DirectQuery).
- Build visuals and test performance.
What you can test:
- Real‑time BI
- Direct Lake performance
- Star schema queries
- No‑import behavior
What costs money:
- Nothing during the trial.
What You MUST Avoid to Stay at $0
To avoid charges, do not do any of the following:
- ❌ Creating a dedicated capacity (F2, F4, F8, etc.)
- ❌ Assigning a workspace to a paid capacity
- ❌ Using Fabric outside the trial window
- ❌ Using features that explicitly require a paid SKU (rare)
Safe behaviors:
- ✔ Keep all workspaces in the default trial capacity
- ✔ Do not modify capacity settings
- ✔ Do not enable autoscale or reserved capacity
If you follow these rules, you will not incur charges.
The Best Zero‑Cost Learning Workflow
Here is the recommended sequence for learning Fabric with zero cost:
Step 1: Create a workspace in the trial capacity
This ensures everything is free.
Step 2: Create a SQL Database (OLTP)
- Create tables
- Insert data
- Watch OneLake Mirroring create Delta Lake files
Step 3: Create a Warehouse
- Create analytical tables
- Load data
- Query with T‑SQL
Step 4: Create a Lakehouse
- Use Spark to read both OLTP‑mirrored and Warehouse tables
- Validate Delta Lake interoperability
Step 5: Create a Power BI model in Direct Lake mode
- Connect to Warehouse tables
- Build visuals
- Validate real‑time performance
This gives you a complete understanding of:
- OLTP engine
- Warehouse engine
- Spark engine
- VertiPaq engine
- Delta Lake
- OneLake
- Direct Lake
All without spending a cent.
Summary
You can test every major component of Fabric (OLTP, Warehouse, Lakehouse, Spark, and Power BI Direct Lake) for free using the Fabric Trial, as long as you:
- Keep all workspaces in the trial capacity
- Avoid creating dedicated capacities
- Stay within the trial window
This gives you full access to:
- Fabric SQL Database (OLTP)
- Fabric Warehouse (SQL analytics)
- Lakehouse (Spark)
- Power BI Direct Lake (VertiPaq)
- Delta Lake storage
- OneLake mirroring
No charges, no tricks, no hidden costs.