Microsoft Fabric overview

Microsoft Fabric Architecture: A Deep Technical Overview of OneLake, Delta Lake, Lakehouse, Warehouse, and Power BI Direct Lake

1. Introduction

Microsoft Fabric is a unified analytics platform built around a single architectural principle:

All tabular data lives in OneLake as Delta Lake tables, and multiple compute engines operate on the same files.

This document explains the underlying technologies, how each engine interacts with Delta Lake, and how compute is performed across Lakehouse, Warehouse, SQL Database, and Power BI Direct Lake.

2. OneLake: The Universal Storage Layer

OneLake is the foundational storage system for Microsoft Fabric. It is:

A single, tenant-wide data lake
Built on Azure Data Lake Storage Gen2
Using Delta Lake as the universal table format

All Fabric workloads — SQL, Spark, Power BI, Data Factory, Real-Time Analytics — read and write data stored in OneLake.
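To make the single-namespace idea concrete, here is a minimal sketch of how a table in OneLake might be addressed. It assumes the commonly documented ABFS-style layout (`onelake.dfs.fabric.microsoft.com` as the endpoint, `<item>.<type>/Tables/<table>` as the path); verify both against your own tenant. The workspace, item, and table names are hypothetical.

```python
def onelake_table_uri(workspace: str, item: str, table: str,
                      item_type: str = "Lakehouse") -> str:
    """Build an ABFS-style URI for a table stored in OneLake.

    Assumed layout: abfss://<workspace>@<host>/<item>.<item_type>/Tables/<table>
    """
    host = "onelake.dfs.fabric.microsoft.com"  # assumed OneLake DFS endpoint
    return f"abfss://{workspace}@{host}/{item}.{item_type}/Tables/{table}"


# Hypothetical names, for illustration only.
orders_uri = onelake_table_uri("SalesWorkspace", "SalesLakehouse", "orders")
print(orders_uri)
```

Because every engine resolves the same location, a URI like this is all that a Spark notebook or an external Delta reader would need to find the table.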

2.1 Storage Format

All structured data in OneLake is stored as:

Parquet files (columnar storage)
Delta Lake transaction logs (ACID metadata)

This combination provides:

ACID transactions
Time travel
Schema enforcement
Concurrent writes
Efficient columnar analytics

3. Delta Lake: The Table Format Used Across Fabric

Delta Lake is an open table format consisting of:

3.1 Parquet Data Files

These contain the actual columnar data:

part-00000-...snappy.parquet
part-00001-...snappy.parquet
...

3.2 The _delta_log Folder

This folder contains:

JSON transaction log files
Periodic Parquet checkpoints

Example:

_delta_log/
    00000000000000000000.json
    00000000000000000001.json
    00000000000000000002.json
    00000000000000000003.checkpoint.parquet
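The file names encode the table version as a 20-digit, zero-padded number. The sketch below separates commits from checkpoints in a listing like the one above. It works on a plain list of names rather than a real folder, and the helper names are illustrative.

```python
import re

# File names copied from the example listing above.
log_files = [
    "00000000000000000000.json",
    "00000000000000000001.json",
    "00000000000000000002.json",
    "00000000000000000003.checkpoint.parquet",
]

COMMIT = re.compile(r"^(\d{20})\.json$")
CHECKPOINT = re.compile(r"^(\d{20})\.checkpoint\.parquet$")


def commit_file_name(version: int) -> str:
    """Format a version number the way commit files are named."""
    return f"{version:020d}.json"


def summarize_log(names):
    """Return (sorted commit versions, latest checkpoint version or None)."""
    commits = sorted(int(m.group(1)) for n in names if (m := COMMIT.match(n)))
    checkpoints = [int(m.group(1)) for n in names if (m := CHECKPOINT.match(n))]
    return commits, max(checkpoints, default=None)


commit_versions, latest_checkpoint = summarize_log(log_files)
print(commit_versions, latest_checkpoint)
```

A reader can start from the latest checkpoint and apply only the JSON commits that follow it, which is why checkpoints keep metadata operations cheap on long-lived tables.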

Each JSON file describes:

Added Parquet files
Removed Parquet files
Schema changes
Metadata updates
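A simplified way to see how these entries define a table is to replay them in order. The sketch below uses the `add`, `remove`, and `metaData` action names from the Delta log format, but the commits themselves are hypothetical and heavily trimmed; real actions carry many more fields (sizes, statistics, timestamps).

```python
import json

# Three hypothetical commits; each entry is one JSON action line.
commits = [
    ['{"metaData": {"id": "hypothetical-table-id"}}',
     '{"add": {"path": "part-00000.parquet"}}'],
    ['{"add": {"path": "part-00001.parquet"}}'],
    ['{"remove": {"path": "part-00000.parquet"}}',
     '{"add": {"path": "part-00002.parquet"}}'],
]


def live_files(commits, as_of_version=None):
    """Replay commits 0..as_of_version and return the set of active data files."""
    if as_of_version is None:
        as_of_version = len(commits) - 1
    active = set()
    for actions in commits[: as_of_version + 1]:
        for line in actions:
            action = json.loads(line)
            if "add" in action:
                active.add(action["add"]["path"])
            elif "remove" in action:
                active.discard(action["remove"]["path"])
    return active


print(sorted(live_files(commits)))                   # current table state
print(sorted(live_files(commits, as_of_version=0)))  # state as of version 0
```

Stopping the replay at an earlier version yields the table as it existed then, which is the basis of time travel.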

This log-based architecture enables:

ACID transactions
Time travel
Optimistic concurrency
Efficient metadata operations
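Optimistic concurrency can be illustrated with a toy model: each writer tries to create the next version, and only one writer can create any given version. The in-memory class below is a hypothetical stand-in for the log folder, not a real Delta Lake API.

```python
class CommitConflict(Exception):
    """Raised when another writer has already claimed the target version."""


class InMemoryLog:
    """Toy stand-in for the log folder: maps version number -> list of actions."""

    def __init__(self):
        self.versions = {}

    def latest_version(self):
        return max(self.versions, default=-1)

    def try_commit(self, version, actions):
        # Put-if-absent: exactly one writer can create a given version.
        if version in self.versions:
            raise CommitConflict(version)
        self.versions[version] = actions


def commit_with_retry(log, actions, read_version, max_attempts=5):
    """Try to commit at read_version + 1; on conflict, retry at the next free version."""
    version = read_version + 1
    for _ in range(max_attempts):
        try:
            log.try_commit(version, actions)
            return version
        except CommitConflict:
            # A real writer would first check whether the winning commit
            # logically conflicts with its own changes before retrying.
            version = log.latest_version() + 1
    raise RuntimeError("gave up after repeated conflicts")


log = InMemoryLog()
# Two writers both read the empty table (version -1) and race for version 0.
v_a = commit_with_retry(log, [{"add": {"path": "a.parquet"}}], read_version=-1)
v_b = commit_with_retry(log, [{"add": {"path": "b.parquet"}}], read_version=-1)
print(v_a, v_b)
```

No locks are held while the writers prepare their data files; the only point of contention is the creation of the commit file itself.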

4. Fabric Lakehouse

The Lakehouse is the Spark-first compute environment in Fabric. It is designed for:

Data engineering
ELT pipelines
Machine learning
Notebook-based development
Direct file access

4.1 Storage

The Lakehouse stores all tables as Delta Lake (Parquet + transaction logs) in OneLake.

4.2 Compute Engine

The Lakehouse uses:

Apache Spark
Spark SQL
Python, Scala, SQL notebooks
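In a notebook, reading a Delta table with Spark usually comes down to one reader call. The sketch below wraps that call in two small helpers. It assumes `spark` is an active SparkSession (notebooks typically provide one), and the path and column names in the usage note are hypothetical.

```python
def load_delta_table(spark, table_path):
    """Return a DataFrame over the Delta table at table_path.

    `spark` is expected to be an active SparkSession; `table_path` is a
    path or URI pointing at the table folder.
    """
    return spark.read.format("delta").load(table_path)


def total_by_key(df, key_column, value_column):
    """Sum value_column per key_column using the DataFrame API."""
    return df.groupBy(key_column).sum(value_column)
```

Usage might look like `total_by_key(load_delta_table(spark, "Tables/orders"), "region", "amount").show()`, where `Tables/orders`, `region`, and `amount` are placeholder names.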

4.3 Characteristics

Flexible schema
Supports raw files, semi-structured data, and Delta tables
Ideal for ingestion, transformation, and ML workloads
Not optimized for BI star-schema workloads

5. Fabric Warehouse

The Warehouse is the SQL-first compute engine in Fabric. It is designed for:

BI workloads
Star schemas
SQL-based transformations
High-concurrency analytical queries

5.1 Storage

The Warehouse also stores all tables as Delta Lake in OneLake.

5.2 Compute Engine

The Warehouse uses a dedicated distributed SQL engine, separate from Spark.

Characteristics:

T‑SQL surface area
Columnar execution
Views and stored procedures (materialized views are not supported at the time of writing)
Schema enforcement
Optimized for BI workloads

5.3 Comparison to Snowflake

Fabric Warehouse is conceptually similar to Snowflake:

Separation of compute and storage
Distributed SQL engine
Columnar processing
ACID tables

The difference is that Fabric uses Delta Lake instead of Snowflake’s proprietary micro-partition format.

6. Fabric SQL Database (OLTP)

Fabric SQL Database is the OLTP engine in Fabric. It is:

The same engine as Azure SQL Database
Optimized for transactional workloads
Not designed for large-scale analytics

6.1 OneLake Mirroring

Fabric SQL Database automatically mirrors its tables into OneLake as Delta Lake tables.

This enables:

Spark access
Warehouse SQL access
Power BI Direct Lake access

6.2 Important Limitation

Mirroring is one-way:

SQL Database → Delta Lake (automatic)
Delta Lake → SQL Database (manual ETL required)

7. Power BI Direct Lake

Power BI Direct Lake is a storage mode that lets a Power BI semantic model read Delta Lake tables directly from OneLake without:

Importing data
Using DirectQuery
Using the Warehouse SQL engine
Using Spark

7.1 Compute Engine

Direct Lake uses the VertiPaq engine, the same in-memory columnar engine used by:

Power BI Import mode
SSAS Tabular
Azure Analysis Services

7.2 How It Works

VertiPaq:

Reads Parquet files directly
Interprets Delta transaction logs
Loads data into memory on demand
Executes queries with columnar vectorized execution
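The "on demand" part can be pictured with a toy column store that reads a column from storage only when a query first touches it. This is a conceptual model only, not VertiPaq's actual internals; the class and data below are hypothetical.

```python
class OnDemandColumnStore:
    """Toy model: columns are read from storage only when a query first needs them."""

    def __init__(self, storage):
        self._storage = storage  # column name -> list of values (stand-in for Parquet)
        self._memory = {}        # columns currently resident in memory
        self.loads = []          # order in which columns were loaded

    def column(self, name):
        if name not in self._memory:
            self._memory[name] = list(self._storage[name])
            self.loads.append(name)
        return self._memory[name]

    def sum_where(self, value_col, filter_col, equals):
        """Sum value_col over rows where filter_col == equals."""
        values = self.column(value_col)
        filters = self.column(filter_col)
        return sum(v for v, f in zip(values, filters) if f == equals)


storage = {
    "region": ["EU", "US", "EU"],
    "amount": [10, 20, 5],
    "comment": ["a", "b", "c"],  # never queried, so never loaded
}
store = OnDemandColumnStore(storage)
total = store.sum_where("amount", "region", "EU")
print(total, store.loads)
```

Only the two columns the query touches end up in memory, which is why a wide table does not have to fit in memory in full before the first visual renders.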

7.3 Characteristics

Near-import performance
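Near-real-time freshness (new Delta commits become visible once the model is reframed)
Zero data movement
No SQL pushdown (unless a query falls back to DirectQuery)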

Direct Lake is effectively a third analytical compute engine in Fabric, alongside Spark and the Warehouse SQL engine.

8. How All Engines Share the Same Data

All engines operate on the same Delta Lake tables stored in OneLake:

Engine	Purpose	Reads Delta Lake?	Compute Type
Lakehouse (Spark)	Engineering, ML	Yes	Spark
Warehouse (SQL)	BI, star schemas	Yes	Distributed SQL
SQL Database (OLTP)	Transactions	Mirrors to Delta	Azure SQL engine
Power BI Direct Lake	BI visualization	Yes	VertiPaq

This architecture eliminates data silos and enables multi-engine analytics on a single copy of data.

9. Summary

Microsoft Fabric is built on a unified architecture:

OneLake is the universal storage layer
Delta Lake is the universal table format
Lakehouse uses Spark compute
Warehouse uses a distributed SQL engine
SQL Database mirrors OLTP tables into Delta Lake
Power BI Direct Lake uses VertiPaq to read Delta files directly

The result is a platform where:

Data is stored once
Multiple engines compute on the same files
No ETL is required between analytical engines
BI workloads can get near-import query performance on fresh data without scheduled imports

This architecture combines the strengths of:

Databricks (Spark + Delta Lake)
Snowflake (SQL warehouse)
Azure SQL (OLTP)
Power BI (semantic modeling + VertiPaq)

into a single integrated system.

Direct Lake is the VertiPaq engine reading Delta Lake files directly from OneLake.

Steps:

Create a Power BI semantic model in the same workspace.
Point it at your Warehouse or Lakehouse tables.
Ensure the model is in Direct Lake mode (not Import, not DirectQuery).
Build visuals and test performance.

What you can test:

Real‑time BI
Direct Lake performance
Star schema queries
No‑import behavior

What costs money:

Nothing during the trial.

What You MUST Avoid to Stay at $0

To guarantee no charges:

Avoid these:

❌ Creating a dedicated capacity (F2, F4, F8, etc.)
❌ Assigning a workspace to a paid capacity
❌ Using Fabric outside the trial window
❌ Using features that explicitly require a paid SKU (rare)

Safe behaviors:

✔ Keep all workspaces in the default trial capacity
✔ Do not modify capacity settings
✔ Do not enable autoscale or reserved capacity

If you follow these rules, you will not incur charges.

The Best Zero‑Cost Learning Workflow

Here is the recommended sequence for learning Fabric with zero cost:

Step 1 — Create a workspace in the trial capacity

This ensures everything is free.

Step 2 — Create a SQL Database (OLTP)

Create tables
Insert data
Watch OneLake Mirroring create Delta Lake files

Step 3 — Create a Warehouse

Create analytical tables
Load data
Query with T‑SQL

Step 4 — Create a Lakehouse

Use Spark to read both OLTP‑mirrored and Warehouse tables
Validate Delta Lake interoperability

Step 5 — Create a Power BI model in Direct Lake mode

Connect to Warehouse tables
Build visuals
Validate real‑time performance

This gives you a complete understanding of:

OLTP engine
Warehouse engine
Spark engine
VertiPaq engine
Delta Lake
OneLake
Direct Lake

All without spending a cent.

Summary

You can test every major component of Fabric (OLTP, Warehouse, Lakehouse, Spark, and Power BI Direct Lake) for free using the Fabric Trial, as long as you:

Keep all workspaces in the trial capacity
Avoid creating dedicated capacities
Stay within the trial window

This gives you full access to:

Fabric SQL Database (OLTP)
Fabric Warehouse (SQL analytics)
Lakehouse (Spark)
Power BI Direct Lake (VertiPaq)
Delta Lake storage
OneLake mirroring

No charges, no tricks, no hidden costs.