Microsoft Fabric – General Availability
- Microsoft Fabric Overview
- Hottest Announcements from Ignite 2023
- Migration Paths to Microsoft Fabric
Microsoft Fabric, a brand-new analytics platform from Microsoft, has been in public preview since May 2023. Today, at Ignite, its annual conference for IT professionals, Microsoft announced the General Availability (GA) of Microsoft Fabric.
Obungi had the great honor of gaining early insights into Microsoft Fabric and was able to regularly share feedback with the product team during the preview phase. It’s been exciting to see Microsoft Fabric improve week on week and today is not the end of the journey – it’s just the beginning!
As with Power BI, Microsoft Fabric will see regular innovations and the platform will continue to evolve, not least through the use of AI, including amazing copilots.
Below we provide a brief overview of Microsoft Fabric and summarize the numerous announcements made at Ignite. Microsoft has also announced new details on migration scenarios and pricing, which we discuss along with tips for getting started. Finally, we provide an outlook on the future of Microsoft Fabric.
Microsoft Fabric Overview
Microsoft Fabric is an all-in-one analytics solution for enterprises that covers the full range of data, analytics and AI (e.g., data movement, data science, Real-Time Analytics, business intelligence etc.). It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place.
Watch the following video to get a quick overview.
Microsoft Fabric has the following components, called workloads:
• Data Factory (DF) for data integration,
• Synapse Data Engineering (DE) for building lakehouses,
• Synapse Data Warehouse (DW) for data warehousing,
• Synapse Data Science (DS) for using machine learning models and experiments,
• Synapse Real-Time Analytics (RT) for real-time data processing,
• Power BI (PB) for business intelligence including Direct Lake mode,
• Data Activator (DA) for automatically taking actions when patterns or conditions are detected in changing data.
All these workloads interact with data which is stored in OneLake (OL) – the OneDrive for data. OneLake provides a data lake as a service without the need to build it – it’s just there.
Microsoft Fabric is lake-centric and open, i.e. all data used by the above workloads is stored in the Delta Parquet format. Microsoft Fabric is a Software as a Service (SaaS) offering, which reduces the time to get started to a minimum: the only thing needed is a license, the same approach that has made Power BI so successful in recent years.
Further information on Microsoft Fabric can be found in the documentation.
Hottest Announcements from Ignite 2023
The most important announcement today at Ignite is certainly that Microsoft Fabric is now Generally Available (GA). This means that Microsoft Fabric can now be used for production workloads with a clear conscience, as appropriate support is guaranteed.
In addition, Microsoft has made numerous other announcements at Ignite. Below we summarize what we consider to be the most important ones. To make the overview as clear as possible, we have divided the announcements into categories such as General (GE) and Fabric Components (DA, DE, DF, DS, DW, OL, PB, RT), added the status (GA: Generally Available, PP: Public Preview, CS: Coming Soon) and given our assessment of the Business Value via a heat level (hot: 🔥 / very hot: 🔥🔥 / incredibly hot: 🔥🔥🔥). Please keep in mind that these announcements are just a selection of our favorites and that many more exist.
|Category||Announcement||Description||Status||Business Value|
|GE||Reservation pricing for Fabric||Reservation pricing for Fabric allows you to pre-commit Fabric Capacity Units in one-year increments, helping you save up to ~40.5% over the pay-as-you-go prices.||GA||🔥🔥🔥|
|GE||Tighter integration with Microsoft Purview||Support of Microsoft Purview Information Protection Sensitivity labels and Data loss prevention (DLP) policies.||CS||🔥🔥|
|GE||Microsoft Fabric data Catalog||Instance of Fabric will auto-attach to a preview version of Microsoft Purview and will integrate into Purview’s Data Catalog.||PP||🔥🔥|
|GE||Data residency & Auditability||Data residency where data never leaves the region boundary and end-to-end auditability for all user and system operations.||GA||🔥🔥|
|GE||Purview hub||Easy access to all Purview capabilities for Fabric admins and Fabric data owners.||PP||🔥|
|GE||Expansion of Microsoft Intelligent Data Platform ISV ecosystem||Independent Software Vendors (ISVs) can make their applications discoverable from within the Fabric experience and extend Fabric’s capabilities by building a Fabric workload.||CS *)||🔥|
|DE||General Availability of Synapse Data Engineering||Workload can be used for production use cases.||GA||🔥🔥🔥|
|DE||CI/CD integration incl. Git support||Support for CI/CD integration (Git + deployment pipelines) for notebooks and the lakehouse.||PP||🔥🔥🔥|
|DE||New Fabric runtime||When working with Spark, one can make use of a new Fabric runtime which includes Spark 3.4, Delta 2.4, Java 11 and Python 3.10.||PP||🔥🔥🔥|
|DE||Environment item||Configuration & management of all settings and libraries in a centralized place that can be attached to notebooks and Spark jobs.||PP||🔥🔥🔥|
|DE||Copilot in Data Engineering||Generate code in notebooks for common tasks like data exploration and data preparation.||PP||🔥🔥🔥|
|DE||New remote Synapse VS Code extension||Work with lakehouses, notebooks and Spark job definitions using the new fully remote Synapse VS Code extension.||PP||🔥🔥|
|DE||REST-API support||Programmatic access to Data Engineering items via REST APIs.||PP||🔥🔥|
|DF||General Availability of Data Factory||Workload can be used for production use cases.||GA||🔥🔥🔥|
|DF||Virtual Net Data Gateway||The VNet data gateway helps customers to connect from Fabric Dataflows Gen2 to their Azure data services within a Virtual Network (VNet) without the need of an enterprise data gateway.||PP||🔥🔥|
|DF||Microsoft Graph Data Connect Fabric Native Integration||Using the Copy Activity in data pipelines, one can now connect to organization Microsoft 365 data and bring them into Microsoft Fabric for data analytics.||PP||🔥|
|DS||General Availability of Synapse Data Science||Workload can be used for production use cases.||GA||🔥🔥🔥|
|DS||Copilot in Data Science||Generate code in notebooks for common tasks like data exploration and machine learning.||PP||🔥🔥🔥|
|DS||MLflow integration in notebooks||An MLflow inline authoring widget enables users to track their experiment runs along with metrics and parameters, all directly from within their notebook.||GA||🔥🔥|
|DS||REST-API support||New REST APIs make it possible to work with models and experiments programmatically.||PP||🔥🔥|
|DS||SynapseML 1.0||GA of SynapseML 1.0, an open-source ML library for Spark that simplifies the application of machine learning at scale.||GA||🔥🔥|
|DW||General Availability of Synapse Data Warehouse||Workload can be used for production use cases.||GA||🔥🔥🔥|
|DW / PB||Direct Lake mode to support Power BI semantic models on Fabric Warehouses||Semantic models can now leverage Direct Lake mode in conjunction with Synapse Data Warehouses in Microsoft Fabric.||GA||🔥🔥🔥|
|DW||Git support||Git support can be expected in the upcoming weeks.||CS||🔥🔥🔥|
|DW||Performance Improvements||Several performance improvements in the background (e.g. automatically compacting parquet files of the Warehouse managed tables, a new parser with enhanced CSV file ingestion time, caching of metadata, improved assignment of compute resources to milliseconds etc.).||CS||🔥🔥🔥|
|DW||Mirroring in Fabric||Mirroring provides a modern way of accessing and ingesting data continuously and seamlessly from any database or data warehouse into the Data Warehousing experience in Fabric.||CS||🔥🔥🔥|
|DW||Deployment support||New experiences to develop and deploy data warehouse applications via SQLPackages and REST APIs.||GA||🔥🔥|
|DW||UI improvements||Several user interface improvements like one-click cloning of tables, saving results as a table, or saving a query as a View etc.||GA||🔥🔥|
|DW||Query Insights||Monitoring via Query Insights in the UI (e.g. query history, long running queries etc.).||GA||🔥|
|DW||SQL Dynamic Data Masking (DDM)||DDM allows you to define masking rules for specific columns ensuring that sensitive information is only viewed by authorized users.||GA||🔥|
|OL||General Availability of OneLake||Workload can be used for production use cases.||GA||🔥🔥🔥|
|PB||Power BI Copilot||Creating reports, DAX calculations via natural language.||PP||🔥🔥🔥|
|PB||“Explore” that helps you learn more about your semantic model||Enables anyone to quickly explore a semantic model via the Power BI Service.||PP||🔥🔥|
|PB||DAX Query View in Power BI Desktop||Model authors can quickly validate data and measures in their semantic model in the new DAX Query view in Power BI Desktop (e.g. generate DAX queries, update the model etc.).||GA||🔥🔥|
|PB||RLS/OLS security and stored credentials for Direct Lake semantic models||RLS/OLS security is a feature for defining row-level and object-level access rules in a Direct Lake semantic model, so that different users can see different subsets of the data based on their roles and permissions.||PP||🔥🔥🔥|
|PB / OL||Microsoft OneLake integration for import models||Enable OneLake integration and automatically write data imported into semantic models to delta tables in OneLake.||PP||🔥🔥🔥|
|PB||Semantic model scale out||Power BI automatically scales out read-only replicas to ensure performance doesn’t slow in case of high user concurrency.||GA||🔥🔥🔥|
|PB||Shareable cloud connections (SCC)||Shareable cloud connections (SCC) for semantic models and paginated reports.||GA||🔥🔥|
|RT||General Availability of Synapse Real-Time Analytics||Workload can be used for production use cases.||GA||🔥🔥🔥|
|RT||IoT Hub as a source||Eventstream now supports Azure IoT Hub as a source.||GA||🔥🔥🔥|
|RT||Stream transformation before sending to KQL DB as a destination||One can now do stream transformation on incoming events from Eventstream before sending the events to the KQL Database.||GA||🔥🔥|
|RT/DA||Data Activator/Reflex as a destination||Eventstream now supports Data Activator as a destination.||GA||🔥🔥|
|RT||AMQP and Kafka format connection strings||Eventstream now offers support for AMQP and Kafka format connection strings in both Custom App source and destination.||GA||🔥🔥|
*) Currently available by invitation only
Migration Paths to Microsoft Fabric
Azure Synapse Analytics is Microsoft’s well-established Platform as a Service (PaaS) offering for enterprise analytics. Microsoft has confirmed once again at Ignite that there are currently no plans to retire Azure Synapse Analytics. Azure Databricks will also remain a first-party service for data, analytics & AI in Azure.
In addition to these two established PaaS solutions, Microsoft Fabric is now a production-ready Software as a Service (SaaS) analytics platform that will play an increasingly important role in the future. As a result of the GA announcement, it can be used for greenfield approaches, i.e. building a new data platform.
A migration towards Microsoft Fabric can be done step by step, thanks to the support of open formats such as Delta Parquet and the lake-centric architecture. It is also possible to use existing data platforms together with Microsoft Fabric.
The following sections outline the migration approaches for the various aspects of a data platform, including today’s announcements.
Migration of Data Lakes
Let’s assume you organize your data in an Azure Data Lake Storage Gen2 (ADLSg2) lakehouse. For data movement, you use Synapse pipelines based on Spark pools and transfer the data to a Synapse SQL Dedicated Pool and/or make the data accessible for reporting via serverless SQL pools for Power BI. This architecture is shown in the following diagram.
Based on this setup, Microsoft Fabric can be introduced step by step, i.e. a Data Engineering Lakehouse in Microsoft Fabric can make use of the data in ADLSg2 via Shortcuts without moving the data. With the new Direct Lake mode, Power BI can operate directly over the Lakehouse with performance similar to import mode, without processing time and without restrictions on the size of the dataset. The data remains in your data lake.
The architecture can even be simplified:
• The SQL Analytics Endpoint in Microsoft Fabric allows you to apply the security rules from the Dedicated Pool directly over the Lakehouse, i.e. there is no need for a dedicated capacity anymore and data copying from the lake to your warehouse can be avoided.
• When you transition your Notebooks and Spark Jobs to Fabric Spark, your Lakehouse data will undergo automatic optimization for compatibility with all other Fabric engines, all while being stored in an open format.
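Because Lakehouse tables live in OneLake in an open format, other tools can address them by path. The following is a minimal sketch of how such a path could be assembled. It assumes the `abfss://` URI layout that OneLake exposes through its ADLS Gen2 compatible endpoint; the helper `onelake_table_uri` and all workspace, lakehouse and table names are hypothetical and only serve as illustration.

```python
# ASSUMPTION: OneLake's ADLS Gen2 compatible endpoint and URI layout.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Build the abfss:// URI of a Lakehouse table stored in OneLake."""
    for name, value in (("workspace", workspace),
                        ("lakehouse", lakehouse),
                        ("table", table)):
        # Reject empty names and names that would break the path structure.
        if not value or "/" in value:
            raise ValueError(f"{name} must be a non-empty name without '/'")
    return (
        f"abfss://{workspace}@{ONELAKE_HOST}/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


# HYPOTHETICAL names, for illustration only.
print(onelake_table_uri("SalesWorkspace", "SalesLakehouse", "orders"))
```

In a Spark notebook, a path built this way could then be passed to a Delta reader such as `spark.read.format("delta").load(...)`, provided the caller has access to the workspace.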
Microsoft announced today that it will invest in developing migration processes and tools, especially for customers using Azure Synapse Analytics, in order to make the migration effort as easy as possible.
Customers using Azure Databricks can also make use of Microsoft Fabric features, i.e. both products can coexist as described in this blog post from Microsoft.
Migration of Pipelines
As announced today, Microsoft is working towards the preview of the following features for Data Factory migrations by Q2/2024:
• Mounting of Azure Data Factory in Fabric, which enables customers to mount their existing Azure Data Factory (ADF) in Microsoft Fabric. All ADF pipelines will work as is and continue running on Azure, while using Microsoft Fabric.
• Upgrade from Azure Data Factory pipelines to Fabric, i.e. Microsoft will deliver an upgrade experience that empowers customers to test existing data pipelines in Microsoft Fabric using mounting and upgrading data pipelines.
Migration of Data Warehouses
The architectural changes of the Fabric Data Warehouse cannot be backported to either of the old Gen2 engines (Dedicated and Serverless SQL Pools). Data stored in a proprietary format in Gen2 needs to be extracted and stored in the open Delta Parquet format of Microsoft Fabric. To enable a smooth migration, Microsoft has announced the following, all of which is available now:
• Ability to export Dedicated SQL Pool to a SQL Project and import it in Microsoft Fabric.
• PowerShell scripts that convert Gen2 DDL to Microsoft Fabric supported DDL.
• Detailed migration documents with best practices.
In addition, Microsoft has started working on an in-product Migration Assistant that will automatically detect and convert Synapse Gen2 code to Fabric Data Warehouse code including redirection of endpoints. Microsoft targets this to be delivered in 2024.
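To give an idea of what converting Gen2 DDL to Fabric-supported DDL involves, here is a minimal Python sketch. It is not Microsoft's PowerShell tooling and does not reproduce its behavior. The type mapping in `TYPE_MAP` and the removal of the `WITH (...)` table options are our own assumptions about typical incompatibilities and should be checked against the current Fabric Warehouse documentation. The functions `strip_table_options` and `convert_ddl` and the sample table are hypothetical.

```python
import re

# ASSUMPTION: illustrative mapping of Gen2 types to Fabric Warehouse types.
# Verify every entry against the official data type documentation.
TYPE_MAP = {
    "nvarchar": "varchar",
    "nchar": "char",
    "datetime": "datetime2(6)",
    "smalldatetime": "datetime2(6)",
    "money": "decimal(19,4)",
    "smallmoney": "decimal(10,4)",
    "tinyint": "smallint",
}

# Whole-word match, so that e.g. DATETIME2 is not touched by the DATETIME rule.
TYPE_PATTERN = re.compile(r"\b(" + "|".join(TYPE_MAP) + r")\b", re.IGNORECASE)


def strip_table_options(ddl: str) -> str:
    """Remove a WITH (...) clause that follows the column list."""
    match = re.search(r"\)\s*WITH\s*\(", ddl, flags=re.IGNORECASE)
    if match is None:
        return ddl
    depth = 0
    # Start at the "(" that opens the WITH clause and find its matching ")".
    for i in range(match.end() - 1, len(ddl)):
        if ddl[i] == "(":
            depth += 1
        elif ddl[i] == ")":
            depth -= 1
            if depth == 0:
                # Keep the ")" closing the column list, drop the clause.
                return ddl[: match.start() + 1] + ddl[i + 1:]
    return ddl  # Unbalanced parentheses: leave the statement untouched.


def convert_ddl(ddl: str) -> str:
    """Apply both rewrites. Naive: a column named like a type is also hit."""
    ddl = strip_table_options(ddl)
    return TYPE_PATTERN.sub(lambda m: TYPE_MAP[m.group(1).lower()], ddl)


# HYPOTHETICAL Dedicated SQL Pool table, for illustration only.
gen2_ddl = """CREATE TABLE dbo.Sales (
    SaleId INT NOT NULL,
    Customer NVARCHAR(100),
    Amount MONEY,
    SoldAt DATETIME
)
WITH (DISTRIBUTION = HASH(SaleId), CLUSTERED COLUMNSTORE INDEX);"""

print(convert_ddl(gen2_ddl))
```

A regex-based rewrite like this is only suitable for a first assessment. For a real migration, the SQL Project export and the scripts announced by Microsoft are the better starting point.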
Migration of Lakehouses
Microsoft has announced that it has started working on an in-product migration assistant. Furthermore, a new Azure Synapse Spark to Fabric Migration Guidance whitepaper has been published.
Migration of Data Science Workloads
The migration path for a data scientist in Azure Synapse Analytics is similar to that of a Spark data engineer – they will need to consider their notebooks, Spark pools and data. Microsoft recommends starting with the Azure Synapse Spark to Fabric Migration Guidance whitepaper.
Migration of Real Time Analytics
Fabric KQL databases are based on the same technology as Azure Data Explorer (ADX) and Azure Synapse Data Explorer. This means that all current applications, SDKs, integrations, and tools that work with ADX will continue to work smoothly with Fabric KQL databases.
There is a broad set of capabilities to support mixed environments and migrations; some are available now and others will follow in the coming months. Available now:
• Full binary compatibility of APIs, SDKs and tools.
• Create a database shortcut to host a read only, in place, up to date instance of the database in Fabric.
Coming in the next months:
• Migrate an Azure Synapse Data Explorer pool from a Synapse workspace and attach it to a Fabric workspace
• Attach an Azure Data Explorer cluster to a Fabric workspace
• Sync Azure Data Explorer user queries and dashboards into the query sets and dashboards of a Fabric workspace
New customers can explore Fabric’s offerings through a 60-day free trial on https://app.fabric.microsoft.com. During the trial, each user receives a 64 CU trial capacity for any workload. Existing Power BI Premium customers can access Fabric by enabling it in their Fabric admin portal.
For those interested in purchasing Fabric, there are important updates to licensing and pricing. In June 2023, Microsoft announced pay-as-you-go prices for Microsoft Fabric that allow customers to dynamically scale capacity up or down and pause it as needed. Today, Microsoft announced reservation pricing, which allows pre-commitment in one-year increments and saves up to ~40.5% compared to pay-as-you-go prices (excluding Power BI Capacity SKUs). Microsoft has also announced OneLake BCDR and cache storage prices. All these pricing options can be found on the Microsoft Fabric pricing page.
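To make the reservation discount tangible, the following small calculation applies the quoted ~40.5% to a pay-as-you-go bill. It is a sketch under stated assumptions: the monthly price of 1,000 is a made-up figure, not an actual Fabric price, and we assume that a reservation is billed for the whole period whether or not the capacity is used. The function names are our own.

```python
from decimal import Decimal

# Discount quoted in the announcement (an upper bound, "up to ~40.5%").
RESERVATION_DISCOUNT = Decimal("0.405")


def reserved_cost(payg_monthly_cost: Decimal,
                  discount: Decimal = RESERVATION_DISCOUNT) -> Decimal:
    """Monthly cost after applying the reservation discount."""
    cost = payg_monthly_cost * (Decimal(1) - discount)
    return cost.quantize(Decimal("0.01"))


def annual_saving(payg_monthly_cost: Decimal,
                  discount: Decimal = RESERVATION_DISCOUNT) -> Decimal:
    """Saving over a one-year commitment versus running pay-as-you-go 24/7."""
    monthly = payg_monthly_cost - reserved_cost(payg_monthly_cost, discount)
    return monthly * 12


def break_even_utilization(discount: Decimal = RESERVATION_DISCOUNT) -> Decimal:
    """Share of time a capacity must run before a reservation pays off.

    ASSUMPTION: the reservation is billed in full, while pay-as-you-go
    capacity costs nothing while it is paused.
    """
    return Decimal(1) - discount


# HYPOTHETICAL pay-as-you-go price per month, for illustration only.
payg = Decimal("1000.00")
print(reserved_cost(payg))         # 595.00
print(annual_saving(payg))         # 4860.00
print(break_even_utilization())    # 0.595
```

Under these assumptions, a capacity that would be paused more than roughly 40% of the time may be cheaper on pay-as-you-go, while a capacity that runs around the clock benefits from the full discount.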
With its General Availability (GA), Microsoft Fabric can be used for production use cases from today.
Its integration into the full Microsoft ecosystem makes it a unique platform for all data, analytics and AI use cases. It can also be easily combined with an existing data platform and integrations are supported by tools and processes.
The experience of recent months has shown the enormous pace of innovation, and the future looks bright: Fabric is being continuously enhanced, and upcoming features are regularly published on the Microsoft Fabric Roadmap.
Are you looking for a partner for the implementation and use of Microsoft Fabric? Get in touch with us!