Microsoft today announced “Mirroring,” a new feature for its Microsoft Fabric product that allows enterprise customers to replicate and manage external databases within Microsoft’s data warehousing environment. This capability, one of several updates revealed today, is particularly significant because it simplifies the sharing of data from various databases, even those with proprietary formats. This helps customers consolidate their data in Microsoft Fabric, effectively reducing costs.
Microsoft continues to put pressure on its primary cloud computing competitors, Amazon and Google. While Amazon has been slower to adopt open formats, Google has been actively developing open formats but lacks the extensive customer base and service range that Microsoft enjoys. By making data sharing more accessible, Microsoft’s new feature also competes with emerging analytics companies like Databricks and Snowflake.
The introduction of Mirroring is part of Microsoft Fabric’s broader aim to help companies access, analyze, and understand their data uniformly, regardless of where it resides. However, this push towards openness is becoming a standard expectation among cloud data providers, most of which offer various means to replicate or move data from established on-premise databases.
Other updates to Microsoft Fabric include its move to general availability, having already secured 25,000 customers since its preview launch in May. Fabric’s Copilot feature, which uses natural language processing to create data flows, write SQL statements, build reports, and develop machine learning models, has now entered public preview. Moreover, Microsoft announced an expansion of its ISV ecosystem within Microsoft Fabric.
Mirroring helps companies avoid expensive SQL calls to external databases and eliminates the need for complex migration projects, a burden Microsoft refers to as the “integration tax.” This feature allows customers to replicate existing cloud data warehouses and databases into open source formats such as Parquet and Delta, making them accessible within the Fabric environment. Initially, this technology will support Azure Cosmos DB, Azure SQL DB, Snowflake, and MongoDB, with more data sources to be added in 2024.
Mirroring builds on the “Shortcuts” feature introduced in May, which supports multicloud scenarios. Shortcuts lets Fabric’s OneLake virtualize data stored in Amazon’s S3 and Google’s storage without moving or replicating the data, and is now generally available. However, Shortcuts only works on data in open formats like CSV and Parquet.
The London Stock Exchange (LSE), which holds one of the largest data repositories in the financial industry, is collaborating with Microsoft to test the Fabric offering, including the Mirroring technology. Although Fabric is still in its early stages, there are indications that Microsoft might realize its vision of a “Cloud 2.0” through it.
Microsoft’s Arun Ulagaratchagan highlighted that 67% of the Fortune 500 are using Fabric, with 85% of those using three or more workloads within the platform. He also explained that Mirroring uses publicly documented APIs from database and data warehouse vendors to create near real-time database snapshots and keep them synchronized via change data capture feeds. This feature integrates with other Microsoft tools like Power BI, offering significant performance improvements and cost reductions because queries run against the replica in Fabric rather than against the external database.
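To make the change data capture idea concrete, here is a minimal sketch in Python of how a replica could be kept in step with a source table by replaying a change feed. It is an illustration of the general technique only, not Microsoft's implementation: the `ChangeEvent` type, the `apply_changes` function, and the sample rows are all hypothetical, and a real system would persist the replica in a format such as Delta rather than hold it in memory.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One entry in a hypothetical change data capture feed."""
    seq: int                    # position in the source's change log
    op: str                     # "insert", "update", or "delete"
    key: Any                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents (None for deletes)


def apply_changes(snapshot: dict, events: list, last_seq: int = 0) -> int:
    """Apply events newer than last_seq to snapshot in place.

    Returns the highest sequence number applied, which the caller
    stores as a checkpoint for the next sync.
    """
    for event in sorted(events, key=lambda e: e.seq):
        if event.seq <= last_seq:
            continue  # already applied, so replaying a feed is harmless
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event {event.seq} has no row")
            snapshot[event.key] = dict(event.row)
        elif event.op == "delete":
            snapshot.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
        last_seq = event.seq
    return last_seq


# Initial full copy of the source table, keyed by primary key (made-up data).
replica = {
    1: {"name": "Ada", "plan": "free"},
    2: {"name": "Bo", "plan": "pro"},
}

# Changes that happened at the source after the initial copy was taken.
feed = [
    ChangeEvent(seq=1, op="update", key=1, row={"name": "Ada", "plan": "pro"}),
    ChangeEvent(seq=2, op="delete", key=2),
    ChangeEvent(seq=3, op="insert", key=3, row={"name": "Cy", "plan": "free"}),
]

checkpoint = apply_changes(replica, feed)
```

Because only the changed rows travel over the wire after the initial copy, analytics tools can query the replica without issuing repeated queries against the source, which is the cost saving the article describes.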
Customers appreciate that Fabric integrates Azure OpenAI’s generative AI capabilities, simplifies data architecture, and unifies the business model, so that a single product powers all data workloads.
Microsoft aims to attract as much data as possible into Fabric and its supporting cloud platform, Microsoft Azure. This strategy involves drawing data from competitors like Snowflake, known for its proprietary data environment and perceived high costs. Microsoft also has a strategic partnership with Databricks, whose platform is offered to Azure customers as the closely integrated Azure Databricks.
Fabric’s integration with AI and data lake technologies positions it strongly against competitors as Microsoft aims to become the leading enterprise data provider and capitalize on the AI era’s growth potential.