Microsoft Fabric gets new generative AI tech, more openness
Microsoft unveiled new features for Fabric, including an AI assistant in Power BI, real-time analytics capabilities and support for Apache Iceberg.
The features were revealed on May 21 during Build, a Microsoft user conference for developers hosted both virtually and in person in Seattle.
Fabric, first unveiled in May 2023 and made generally available in November, is an AI-powered data management and analytics suite that joins the capabilities of Data Factory, Azure Synapse Analytics and Power BI in a single platform. The combination aims to enable seven data workloads including integration, management and analysis.
When the vendor first introduced the platform a year ago, Microsoft chairman and CEO Satya Nadella called it the tech giant’s most important new data-related set of tools since the launch of SQL Server in 1989.
Arun Ulagaratchagan, Microsoft’s corporate vice president of Azure Data, said during a breakout session at Build that the tech giant created Fabric based on several main principles. They include empowering all business users, providing a complete analytics platform that unifies analysis in a single environment with governance and security, powering the platform with AI Copilots to aid exploration and fuel better insights, and being data lake-centric and open.
The new features address those principles and help move Fabric toward Microsoft’s goal of making it a single location for an organization’s data needs, according to Doug Henschen, an analyst at Constellation Research.
“The Build announcements point to a Fabric offering that is more hybrid and multi-cloud capable and one that lives up to the idea of fabrics, which is to let you access the data where it lives without having to move or copy,” he said.
New capabilities
Generative AI has been the dominant trend in data management and analytics ever since OpenAI’s launch of ChatGPT in November 2022 significantly improved the capabilities of large language models.
Data management and analytics platforms have historically been complex, requiring coding knowledge to work with data and data literacy training to interpret it. Vendors have tried to simplify their tools, but despite those efforts, data management and analytics have remained specialized skills, with only a quarter of employees within organizations having the expertise to work with data.
Generative AI has the potential to change that by enabling true natural language interactions. When combined with an organization’s proprietary data, generative AI models reduce coding requirements and lessen the need for data literacy training.
As a result, numerous vendors have made generative AI the main focus of their product development. Among the most developed of those tools are AI assistants that enable users to essentially converse with their data in natural language.
For example, Microsoft rival Amazon's Q is now generally available, as are assistants from vendors such as Informatica and MicroStrategy.
Microsoft first unveiled Copilot — its term for an AI assistant — in 2021. Since then, it has added generative AI to Copilots and launched such assistants for Office 365, Teams and a host of other tools.
Copilot in Fabric was introduced in public preview in November and is now generally available for Power BI.
With the Copilot, Power BI users can use natural language to quickly visualize data, develop reports, create dataflows, build machine learning models and generate summaries.
Combined, they represent a robust set of capabilities that go beyond the simple question-and-answer capabilities of some AI assistants and have the potential to help more non-technical users work with Power BI, according to Henschen.
“Rather than just serving up text-based answers and simple visuals in response to a natural language question, these multi-faceted responses provide more context and possibilities for exploration in case the question wasn’t quite as clear as it could be,” he said. “This will be particularly helpful for inexperienced users and I think we’ll see assistants and copilots take this approach.”
However, Henschen noted that while the feature is significant, given its potential to let non-technical workers use Power BI and to improve the efficiency of data experts, Microsoft has been slow to roll out its generative AI capabilities in Fabric.
Although Microsoft has been developing Copilots for three years, Power BI is the first tool within Fabric, and the only one to date, with a generally available Copilot.
Meanwhile, other tech giants such as AWS and Google have been faster to add AI assistants to data management and analytics tools.
“Microsoft clearly has comprehensive plans to have Copilots within every service. But for all its talk about GenAI over the last year, I’m surprised that so few Copilots are generally available at this point,” Henschen said.
As important as AI assistants are, Microsoft’s plan to add support for Apache Iceberg in Fabric’s OneLake data lake service is every bit as significant as the Copilot for Power BI, according to Henschen.
While the Copilot addresses the AI principle of Fabric, support for Apache Iceberg addresses the openness of the platform.
Fabric already supports Delta Lake, the open source storage format developed by Databricks that underpins the Azure Databricks integration with Fabric. Through that integration, Microsoft and Databricks provide a bi-directional connection that enables joint customers to use Databricks and Fabric in concert with each other.
Just as Databricks uses Delta Lake as its storage format, Databricks competitor Snowflake uses Apache Iceberg.
Apache Iceberg in Fabric OneLake, still under development, represents an expansion of the partnership between Microsoft and Snowflake and will provide joint Snowflake and Fabric users with bi-directional integration capabilities akin to those offered through Azure Databricks.
“The biggest Fabric news from Build [is] the opening up of the platform with the addition of Apache Iceberg support, and with it, bi-directional integration with Snowflake,” Henschen said. “This puts Snowflake on an equal footing in the Fabric ecosystem with Azure Databricks, which was previously supported through the Delta standard.”
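For a concrete picture of what format-level openness buys, the following minimal PySpark sketch, which is illustrative rather than drawn from Microsoft's announcement, shows a single engine reading a Delta table and an Iceberg table in place. The storage path, catalog name and table names are hypothetical, and the delta-spark and Iceberg Spark runtime packages are assumed to be installed.

```python
from pyspark.sql import SparkSession

# Hypothetical OneLake location; real workspace and lakehouse names would differ.
# Authentication to the storage account is omitted for brevity.
ONELAKE = "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/MyLakehouse.Lakehouse"

spark = (
    SparkSession.builder
    .appName("delta-iceberg-interop-sketch")
    # Assumes the delta-spark and iceberg-spark-runtime packages are on the classpath.
    .config(
        "spark.sql.extensions",
        "io.delta.sql.DeltaSparkSessionExtension,"
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    )
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    # A Hadoop-style Iceberg catalog pointed at the same storage (hypothetical).
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", f"{ONELAKE}/Tables")
    .getOrCreate()
)

# Read a Delta table by path, the format Fabric and Azure Databricks already share.
delta_df = spark.read.format("delta").load(f"{ONELAKE}/Tables/sales")

# Read an Iceberg table through the named catalog, the format Snowflake writes.
iceberg_df = spark.read.table("lake.analytics.sales")  # hypothetical namespace and table

# Neither read requires copying the data into a second store.
delta_df.show(5)
iceberg_df.show(5)
```

The point of the sketch is the one Henschen makes: both reads go against data where it already lives, rather than against copies loaded into another system.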
Beyond Apache Iceberg in Fabric OneLake and Copilot in Fabric for Power BI, new Fabric capabilities include the following:
- AI skills, a generative AI tool in preview that enables users to ask questions of their data without having to configure the data beforehand.
- Real-Time Intelligence, a feature now in preview aimed at enabling real-time decisions by combining streaming analysis and real-time monitoring capabilities to provide visibility into streaming data from when it’s first ingested.
- A hub for streaming data, now in preview, that brings together data in motion from Fabric, other Microsoft data sources, third-party cloud providers and external sources, and aims to simplify its discovery and management.
- Fabric Workload Development Kit, a software development kit currently in preview that enables users to build applications as native workloads within Fabric so the applications provide a consistent experience for end users in their Fabric environment.
- An API for the open source query language GraphQL, in preview, aimed at helping data experts access data from multiple sources within Fabric through a single query endpoint (see the sketch after this list).
- A OneLake shortcut in preview that enables users to connect to data from on-premises and other network-restricted sources.
- Expanded partnerships with Adobe and Databricks that add to previously existing integrations with Fabric.
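On the GraphQL item noted above, a GraphQL API in Fabric exposes a single HTTP endpoint that accepts standard GraphQL queries over the sources attached to it. The short Python sketch below is illustrative only; the endpoint URL, token handling and field names are hypothetical placeholders, not Microsoft's documented schema.

```python
import requests

# Hypothetical values: Fabric generates a unique endpoint per GraphQL API item,
# and the available query fields depend on the data sources attached to it.
ENDPOINT = "https://<your-fabric-graphql-endpoint>/graphql"
TOKEN = "<azure-ad-bearer-token>"  # acquired separately, e.g. with MSAL

# One GraphQL query in place of separate connections to each underlying source.
query = """
query TopCustomers($minTotal: Float!) {
  customers(filter: { totalSpend: { gte: $minTotal } }, first: 10) {
    items {
      customerId
      name
      totalSpend
    }
  }
}
"""

response = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"query": query, "variables": {"minTotal": 10000.0}},
    timeout=30,
)
response.raise_for_status()
print(response.json().get("data"))
```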
Ulagaratchagan noted that data fragmentation is a substantial problem for many organizations.
One of Fabric's goals is to bring an organization's data together in one platform, eliminating that fragmentation so that users can operationalize their data to inform AI models and applications.
Updates such as the OneLake shortcut, expanded partnerships and support for Apache Iceberg address data fragmentation and help customers access high-quality data to inform decisions.
“AI is rapidly changing the world,” Ulagaratchagan said. “But the AI revolution is based on data … so it’s become incredibly important for every customer, every organization, every developer to get their data estate ready to leverage the power of AI. Unfortunately, it’s a lot harder than it needs to be because of a ton of fragmentation.”
Conversations with customers often focus on reducing fragmentation and unifying data, he continued.
“This is what we’re doing with Microsoft Fabric,” Ulagaratchagan said. “We’re going from isolated components to a unified stack.”
However, making it easy to reduce fragmentation is a work in progress, Henschen noted. As a result, it's significant that Microsoft is adding features that spare Fabric users the onerous copying and moving of data otherwise required to unify it.
“Microsoft has differentiated Fabric with the promise of using one data platform to support [numerous workloads],” Henschen said. “These Build announcements … are important moves that address what was a weakness of Fabric in previously requiring more loading or copying of data into Azure.”
Next steps
While much of what Microsoft unveiled for Fabric this week is in preview, the tech giant has a detailed roadmap for further Fabric development.
Copilot for Data Science and Data Engineering in Microsoft Fabric and Copilot for Data Factory are both scheduled for general availability during the third quarter. Data governance measures planned for the next few months include master data management capabilities and secure connectivity between Fabric and Azure Storage accounts. And Power BI updates will include new authoring capabilities that incorporate natural language processing.
“Fabric is massive for Microsoft,” Ulagaratchagan said.
Henschen, meanwhile, said Microsoft would be wise to address cost control as it continues to develop Fabric.
As organizations have migrated more of their data to the cloud, and as they have begun to develop more AI models and applications amid a surge of interest fueled by generative AI, cloud computing expenditures have often exceeded expectations.
Any measures that vendors can take to help customers control spending, including governance that limits use, are therefore critical.
“Despite Microsoft’s sweeping assurances of ease of use, there’s a lot of complexity in managing and planning capacity and controlling costs across seven types of workloads,” Henschen said. “I want to see … rich and robust centralized cost and governance visibility, management capabilities and guardrails. Vendors that ignore cost considerations inevitably end up losing market share.”
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.