Databricks Data and AI Summit 2024: The biggest innovations
Databricks’ annual summit has always been a party for data ecosystem stakeholders. The company shares new technologies, partnerships and developments that make working with data assets – whether structured or unstructured – easier than ever. This year, the summit saw the same party continue, albeit with one major (and expected) shift: a focus on AI.
In his keynote, CEO Ali Ghodsi shared several innovations at the intersection of data and AI as part of the company’s broader effort to help teams make the most of their governed datasets on the Databricks Data Intelligence Platform. This included upgrades to Mosaic AI, the company’s platform for AI development, a new model for image generation and a generative AI-driven offering for better and faster data analytics.
Below is a rundown of all major announcements:
1. Unity Catalog goes open-source
Taking on Snowflake’s Polaris Catalog, Databricks open-sourced its Unity Catalog under an Apache 2.0 license with OpenAPI specification, server, and clients. The move means other firms can take the underlying architecture and code to set up their catalogs supporting data in any format, including Iceberg and Delta/Hudi via UniForm, and interoperability with all major cloud platforms and compute engines. The code for the catalog was published live on stage, while Polaris Catalog is expected to go open source over the next 90 days.
2. Mosaic AI upgrades for compound AI systems
Mosaic AI, the company’s suite of tools for building AI applications, got a major upgrade to help teams build trusted, production-grade compound AI systems. This included a new Mosaic AI Model Training product, an AI Agent framework, an Evaluation framework as well as an AI Tools Catalog and AI Gateway for governance and trust. All offerings, except the AI Tools Catalog, are in public preview starting today.
3. New text-to-image model for enterprises
Databricks also announced the private preview launch of Shutterstock ImageAI, a text-to-image generative AI model that provides enterprises with high-fidelity, trusted images for different business use cases. The model was pre-trained with Mosaic AI, using Shutterstock’s trusted image collection.
It is live on Shutterstock’s image generator and will be available for fine-tuning via Mosaic AI as well as for application integration via API.
4. Databricks AI/BI for intelligent analytics
For enterprises looking to democratize access to analytics and insights, Databricks announced the launch of Databricks AI/BI, a compound AI system that sits atop the Databricks Data Intelligence Platform and uses an ensemble of AI agents, surfaced through two products (Dashboards and Genie), to reason about business questions and generate useful natural language answers and visualizations.
Each agent is responsible for a narrow but important task, such as planning, SQL generation, explanation, visualization or result certification. They are further supported by other components such as a response ranking subsystem and a vector index. The offering is available to all Databricks SQL Pro and Serverless customers, with Dashboards generally available and Genie in public preview starting today.
5. Databricks LakeFlow for simplified data engineering
In addition to AI/BI, Databricks also debuted LakeFlow, a single experience built atop its Data Intelligence Platform to unify and simplify all aspects of data engineering, from data ingestion to transformation and orchestration.
Building and maintaining data pipelines has long required stitching together complex tools and integrations, and LakeFlow aims to remove much of that burden. The offering ingests data from different sources and then automates pipeline deployment, operation and monitoring with built-in support for CI/CD and quality checks at scale.
It is yet to enter preview, although Databricks has opened a waitlist where users can sign up for early access.
6. Partnerships with Nvidia and Gretel
Finally, Databricks announced major partnerships with Nvidia and Gretel.
The partnership with Nvidia focuses on adding native support for CUDA-accelerated computing in Databricks’ next-generation vectorized query engine, Photon, to deliver improved speed and efficiency when handling data warehousing and analytics workloads. Meanwhile, the engagement with Gretel makes the company an ISV technology partner providing high-quality synthetic datasets to build and customize machine learning models on Databricks’ platform.