StarTree broadly enhances Apache Pinot-based analytics platform
StarTree Inc., a startup commercializing the open-source Apache Pinot real-time data analytics platform, is adding new observability and anomaly detection capabilities to its managed service along with vector search functionality for Pinot.
Apache Pinot is an open-source, distributed, real-time analytics engine originally developed by LinkedIn Inc. It’s designed to handle large-scale, low-latency data ingestion and querying and can execute complex analytical queries in real time. It’s highly regarded for efficiency and speed.
StarTree ThirdEye, a real-time anomaly detection and root cause analysis application, is now generally available. It provides automated, multi-dimensional metric monitoring to identify the root causes of problems, a critical function for businesses like delivery and ride-sharing services that must correlate large volumes of real-time data.
ThirdEye uses statistical algorithms to learn how data flows and adjusts thresholds automatically, said Head of Product Chinmay Soman. “Traditional solutions have static thresholds; they don’t learn external factors like seasonality,” he said. “ThirdEye can follow patterns and learn them over time to generate alerts.”
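To illustrate the difference Soman describes, here is a minimal sketch of a threshold that follows a seasonal pattern instead of staying fixed. It is not ThirdEye's algorithm, which the article does not detail. The function names `seasonal_bounds` and `is_anomaly`, the three-standard-deviation band and the sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev


def seasonal_bounds(history, period, k=3.0):
    """Learn a (low, high) band for each slot in a repeating cycle.

    history: metric values sampled at a fixed interval
    period:  number of samples in one seasonal cycle
    k:       band half-width in standard deviations (illustrative choice)
    """
    bounds = []
    for slot in range(period):
        # Every value observed at this position in the cycle.
        values = history[slot::period]
        m = mean(values)
        s = stdev(values) if len(values) > 1 else 0.0
        bounds.append((m - k * s, m + k * s))
    return bounds


def is_anomaly(value, index, bounds):
    """Check a new value against the band learned for its slot in the cycle."""
    low, high = bounds[index % len(bounds)]
    return value < low or value > high


# Hypothetical data: three cycles of four slots (quiet, busy, peak, quiet).
history = [10, 50, 90, 12,
           11, 52, 88, 10,
           9, 48, 92, 11]
bounds = seasonal_bounds(history, period=4)

# The same reading is normal at the peak but anomalous in a quiet slot.
print(is_anomaly(90, 2, bounds))  # False
print(is_anomaly(90, 0, bounds))  # True
```

A single static threshold could not separate these two cases: set low enough to catch the quiet-slot spike, it would fire on every peak.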
The system also fires off many queries to narrow down the dimensions that can cause metrics to change. “Many times, people don’t know what to monitor,” Soman said. “We help with previewing the time-series data metrics that may affect the business.”
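One simple way to picture that narrowing step is to compare a baseline period with the current period and rank each dimension value by how much of the metric's change it accounts for. The sketch below assumes that approach. It is not a description of ThirdEye's internals, and the `top_contributors` function, the `city` and `platform` dimensions and the order counts are hypothetical.

```python
from collections import defaultdict


def top_contributors(baseline, current, dimensions, metric):
    """Rank (dimension, value) pairs by their share of the metric's change."""
    results = []
    for dim in dimensions:
        # Equivalent to one group-by query per dimension on each period.
        delta = defaultdict(float)
        for row in baseline:
            delta[row[dim]] -= row[metric]
        for row in current:
            delta[row[dim]] += row[metric]
        for value, change in delta.items():
            results.append((dim, value, change))
    # Largest absolute change first.
    results.sort(key=lambda r: abs(r[2]), reverse=True)
    return results


# Hypothetical order counts for a delivery service.
baseline = [
    {"city": "Austin", "platform": "ios", "orders": 100},
    {"city": "Austin", "platform": "android", "orders": 100},
    {"city": "Boston", "platform": "ios", "orders": 100},
    {"city": "Boston", "platform": "android", "orders": 100},
]
current = [
    {"city": "Austin", "platform": "ios", "orders": 98},
    {"city": "Austin", "platform": "android", "orders": 40},
    {"city": "Boston", "platform": "ios", "orders": 101},
    {"city": "Boston", "platform": "android", "orders": 45},
]

ranked = top_contributors(baseline, current, ["city", "platform"], "orders")
for dim, value, change in ranked:
    print(dim, value, change)
```

In this made-up data, the drop is concentrated in the Android platform rather than in either city, which is the kind of conclusion a drill-down is meant to surface.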
The company is offering a private preview of a write application programming interface that can be used to integrate with external applications such as the Debezium change data capture platform, Fivetran Inc.’s automated data movement application, or dbt Labs Inc.’s data development platform.
“The write API is aimed at making life easier for developers,” Soman said. “The only way to get data into StarTree used to be through batch loading. That’s a friction point for developers who don’t use [Apache] Kafka. They can now insert a query and load via the API.” Kafka is a popular stream-processing platform.
Limited observability
The company is also showing a private preview of an observability feature that provides query support for metrics, logs and traces. “Observability is a big pain point” for companies doing real-time analytics, Soman said.
StarTree doesn’t aim to compete with full-stack observability players such as Datadog Inc. and New Relic Inc. “For companies that want an all-in solution, this may not be the right starting point, but for companies that are opinionated about what agents run on their platforms, they need a disaggregated solution,” Soman said. “We think it’s a differentiated approach because of our ability to scale, the speed of queries and cost efficiency. We can store in the cloud with sub-second latencies.”
The new vector indexing feature makes StarTree an option for large language model applications, which often rely on vector databases for similarity search. The first implementation supports Hierarchical Navigable Small World graphs, a data structure and algorithm for approximate nearest neighbor search in high-dimensional spaces that provides high performance for vector similarity search.
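For context, the sketch below shows the exact, brute-force form of vector similarity search, which compares a query against every stored vector. Approximate nearest neighbor indexes exist to avoid that full scan on large collections, trading a little accuracy for speed. This is not an implementation of the graph-based index or of anything in Pinot, and the function names and sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=2):
    """Return the names of the k stored vectors most similar to the query.

    Scans every vector, so cost grows linearly with the collection size.
    """
    scored = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


# Hypothetical three-dimensional embeddings.
vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}

result = nearest([1.0, 0.05, 0.0], vectors, k=2)
print(result)  # ['doc_a', 'doc_b']
```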
A new visualization integration is now available for Tableau Software LLC’s namesake business intelligence platform and in private preview for the open-source Grafana visualization engine.
Finally, StarTree Cloud is adding a “free forever” tier that offers access to its full suite of features, including low-code data ingestion and an enhanced query console in a serverless cloud environment.
Image: xresch/Pixabay