Optimizing Speed and Scale of Real-Time Analytics Using Apache Pulsar and Apache Pinot - Mary Grygleski, DataStax
Apache Pulsar is a new-generation platform that offers enterprise-grade event streaming and processing capabilities built for today's cloud-native environments. But what do you do if you want to perform user-facing, ad-hoc, real-time analytics too? That's where Apache Pinot comes in. Apache Pinot is a real-time distributed OLAP datastore used to deliver scalable real-time analytics with low latency. It can ingest data from batch data sources (S3, HDFS, Azure Data Lake, Google Cloud) as well as streaming sources such as Pulsar. Pinot is used extensively at LinkedIn and Uber to power many analytical applications, serving 250k+ queries per second while ingesting 1 million+ events per second. Apache Pulsar is a highly performant, distributed, fault-tolerant, real-time publish-subscribe and queueing messaging platform that operates seamlessly in a cloud-native environment, with support for geo-replication, multi-tenancy, data warehouse and data lake integrations, and beyond. It is a tried-and-true platform used by major enterprises such as Yahoo, Verizon, GM, and Comcast. Together, Apache Pulsar and Apache Pinot represent a blissful union in #OSS "heaven"!