Apache Druid is an open-source columnar database known for high performance at scale; its largest deployments comprise thousands of servers. But no matter the scale, high performance starts with good fundamentals. This talk will dive into those fundamentals by exploring the inner workings of a single data server.
We’ll cover how Apache Druid stores data, what kinds of compression it uses, how it indexes data, how the storage engine is linked with the query processing engine, and how the system handles resource management and multithreading. Together, all these pieces enable Apache Druid to process billions of records per second on a single data server.
Imply is a real-time data platform for cost-effective, low-latency analytics. Uniquely, it provides consistent sub-second responses to ad hoc queries against petabyte-scale data, even with high user concurrency. Imply is used for clickstream analytics; application, network and service performance monitoring; IoT analytics; fraud detection; and more. Imply powers user-facing analytics applications and serves as a backend for highly concurrent APIs. Companies such as Twitter, Charter (Spectrum), Twitch and DBS (Southeast Asia’s largest bank) trust Imply to put analytics into the hands of their trained analysts and non-technical business people.
Connect
Website: [ Link ]
LinkedIn: [ Link ]
Twitter: [ Link ]
GitHub: [ Link ]
SlideShare: [ Link ]
Inside Apache Druid’s storage and query engine
Tags
next data analysts, next data analytics, data analytics druid, data gcp, druid query engine, imply bi, apachedruid, apache druid, druidio, bi tools, business intelligence, apache druid business intelligence, interactive dashboard, data dashboard, responsive dashboard, real time data, imply real time data, imply cloud, imply cloud 20, apache druid 2020, new cloud releases, apache druid query engine, inside apache druid query engine, druid storage and query engine