Visit: [ Ссылка ] for all sessions
Realizing the Vision of the Data Lakehouse
Ali Ghodsi
Data warehouses have a long history in decision support and business intelligence applications. But, data warehouses were not well suited to dealing with the unstructured, semi-structured, and streaming data common in modern enterprises. This led to organizations building data lakes of raw data about a decade ago. But, they also lacked important capabilities. The need for a better solution has given rise to the data lakehouse, which implements similar data structures and data management features to those in a data warehouse, directly on the kind of low cost storage used for data lakes.
This keynote by Databricks CEO, Ali Ghodsi, explains why the open source Delta Lake project takes the industry closer to realizing the full potential of the data lakehouse, including new capabilities within the Databricks Unified Data Analytics platform to significantly accelerate performance. In addition, Ali will announce new open source capabilities to collaboratively run SQL queries against your data lake, build live dashboards, and alert on important changes to make it easier for all data teams to analyze and understand their data.
Introducing Apache Spark 3.0:
A retrospective of the Last 10 Years, and a Look Forward to the Next 10 Years to Come.
Matei Zaharia and Brooke Wenig
In this keynote from Matei Zaharia, the original creator of Apache Spark, we will highlight major community developments with the release of Apache Spark 3.0 to make Spark easier to use, faster, and compatible with more data sources and runtime environments. Apache Spark 3.0 continues the project’s original goal to make data processing more accessible through major improvements to the SQL and Python APIs and automatic tuning and optimization features to minimize manual configuration. This year is also the 10-year anniversary of Spark’s initial open source release, and we’ll reflect on how the project and its user base has grown, as well as how the ecosystem around Spark (e.g. Koalas, Delta Lake and visualization tools) is evolving to make large-scale data processing simpler and more powerful.
Delta Engine: High Performance Query Engine for Delta Lake
Reynold Xin
How Starbucks is Achieving its ‘Enterprise Data Mission’ to Enable Data and ML at Scale and Provide World-Class Customer Experiences
Vish Subramanian
Starbucks makes sure that everything we do is through the lens of humanity – from our commitment to the highest quality coffee in the world, to the way we engage with our customers and communities to do business responsibly. A key aspect to ensuring those world-class customer experiences is data. This talk highlights the Enterprise Data Analytics mission at Starbucks that helps making decisions powered by data at tremendous scale. This includes everything ranging from processing data at petabyte scale with governed processes, deploying platforms at the speed-of-business and enabling ML across the enterprise. This session will detail how Starbucks has built world-class Enterprise data platforms to drive world-class customer experiences. Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here. [ Ссылка ]
Ещё видео!