As the Apache Spark user base grows, the developer community is working to adapt it to an ever-wider range of use cases. The year 2014 saw fast adoption of Spark in the enterprise and major improvements in its performance, scalability, and standard libraries. In 2015, we want to make Spark accessible to a wider set of users through new high-level APIs for data science: machine learning pipelines, data frames, and R language bindings. In addition, we are defining extension points that let Spark grow as a platform, making it easy to plug in data sources, algorithms, and external packages. Like all work on Spark, these APIs are designed to plug seamlessly into Spark applications, giving users a unified platform for streaming, batch, and interactive data processing.
From Strata + Hadoop San Jose 2015.
About Matei Zaharia (Databricks):
Matei Zaharia started the Spark project at UC Berkeley and is currently CTO of Databricks. He serves as Spark’s vice president at Apache. In spring 2015, he is also starting as an assistant professor at MIT.