Many companies are migrating their data warehouses from traditional RDBMS to big data platforms, and in particular to Apache Spark. This usually requires a lot of time and effort: most developers are used to working with RDBMS and need to ramp up quickly on big data technologies to reach that goal. Having faced this problem multiple times at DBS Bank, we implemented a Spark-based application that helps with the migration process. The application embeds the Spark engine and offers a web UI that lets users create, run, test and deploy jobs interactively. Jobs are primarily written in native SparkSQL or in other flavours of SQL (e.g. TDSQL).
In the latter case, an intermediate layer translates vendor-specific SQL constructs into Dataset operations (whenever possible) in order to leverage the features of the Catalyst engine. To offer RDBMS-like operations, the software is integrated with CarbonData as a storage layer, allowing users to perform update and delete operations on data. Among other things, the UI lets users validate procedures and run data comparison tasks between different datasets. To simplify deployment, each job can be packaged and released individually. The software produces a metadata file that drives the execution of the same transformations defined in the UI, so that they can run as batch jobs in a production environment.
During the talk we will showcase all of the above features and explain how each of them helps ETL developers at DBS Bank migrate traditional RDBMS SQL code to Spark.
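As a rough illustration of the data comparison idea mentioned above, here is a minimal sketch in plain Python. It is not the DBS Bank implementation, which the abstract does not describe; the helper name `compare_datasets`, the sample rows and the `id` key column are all hypothetical, and the sketch assumes each key appears at most once per dataset. In Spark itself, a similar check could be built with joins or with `Dataset.exceptAll`.

```python
from typing import Dict, Iterable, List, Mapping

Row = Mapping[str, object]


def compare_datasets(legacy: Iterable[Row], migrated: Iterable[Row], key: str) -> Dict[str, List]:
    """Hypothetical helper: compare two row sets by a unique key column."""
    # Index both datasets by key (assumes keys are unique within each dataset).
    legacy_by_key = {row[key]: dict(row) for row in legacy}
    migrated_by_key = {row[key]: dict(row) for row in migrated}

    # Keys present on one side only.
    only_in_legacy = sorted(legacy_by_key.keys() - migrated_by_key.keys())
    only_in_migrated = sorted(migrated_by_key.keys() - legacy_by_key.keys())

    # Keys present on both sides whose rows differ in at least one column.
    mismatched = sorted(
        k
        for k in legacy_by_key.keys() & migrated_by_key.keys()
        if legacy_by_key[k] != migrated_by_key[k]
    )
    return {
        "only_in_legacy": only_in_legacy,
        "only_in_migrated": only_in_migrated,
        "mismatched": mismatched,
    }


# Made-up sample data: output of a legacy RDBMS job vs. the migrated Spark job.
legacy_rows = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": 250},
    {"id": 3, "amount": 75},
]
migrated_rows = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": 255},
    {"id": 4, "amount": 40},
]

report = compare_datasets(legacy_rows, migrated_rows, key="id")
print(report)
```

An empty report (three empty lists) would suggest that the migrated job reproduces the legacy output for the compared key and columns.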
About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.