Holden Karau, OSS Engineer, Data Platform Engineering, talks about why reliable data pipelines matter and how to build them, covering tools that range from testing to validation and auditing. The talk uses Apache Spark as an example, but the concepts generalize regardless of your specific tools.
Some related projects include:
[ Link ]
[ Link ]
[ Link ]
and
[ Link ].
#netflix
#datascience
#dataengineering
#etl
#bigdata