Whitepaper: [ Link ]
Learn, from start to finish, how to build a GPU cluster for deep learning. We'll cover the entire process, including cluster-level design, rack-level design, node-level design, CPU and GPU selection, power distribution, storage, and networking.
This talk is based on the Lambda Echelon GPU Cluster whitepaper. The whitepaper can be found above.
Slides for the talk can be found here:
[ Link ]
Errata:
- Slide 46 contains an erroneous diagram showing a connection from the storage server to the compute fabric network. The storage server does not connect to the compute fabric network. The correct diagram is available in the whitepaper.