USENIX ATC '21 - ZeRO-Offload: Democratizing Billion-Scale Model Training