1. A Survey on Distributed Machine Learning. https://dl.acm.org/doi/abs/10.1145/3377454
2. Horovod: fast and easy distributed deep learning in TensorFlow. https://arxiv.org/abs/1802.05799
3. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. https://arxiv.org/abs/1811.06965
4. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. https://arxiv.org/abs/1706.02677