FairScale is a PyTorch extension library for high-performance and large-scale training. FairScale makes the latest distributed training techniques available in the form of composable modules and easy-to-use APIs.
- Optimizer, Gradient, and Model Sharding
- Efficient memory usage with Activation Checkpointing
- Train large models on a single GPU by offloading parameters to CPU with OffloadModel
- Scale to larger batch sizes without retuning the learning rate using AdaScale
- Model sharding using Pipeline Parallel
- Tooling to diagnose and fix memory problems
- Efficient Data Parallel Training with SlowMo Distributed Data Parallel