Another start to the month, another Hugging Face Accelerate release! We've been cooking, so let's talk about what's new 👨‍🍳

* There's a new profiler in town that can help you collect performance metrics during training and inference, and you can then visualize the traces in tools like Chrome's trace viewer. Check out the new docs (linked below) for more info! (First sketch after this list.)
* Thanks to Stas Bekman we were able to track down, identify, and fix a slowdown during `import accelerate`, reducing our import time by over 60%! We've taken steps to ensure such slowdowns can't go unnoticed again, and we hope you enjoy being able to accelerate a bit faster!
* We've added support for more complex PyTorch DDP communication hooks, letting you customize how gradients are communicated across workers. (New docs linked below; second sketch after this list.)
* With XPU support now native in PyTorch, we've upstreamed our integration so you can switch to the native implementation right away (note: this requires PyTorch >= 2.4). (Third sketch after this list.)
* And so, so much more.

Enjoy the release!
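To give a feel for the profiler workflow, here's a minimal sketch based on the usage guide linked in the comments; the `ProfileKwargs` fields shown are assumptions taken from those docs, so double-check them against your installed version:

```python
import torch
import torch.nn as nn
from accelerate import Accelerator, ProfileKwargs

model = nn.Linear(128, 64)
inputs = torch.randn(32, 128)

# Configure profiling via a kwargs handler (fields assumed per the profiler docs)
profile_kwargs = ProfileKwargs(
    activities=["cpu"],         # add "cuda" to also capture GPU kernels
    record_shapes=True,
    output_trace_dir="trace",   # emits a Chrome trace you can open in chrome://tracing
)
accelerator = Accelerator(kwargs_handlers=[profile_kwargs])
model = accelerator.prepare(model)

# Profile a forward pass; the context manager yields a torch.profiler-style object
with accelerator.profile() as prof:
    with torch.no_grad():
        model(inputs.to(accelerator.device))

# Print the hottest ops to the console as a quick sanity check
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```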
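Likewise, here's a minimal sketch of opting into a DDP communication hook through a kwargs handler; `DDPCommunicationHookType` and the `comm_hook` argument are assumptions based on the new docs, so verify the names against your version:

```python
import torch.nn as nn
from accelerate import Accelerator, DistributedDataParallelKwargs
from accelerate.utils import DDPCommunicationHookType

# Compress gradients to fp16 before the all-reduce to cut communication volume
ddp_kwargs = DistributedDataParallelKwargs(comm_hook=DDPCommunicationHookType.FP16)
accelerator = Accelerator(kwargs_handlers=[ddp_kwargs])

# The hook is registered when accelerate wraps the model in DDP
# (run under `accelerate launch` with more than one process)
model = accelerator.prepare(nn.Linear(128, 64))
```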
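And for XPU, a quick way to confirm you're on the native path; the `torch.xpu` namespace mirrors `torch.cuda` in PyTorch >= 2.4, and the `hasattr` guard below is a defensive assumption for older builds:

```python
import torch
from accelerate import Accelerator

# Native XPU support ships with PyTorch >= 2.4
if hasattr(torch, "xpu") and torch.xpu.is_available():
    print(f"Found {torch.xpu.device_count()} XPU device(s)")

# Accelerator picks up the XPU automatically when one is present
accelerator = Accelerator()
print(accelerator.device)
```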
To read the full release notes, check them out here: https://github.com/huggingface/accelerate/releases/tag/v0.32.0
Does the library support simultaneous tensor parallelism and pipeline parallelism configurations?
Technical Lead for Accelerate at HuggingFace: To learn more about the new profiler, please see the docs here: https://huggingface.co/docs/accelerate/usage_guides/profiler