
Quickstart: typos and references (#420)

- fix minor typo in Optimizer docstring
- refer to Optimizer docs from quickstart.md
justheuristic 3 years ago
parent
commit
c328698b86

+ 1 - 0
docs/user/quickstart.md

@@ -183,6 +183,7 @@ we show how to use a more advanced version of DecentralizedOptimizer to collabor
 
 If you want to learn more about each individual component,
 - Learn how to use `hivemind.DHT` using this basic [DHT tutorial](https://learning-at-home.readthedocs.io/en/latest/user/dht.html),
+- Read more on how to use `hivemind.Optimizer` in its [documentation page](https://learning-at-home.readthedocs.io/en/latest/modules/optim.html), 
 - Learn the underlying math behind hivemind.Optimizer in [Diskin et al., (2021)](https://arxiv.org/abs/2106.10207), 
   [Li et al. (2020)](https://arxiv.org/abs/2005.00124) and [Ryabinin et al. (2021)](https://arxiv.org/abs/2103.03239).
 - Read about setting up Mixture-of-Experts training in [this guide](https://learning-at-home.readthedocs.io/en/latest/user/moe.html),

+ 1 - 1
hivemind/optim/experimental/optimizer.py

@@ -64,7 +64,7 @@ class Optimizer(torch.optim.Optimizer):
       Like in PyTorch LR Scheduler, **epoch does not necessarily correspond to a full pass over the training data.**
       At the end of epoch, peers perform synchronous actions such as averaging gradients for a global optimizer update,
       updating the learning rate scheduler or simply averaging parameters (if using local updates).
-      The purpose of this is to ensure that changing the number of peers does not reqire changing hyperparameters.
+      The purpose of this is to ensure that changing the number of peers does not require changing hyperparameters.
       For instance, if the number of peers doubles, they will run all-reduce more frequently to adjust for faster training.
 
     :Configuration guide: This guide will help you set up your first collaborative training run. It covers the most
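
To make the epoch semantics described in the docstring above concrete, here is a minimal usage sketch (not part of this commit) modeled on the hivemind quickstart; the toy model, the `run_id` value, and the batch-size numbers are illustrative assumptions, not values from the source.

```python
import torch
import torch.nn as nn
import hivemind

# Placeholder model and wrapped torch optimizer; any torch module works the same way.
model = nn.Linear(16, 2)
base_opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Start (or join) a DHT; additional peers would pass initial_peers=... here.
dht = hivemind.DHT(start=True)

opt = hivemind.Optimizer(
    dht=dht,
    run_id="demo_run",        # peers with the same run_id train together (illustrative name)
    batch_size_per_step=32,   # samples processed per local step
    target_batch_size=4096,   # global samples that make up one hivemind "epoch"
    optimizer=base_opt,
    matchmaking_time=3.0,
    averaging_timeout=10.0,
    verbose=True,
)

for _ in range(100):
    x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # opt.step() reports local progress; once peers jointly accumulate
    # target_batch_size samples, they run the synchronous epoch-end actions
    # (gradient or parameter averaging, scheduler update) and advance the epoch.
    opt.step()
    opt.zero_grad()
```

Because the epoch is defined by the global `target_batch_size` rather than by dataset passes, adding or removing peers only changes how quickly that threshold is reached, which is the hyperparameter-stability property the docstring change refers to.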