Update hivemind/optim/experimental/optimizer.py

Co-authored-by: Alexander Borzunov <hxrussia@gmail.com>
justheuristic · 3 years ago
Commit 5a8fd7d8f3
1 file changed, 1 insertion(+), 1 deletion(-)

hivemind/optim/experimental/optimizer.py  (+1, -1)

@@ -48,7 +48,7 @@ class Optimizer(torch.optim.Optimizer):
     - after accumulating the target batch size, all-reduce gradients with peers and perform optimizer step,
     - if, for any reason, your peer lags behind the rest of the swarm, it will load state from up-to-date peers.
 
-    :note: Hivemind.Optimizer can be used the same way any other pytorch optimizer, but there is one limitation:
+    :note: hivemind.Optimizer can be used the same way any other pytorch optimizer, but there is one limitation:
       learning rate schedulers, curriculum and other time-dependent features should use opt.global_step (and not the
       number of local forward-backward cycles). This is because any device can join midway through training, when
       other peers have already made some progress and changed their learning rate accordingly.
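
For context, the pattern the note recommends looks roughly like the sketch below. It is not taken from the repository: the toy model, run_id, batch sizes and the warmup knob are made up for illustration, and the constructor arguments (dht, run_id, batch_size_per_step, target_batch_size, optimizer) follow the hivemind quickstart and may differ between versions.

import torch
import hivemind

# Hypothetical toy model and data, just to keep the sketch self-contained.
model = torch.nn.Linear(16, 1)

# Start a DHT node; in a real run this peer would connect to others sharing the same run_id.
dht = hivemind.DHT(start=True)

opt = hivemind.Optimizer(
    dht=dht,
    run_id="demo_run",                 # hypothetical identifier of the collaborative run
    batch_size_per_step=32,            # samples each local opt.step() contributes
    target_batch_size=4096,            # swarm-wide batch size that triggers the all-reduce
    optimizer=torch.optim.SGD(model.parameters(), lr=0.1),
)

for local_step in range(100):
    x, y = torch.randn(32, 16), torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()        # averages gradients with peers once the swarm reaches target_batch_size
    opt.zero_grad()

    # Per the note above: time-dependent features key off opt.global_step, not local_step,
    # because a peer that joins mid-run inherits the swarm's current progress.
    warmup_factor = min(1.0, opt.global_step / 1_000)  # illustrative warmup / curriculum knob

Whatever mechanism actually applies the schedule, the point of the docstring is that its clock is opt.global_step rather than the local iteration counter.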