Michael Diskin
|
86f3c0dd0d
Support auxiliary peers in CollaborativeOptimizer (#279)
|
4 years ago |
Aleksandr Borzunov
|
020c068344
Log collaboration step to Wandb, store metrics only if peer is synchronized (#267)
|
4 years ago |
Aleksandr Borzunov
|
9bb775fe04
Log correct loss in examples/albert/run_first_peer.py (#265)
|
4 years ago |
Alexey Bukhtiyarov
|
01103cf991
Add state checkpointing and uploading in coordinator (#241)
|
4 years ago |
Aleksandr Borzunov
|
3bde6188fe
Protect training progress and metrics with signatures and DHT schema validation (#250)
|
4 years ago |
Michael Diskin
|
2314e7ebd5
fix metrics (#240)
|
4 years ago |
Alexey Bukhtiyarov
|
27ea94e3f9
Add example for collaborative ALBERT training (#226)
|
4 years ago |