justheuristic | 94b9db0d37 | Fix random freezes in averager.step, improve error handling (#254) | 4 years ago
Alexey Bukhtiyarov | 01103cf991 | Add state checkpointing and uploading in coordinator (#241) | 4 years ago
Aleksandr Borzunov | 3bde6188fe | Protect training progress and metrics with signatures and DHT schema validation (#250) | 4 years ago
Aleksandr Borzunov | 08ee017f0f | Add nltk to ALBERT example's requirements (#251) | 4 years ago
Roman Zhytar | e833a7efb9 | Decentralized adaptive optimizers (#243) | 4 years ago
Aleksandr Borzunov | 18add2c04b | Implement combining validators (#249) | 4 years ago
Max Ryabinin | 0a1fdb172f | Fix incorrect data types/values in RemoteSwitchMixtureOfExperts (#246) | 4 years ago
Max Ryabinin | dfbc401196 | Add Dockerfile, refactor tests (#245) | 4 years ago
justheuristic | ddb5389e66 | Fix server hanging in certain cases when connection is lost (#247) | 4 years ago
Aleksandr Borzunov | a3feafa907 | Add DHT schema validator (#227) | 4 years ago
Michael Diskin | 2314e7ebd5 | fix metrics (#240) | 4 years ago
Alexey Bukhtiyarov | 27ea94e3f9 | Add example for collaborative ALBERT training (#226) | 4 years ago
Max Ryabinin | 62652e1717 | Add Switch Transformers-like RemoteMixtureOfExperts (#228) | 4 years ago
justheuristic | 3d6a242e30 | Ensure version-consistent result rounding in load_balance_peers (#230) | 4 years ago
Roman Zhytar | 8c3bd93e87 | Statistics averaging (#229) | 4 years ago
Vsevolod-pl | 91d17a4ebc | Delta gradients transmission (#225) | 4 years ago
romakail | ca5c7610ae | Add tool for custom user experts (#189) | 4 years ago
justheuristic | 32b87bf3fe | Reset gradient buffers when synchronizing with peers (#222) | 4 years ago
justheuristic | b906ae94ed | better zero_grad behavior in CollaborativeOptimizer (#221) | 4 years ago
justheuristic | 2359906253 | Add gradient buffers to CollaborativeOptimizer (#220) | 4 years ago
mponty | 0080028e25 | Add uniform compression (#202) | 4 years ago
ploshkin | 9d2a40714c | Prevent DecentralizedSGD from accidentally skipping a fraction of training batches (#218) | 4 years ago
Max Ryabinin | 916c3db52d | Move compression-related code to hivemind.utils.compression (#213) | 4 years ago
Alexey Bukhtiyarov | 7bb6565674 | Add CollaborativeOptimizer, TrainingAverager (#215) | 4 years ago
justheuristic | 053c7c7d13 | Disentangle DecentralizedAverager components, add weights (#217) | 4 years ago
Max Ryabinin | ca6d87a837 | Replace FeedforwardBlock with a correct implementation (#211) | 4 years ago
Max Ryabinin | 1d364b7c32 | Convert SerializerBase to an abstract class (#212) | 4 years ago
Max Ryabinin | 6128cbbd51 | Add gradient clipping support to ExpertBackend (#214) | 4 years ago
justheuristic | ca3aadb8f8 | Implement basic decentralized optimizers (#210) | 4 years ago
Max Ryabinin | 6f8f192150 | Improve Runtime exception handling (#207) | 4 years ago