Alexander Borzunov
|
e8fac92e59
Allow .generate() to reuse existing inference session (#132)
|
2 years ago |
Alexander Borzunov
|
1fe3716589
Don't ban servers in case of client-caused handler errors (#134)
|
2 years ago |
Alexander Borzunov
|
66f1799d32
Set default --step_timeout to 5 min (#133)
|
2 years ago |
Alexander Borzunov
|
b873d92ffa
Update README.md
|
2 years ago |
Alexander Borzunov
|
5d5d2666b8
Mention parallel inference
|
2 years ago |
Alexander Borzunov
|
955eae30b3
Mention 1 sec/token explicitly
|
2 years ago |
Alexander Borzunov
|
33c210b973
Update Colab notebook
|
2 years ago |
Alexander Borzunov
|
f56edaa13f
Fix inference and rpc_info() fault tolerance (#131)
|
2 years ago |
justheuristic
|
79a4308992
Clear trigger before engaging in update (#130)
|
2 years ago |
Alexander Borzunov
|
b8e1c1b7f5
Revert to hivemind==1.1.3 for stability (#129)
|
2 years ago |
justheuristic
|
68c85e7492
Avoid synchronous updates, ban peers based on request outcome (#127)
|
2 years ago |
Alexander Borzunov
|
9dbf5e2e6f
Set dht.num_workers = n_layer, update_period = 150, expiration = 300 (#125)
|
2 years ago |
Max Ryabinin
|
3ca8b4f082
Fix typos with codespell (#126)
|
2 years ago |
justheuristic
|
8491ed2bd3
Add checks for forward() inputs on the client side (#123)
|
2 years ago |
Max Ryabinin
|
055f85b83e
Call block.load_state_dict only once (#124)
|
2 years ago |
Artem Chumachenko
|
0855aa7347
Update notebooks to use full BLOOM-176B (#104)
|
2 years ago |
Max Ryabinin
|
4ffb4d83c7
Remove "-r" when installing Petals in examples (#122)
|
2 years ago |
Alexander Borzunov
|
d29ef70c85
Update README.md
|
2 years ago |
Alexander Borzunov
|
1d9aa77697
Update README.md
|
2 years ago |
Alexander Borzunov
|
da36470a4b
Update README.md
|
2 years ago |
Alexander Borzunov
|
81b94df14b
Rework readme, move code example to the top, link draft of Colab (#118)
|
2 years ago |
Alexander Borzunov
|
893987ebf8
Require hivemind==1.1.4 with p2pd v0.3.13 (#121)
|
2 years ago |
Alexander Borzunov
|
fc6722576b
Choose --num_blocks for bigscience/bloom-petals automatically (#119)
|
2 years ago |
Alexander Borzunov
|
f72c220404
Suppress quantization warning and fix dtype defaults in compute benchmark (#117)
|
2 years ago |
Alexander Borzunov
|
643a054170
Make server use smart defaults (#115)
|
2 years ago |
justheuristic
|
9e11f73242
Fix tile size on ampere (#116)
|
2 years ago |
justheuristic
|
617d70f7dc
Support --load_in_8bit on pre-Turing GPUs (#113)
|
2 years ago |
Alexander Borzunov
|
1ea44b0d3c
Measure throughput for different configs, devices, and dtypes separately (#114)
|
2 years ago |
justheuristic
|
01838f9a99
Fix Linear8bitlt state config, update tests (#112)
|
2 years ago |
Aleksandr Borzunov
|
96033de921
Fix script for running servers robustly
|
2 years ago |