Aleksandr Borzunov
|
35c324a1f5
Require hivemind>=1.1.4 with p2pd v0.3.13
|
2 years ago |
Alexander Borzunov
|
fc6722576b
Choose --num_blocks for bigscience/bloom-petals automatically (#119)
|
2 years ago |
Alexander Borzunov
|
f72c220404
Suppress quantization warning and fix dtype defaults in compute benchmark (#117)
|
2 years ago |
Alexander Borzunov
|
643a054170
Make server use smart defaults (#115)
|
2 years ago |
justheuristic
|
9e11f73242
Fix tile size on ampere (#116)
|
2 years ago |
justheuristic
|
617d70f7dc
Support --load_in_8bit on pre-Turing GPUs (#113)
|
2 years ago |
Alexander Borzunov
|
1ea44b0d3c
Measure throughput for different configs, devices, and dtypes separately (#114)
|
2 years ago |
justheuristic
|
01838f9a99
Fix Linear8bitlt state config, update tests (#112)
|
2 years ago |
Aleksandr Borzunov
|
96033de921
Fix script for running servers robustly
|
2 years ago |
Aleksandr Borzunov
|
85cf32d2a4
Add script to run servers robustly
|
2 years ago |
justheuristic
|
088713912d
Patch Linear8bit to enable CxB backward (#111)
|
2 years ago |
justheuristic
|
8dc0f513ba
Hotfix span selection (#110)
|
2 years ago |
justheuristic
|
a2066a4096
Optimize RemoteSequenceManager (#106)
|
2 years ago |
Artem Chumachenko
|
7d859a947b
Expose request_timeout to DistributedBloomConfig (#105)
|
2 years ago |
Max Ryabinin
|
9faf08b898
Remove unused imports, add missing arguments to docstrings (#108)
|
2 years ago |
justheuristic
|
b3115dac58
Update throughput.py
|
2 years ago |
Max Ryabinin
|
5c2990e1b5
Add Dockerfile (#82)
|
2 years ago |
Alexander Borzunov
|
0a1cd3b9ba
Fix ptune with `low_cpu_mem_usage=True` (as in Colab) (#103)
|
2 years ago |
Alexander Borzunov
|
43ac6016ac
Fix dtypes in backend schemas (#99)
|
2 years ago |
Alexander Borzunov
|
7bd5916744
Make Petals a pip-installable package (attempt 2) (#102)
|
2 years ago |
Alexander Borzunov
|
0c3781a89c
Shorten bullet points in readme
|
2 years ago |
Alexander Borzunov
|
ab41223b17
Fix dtype- and device-related client issues (#98)
|
2 years ago |
Alexander Borzunov
|
c6e1b5a8e5
Add various server timeouts, lower --max_batch_size and --inference_max_length defaults (#97)
|
2 years ago |
Alexander Borzunov
|
d8ef09146e
Improve server's logging (#96)
|
2 years ago |
Artem Chumachenko
|
fdb3583a8c
Add Beam Search decoding algorithm (#87)
|
2 years ago |
Alexander Borzunov
|
fef7257fe0
Try to fix protobuf versions once again (#95)
|
2 years ago |
Aleksandr Borzunov
|
1b51703444
Revert protobuf version change
|
2 years ago |
Alexander Borzunov
|
b26b0b7121
Require hivemind with fixed compression and protobuf working on Colab (#94)
|
2 years ago |
Alexander Borzunov
|
8a73b41a42
Make ServerState announcements work better (#93)
|
2 years ago |
Alexander Borzunov
|
dc71574a63
Use public swarm by default (#92)
|
2 years ago |