Alexander Borzunov
|
f3fafd14a4
Bump version to 2.0.1 (#411)
|
2 years ago |
Alexander Borzunov
|
fd19c21859
Update --update_period and --expiration defaults (#410)
|
2 years ago |
Alexander Borzunov
|
ffb20b585c
Update commands for hosting Llama 2 in readme (#409)
|
2 years ago |
Alexander Borzunov
|
48c6b6d963
Update README.md (#407)
|
2 years ago |
Alexander Borzunov
|
c153cba1fa
Add Llama 2, WSL instructions to readme (#406)
|
2 years ago |
justheuristic
|
5af04524dd
Split long sequences into chunks (#403)
|
2 years ago |
Alexander Borzunov
|
30b94ef18b
If speedtest fails, assume network speed of 100 Mbit/s (#404)
|
2 years ago |
Alexander Borzunov
|
8666653cf5
Fix routing through relay, default network RPS, --token, logging, readme (#399)
|
2 years ago |
Alexander Borzunov
|
eb0664b993
Support Python 3.11 (#393)
|
2 years ago |
Alexander Borzunov
|
6e4ebb94d2
Fix deadlocks in MemoryCache (#396)
|
2 years ago |
Alexander Borzunov
|
b6b3ae964f
Fix --attn_cache_tokens default (#392)
|
2 years ago |
Alexander Borzunov
|
d49d9ad0cf
Bump version to 2.0.0.post3 (#391)
|
2 years ago |
justheuristic
|
e51e84631d
Update to petals.dev (#390)
|
2 years ago |
Aleksandr Borzunov
|
ddcda02b06
Hardcode IPs until DNS issues get resolved
|
2 years ago |
Alexander Borzunov
|
b1ff8bdd6c
Bump version to 2.0.0.post1 (#384)
|
2 years ago |
Alexander Borzunov
|
e9a20e7e53
Require accelerate>=0.20.3 as transformers do (#383)
|
2 years ago |
Alexander Borzunov
|
057a2fb5de
Support Llama 2 (#379)
|
2 years ago |
Alexander Borzunov
|
3218534745
Fix --token arg (#378)
|
2 years ago |
justheuristic
|
398a384075
Inherit bitsandbytes compute dtype correctly (override peft quirk) (#377)
|
2 years ago |
justheuristic
|
5a8de2f1f8
Fix handler memory leak, get rid of mp.Manager (#373)
|
2 years ago |
Alexander Borzunov
|
895327a0ae
Fix readme code example, require Python < 3.11 until supported (#374)
|
2 years ago |
Alexander Borzunov
|
c735dd7ba3
Update transformers to 4.31.0 and peft to 0.4.0 (#371)
|
2 years ago |
justheuristic
|
1ab35c2826
Typo in inference_session.py
|
2 years ago |
Alexander Borzunov
|
a6fdfc0556
Fix AssertionError on rebalancing (#370)
|
2 years ago |
Alexander Borzunov
|
f97582fb5f
Require transformers < 4.31.0 until we're compatible (#369)
|
2 years ago |
Alexander Borzunov
|
3b300c32e4
Update readme to show new models (#365)
|
2 years ago |
Alexander Borzunov
|
62d9ed5ce7
Implement shortest-path routing for inference (#362)
|
2 years ago |
Ikko Eltociear Ashimine
|
fd30f7ce10
Fix typo in generation_algorithms.py (#364)
|
2 years ago |
Alexander Borzunov
|
11f0d992d7
Report inference, forward, and network RPS separately (#358)
|
2 years ago |
Alexander Borzunov
|
9517dd1e3d
Update readme and "Getting started" link (#360)
|
2 years ago |