Artem Chumachenko
|
568f21dc3b
Add customizable input tensors (#445)
|
2 years ago |
Alexander Borzunov
|
8c546d988a
Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)
|
2 years ago |
Alexander Borzunov
|
11f0d992d7
Report inference, forward, and network RPS separately (#358)
|
2 years ago |
Alexander Borzunov
|
1a78638c02
Test that bitsandbytes is not imported when it's not used (#351)
|
2 years ago |
Alexander Borzunov
|
de930918a0
Support loading blocks in 4-bit (QLoRA NF4 format, disabled by default) (#333)
|
2 years ago |
Alexander Borzunov
|
cb3f018f9f
Add LLaMA support (#323)
|
2 years ago |
Max Ryabinin
|
793726b041
Speed up loading blocks using init with meta weights (#285)
|
2 years ago |
Alexander Borzunov
|
702bb5a2c2
CI: Update deprecated actions, don't measure network RPS (#215)
|
2 years ago |
justheuristic
|
ae9e71fe8e
Add local tensor-parallel fwd/bwd (#143)
|
2 years ago |
justheuristic
|
91898c3c90
Switch to speedtest-cli (#157)
|
2 years ago |
justheuristic
|
b04982c1a2
Bump transformers to 4.25.1 (#151)
|
2 years ago |