@@ -65,8 +65,8 @@ Basic tutorials:
 Useful tools and advanced guides:
-- [Chatbot web app](http://chat.petals.ml) (connects to Petals via an HTTP/WebSocket endpoint): [source code](https://github.com/borzunov/chat.petals.ml)
-- [Monitor](http://health.petals.ml) for the public swarm: [source code](https://github.com/borzunov/health.petals.ml)
+- [Chatbot web app](https://chat.petals.dev) (connects to Petals via an HTTP/WebSocket endpoint): [source code](https://github.com/borzunov/chat.petals.dev)
+- [Monitor](https://health.petals.dev) for the public swarm: [source code](https://github.com/borzunov/health.petals.dev)
 - Launch your own swarm: [guide](https://github.com/bigscience-workshop/petals/wiki/Launch-your-own-swarm)
 - Run a custom foundation model: [guide](https://github.com/bigscience-workshop/petals/wiki/Run-a-custom-model-with-Petals)
@@ -78,7 +78,7 @@ Learning more:
 ## How does it work?
 - Petals runs large language models like [LLaMA](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) and [BLOOM](https://huggingface.co/bigscience/bloom) **collaboratively** — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning.
-- Single-batch inference runs at up to 6 steps/sec for LLaMA 2 (70B) and ≈ 1 step/sec for BLOOM-176B. This is [up to 10x faster](https://github.com/bigscience-workshop/petals#benchmarks) than offloading, enough for [chatbots](http://chat.petals.ml) and other interactive apps. Parallel inference reaches hundreds of tokens/sec.
+- Single-batch inference runs at up to 6 steps/sec for LLaMA 2 (70B) and ≈ 1 step/sec for BLOOM-176B. This is [up to 10x faster](https://github.com/bigscience-workshop/petals#benchmarks) than offloading, enough for [chatbots](https://chat.petals.dev) and other interactive apps. Parallel inference reaches hundreds of tokens/sec.
 - Beyond classic language model APIs — you can employ any fine-tuning and sampling methods, execute custom paths through the model, or see its hidden states. You get the comforts of an API with the flexibility of PyTorch.
 <p align="center">
@@ -218,5 +218,5 @@ _arXiv preprint arXiv:2209.01188,_ 2022.
 This project is a part of the <a href="https://bigscience.huggingface.co/">BigScience</a> research workshop.
 </p>
 <p align="center">
- <img src="https://petals.ml/bigscience.png" width="150">
+ <img src="https://petals.dev/bigscience.png" width="150">
 </p>
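
The hunks above all follow one pattern: every `petals.ml` host becomes `petals.dev`, plain `http://` links to those hosts are upgraded to `https://`, and repository names that embed the old domain are renamed to match. A minimal sketch of that substitution follows, assuming the rename is purely mechanical. The function name `migrate_links` and both regular expressions are illustrative and are not part of this change.

```python
import re

# Hypothetical helper, not part of the diff: it mirrors the edits shown above.
# Matches an http(s) URL whose host is petals.ml or any subdomain of it.
_HOST_RE = re.compile(r"https?://((?:[a-z0-9-]+\.)*)petals\.ml\b")
# Matches remaining mentions of the old domain, e.g. repository names
# such as "chat.petals.ml" inside a GitHub URL.
_NAME_RE = re.compile(r"\bpetals\.ml\b")


def migrate_links(text: str) -> str:
    """Rewrite petals.ml references to petals.dev, forcing https on hosts."""
    # First pass: fix the scheme and the domain of direct links.
    text = _HOST_RE.sub(r"https://\1petals.dev", text)
    # Second pass: rename any leftover occurrence of the old domain.
    return _NAME_RE.sub("petals.dev", text)


if __name__ == "__main__":
    old = (
        "- [Monitor](http://health.petals.ml) for the public swarm: "
        "[source code](https://github.com/borzunov/health.petals.ml)"
    )
    print(migrate_links(old))
```

Running the two passes in this order keeps the function idempotent: text that already uses `petals.dev` is left unchanged, so it could be applied to a file more than once without side effects.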