AIForce/petals @ server-timeouts

Aleksandr Borzunov 5578378202 Lower --max_batch_size and --inference_max_length defaults to 2048	3 years ago
..
__init__.py	05faa0b3c8 add quantization script for cpu	3 years ago
config.json	a798ea04a6 add minimalistic benchmarks	3 years ago
convert_model.py	a2634001e9 Reduce vocabulary size in test model, fix bug in routing when overlapped (#45)	3 years ago
deploy_server.sh	11a424837f integrate mixed-8bit model (#39)	3 years ago
inference_one_block.py	4695071ad2 WIP: make DistributedBloom compliant with HF interface	3 years ago
local_server_config_example.cfg	f60a7dd183 deploy swarm on local & remote machines	3 years ago
remote_server_config_example.cfg	f60a7dd183 deploy swarm on local & remote machines	3 years ago
run_local_servers.sh	11a424837f integrate mixed-8bit model (#39)	3 years ago
run_remote_servers.sh	6573076883 Sequential and parallel forward / backward (#36)	3 years ago
run_server.py	5578378202 Lower --max_batch_size and --inference_max_length defaults to 2048	3 years ago
speed_test.py	e2711a033b Add automated tests (#23)	3 years ago