AIForce/hivemind @ 3c964b5c4f5209475f77b884c6d7bf1cf6d47909

No description

UARTman 3c964b5c4f Merge pull request #16 from learning-at-home/update_readme-restyled	5 years ago
.circleci	6a4a7e8831 Change deps - testing	5 years ago
tesseract	fb4ef759ff Initial commit	5 years ago
tests	0adadf649f add not failed assertion	5 years ago
.gitignore	c43fabcddb Add .gitignore	5 years ago
CONTRIBUTING.md	18bca6731e explicit standards	5 years ago
LICENSE	f386fb4d42 Create LICENSE	5 years ago
README.md	ec7c9165ac Restyled by prettier-markdown	5 years ago
requirements.txt	6a4a7e8831 Change deps - testing	5 years ago
setup.py	fb4ef759ff Initial commit	5 years ago

Tesseract

Distributed training of large neural networks across volunteer computers.

[WIP] - this branch is being updated. If you're interested in the supplementary code for the Learning@home paper, you can find it at https://github.com/mryab/learning-at-home .

What do I need to run it?

One or several computers, each equipped with at least one GPU
Each computer should have at least two open ports (if not, consider SSH port forwarding)
A popular 64-bit (x64) Linux distribution
- Tested on Ubuntu 16.04; should work fine on any popular 64-bit Linux distribution and even macOS.
- Running natively on Windows is not supported; please use a VM or Docker.
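
Before starting anything, it can help to confirm that a port on another machine is actually reachable. The following is a minimal sketch using only the Python standard library; `is_port_reachable` is a hypothetical helper and not part of this repository. Note that it only reports success if some process is already listening on the target port.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for illustration; not part of the tesseract package.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If the check fails for a machine behind a firewall, SSH port forwarding as mentioned above is one way to expose the port.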

How do I run it?

Currently, there is no easy way to run it. There are some tests (see the CI logs and/or config), and we want to expand them, but if you want to do something more complex, you're on your own.

tesseract quick tour

Trainer process:

RemoteExpert (tesseract/client/remote_expert.py) behaves like a PyTorch module with autograd support, but actually sends requests to a remote runtime.
GatingFunction (tesseract/client/gating_function.py) finds the best experts for a given input and either returns them as RemoteExpert instances or applies them right away.
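
To illustrate what "finding the best experts" can mean, here is a minimal pure-Python sketch of top-k expert selection. It is not the code in tesseract/client/gating_function.py: the dot-product scoring rule, the `select_top_k_experts` name, and the expert identifiers are all assumptions made for this example.

```python
import heapq
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_top_k_experts(
    x: Sequence[float],
    expert_keys: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the ids of the k experts whose key vectors score highest on x.

    Assumption: each expert is described by a key vector and scored by a
    dot product with the input. The real gating function may differ.
    """
    scores = {uid: dot(x, key) for uid, key in expert_keys.items()}
    # nlargest returns the ids ordered from the highest score to the lowest.
    return heapq.nlargest(k, scores, key=scores.get)


# Hypothetical expert ids and key vectors.
keys = {"expert.0": [1.0, 0.0], "expert.1": [0.0, 1.0], "expert.2": [1.0, 1.0]}
chosen = select_top_k_experts([2.0, 1.0], keys, k=2)
```

In a trainer, each selected id would then be wrapped in something like a RemoteExpert so that the call is forwarded to the runtime hosting that expert.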

Runtime process:

TesseractRuntime (tesseract/runtime/__init__.py) aggregates batches and performs inference/training of experts according to their priority.
TesseractServer (tesseract/server/__init__.py) wraps the runtime and periodically uploads experts into TesseractNetwork.
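
The idea of aggregating requests into batches and serving them by priority can be sketched as follows. This is a simplified, single-threaded illustration and not the actual TesseractRuntime: the `TaskPool` class, the `run_until_empty` function, and the choice of "oldest pending request first" as the priority are assumptions made for this example.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple


class TaskPool:
    """Hypothetical queue of pending inputs for a single expert."""

    def __init__(self, process_batch: Callable[[List[float]], List[float]],
                 max_batch_size: int = 4) -> None:
        self.process_batch = process_batch
        self.max_batch_size = max_batch_size
        self.pending: Deque[Tuple[float, float]] = deque()  # (arrival_time, input)

    def submit(self, value: float, arrival_time: Optional[float] = None) -> None:
        stamp = time.monotonic() if arrival_time is None else arrival_time
        self.pending.append((stamp, value))

    @property
    def priority(self) -> float:
        """Arrival time of the oldest pending task; lower means more urgent."""
        return self.pending[0][0] if self.pending else float("inf")

    def run_batch(self) -> List[float]:
        """Pop up to max_batch_size inputs and process them together."""
        n = min(self.max_batch_size, len(self.pending))
        batch = [self.pending.popleft()[1] for _ in range(n)]
        return self.process_batch(batch)


def run_until_empty(pools: Dict[str, TaskPool]) -> List[Tuple[str, List[float]]]:
    """Repeatedly serve the most urgent pool until no tasks remain."""
    log = []
    while any(pool.pending for pool in pools.values()):
        name = min(pools, key=lambda n: pools[n].priority)
        log.append((name, pools[name].run_batch()))
    return log


pools = {
    "doubler": TaskPool(lambda batch: [2 * v for v in batch]),
    "negator": TaskPool(lambda batch: [-v for v in batch]),
}
pools["doubler"].submit(1.0, arrival_time=0.0)
pools["negator"].submit(10.0, arrival_time=1.0)
pools["doubler"].submit(2.0, arrival_time=2.0)
schedule = run_until_empty(pools)
```

Here both inputs for "doubler" are processed in one batch because that pool holds the oldest request; "negator" is served next.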

DHT:

TesseractNetwork (tesseract/network/__init__.py) is a node in a Kademlia-based DHT that stores metadata used by the trainer and runtime.
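
For readers unfamiliar with Kademlia, its core idea is that node and key identifiers live in the same ID space and "closeness" is measured by XOR. The sketch below shows that metric in isolation; it is generic Kademlia, not the TesseractNetwork implementation, and the helper names (`node_id`, `xor_distance`, `closest_nodes`) are hypothetical.

```python
import hashlib
from typing import Iterable, List


def node_id(name: str) -> int:
    """Derive a 160-bit integer id from a string using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two ids: bitwise XOR read as an integer."""
    return a ^ b


def closest_nodes(key_id: int, node_ids: Iterable[int], k: int) -> List[int]:
    """Return the k node ids closest to key_id under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key_id))[:k]


# Small 4-bit ids make the distances easy to check by hand.
nearest = closest_nodes(0b1010, [0b1000, 0b0010, 0b1011, 0b0111], k=2)
```

In a Kademlia DHT, a value is stored on the nodes whose ids are closest to the key's id, so a lookup for expert metadata would be routed toward those nodes.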

Limitations

DHT:

DHT functionality is severely limited by its inability to traverse NAT.
Because of this, all features that require the DHT are in a deep pre-alpha state and cannot be used without special setup.
