justheuristic 2146fb6d0e wip: implement grad wrt logits 5 years ago
__init__.py c58d08cc06 remove run_and_await_k completely, rename gating_function to moe 5 years ago
expert.py 6fb99c8746 wip: parallel fault-tolerant moe backward pass 5 years ago
moe.py 2146fb6d0e wip: implement grad wrt logits 5 years ago