justheuristic
|
284250d00c
change order of grads
|
5 years ago |
justheuristic
|
c5ee3d6041
only return grad w.r.t. inputs
|
5 years ago |
justheuristic
|
05e7c92f3d
unpack tuple
|
5 years ago |
justheuristic
|
5cbcf79b00
list -> tensor
|
5 years ago |
justheuristic
|
c8889bde96
list -> tensor
|
5 years ago |
justheuristic
|
8030c075c9
use lists for gather
|
5 years ago |
justheuristic
|
60af3952c9
flag to remove optimizer
|
5 years ago |
justheuristic
|
80ab75583f
wip: parallel fault-tolerant moe backward pass
|
5 years ago |
justheuristic
|
2b2ddf8280
wip: parallel fault-tolerant moe backward pass
|
5 years ago |
justheuristic
|
6fb99c8746
wip: parallel fault-tolerant moe backward pass
|
5 years ago |
justheuristic
|
c58d08cc06
remove run_and_await_k completely, rename gating_function to moe
|
5 years ago |