RWKV-LM/RWKV-v3
BlinkDL 8556d0fd3f no message 4 years ago
Name       Last commit message                                               Updated
cuda       RWKV-3 (test deeper models (n_layer >= 12) to see the advantage)  4 years ago
src        tips for training, with exponential lr decay                      4 years ago
run.py     RWKV-3 (test deeper models (n_layer >= 12) to see the advantage)  4 years ago
train.py   no message                                                        4 years ago
verify.py  RWKV-3 (test deeper models (n_layer >= 12) to see the advantage)  4 years ago