Update README.md

main
PENG Bo 4 years ago committed by GitHub
parent 940a67ec27
commit 041227cdad

@@ -39,6 +39,8 @@ See the release here for a 27M params model on enwik8 with 0.72 BPC(dev). Run ru
### Training / Fine-tuning
Colab for fine-tuning: https://colab.research.google.com/drive/1BwceyZczs5hQr1wefmCREonEWhY-zeST
Training: https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v2-RNN
You will be training the "GPT" version because it's parallelizable and faster to train. I find RWKV can extrapolate, so training with ctxLen 768 can work for ctxLen of 1000+. You can fine-tune the model with a longer ctxLen and it can quickly adapt to longer ctxLens.
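
To illustrate what changing ctxLen means for the training data, here is a minimal sketch in plain Python. It is not the repository's data loader: `sample_windows`, its parameters, and the fine-tuning length of 1024 are hypothetical names and values chosen for illustration. It only shows how the same token stream can be cut into next-token-prediction windows of 768 tokens for training and longer windows for fine-tuning.

```python
import random


def sample_windows(tokens, ctx_len, n_samples, seed=0):
    """Return (input, target) pairs of length ctx_len for next-token prediction.

    Hypothetical helper for illustration only; not part of RWKV-LM.
    """
    if len(tokens) < ctx_len + 1:
        raise ValueError("need at least ctx_len + 1 tokens")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        # randint is inclusive on both ends, so the slice never runs past the data.
        start = rng.randint(0, len(tokens) - ctx_len - 1)
        chunk = tokens[start:start + ctx_len + 1]
        # The target is the input shifted left by one token.
        pairs.append((chunk[:-1], chunk[1:]))
    return pairs


tokens = list(range(5000))  # stand-in for a tokenized corpus (assumption)

# Initial training at ctxLen 768, then fine-tuning windows at a longer ctxLen.
train_pairs = sample_windows(tokens, ctx_len=768, n_samples=4)
finetune_pairs = sample_windows(tokens, ctx_len=1024, n_samples=4)
```

Only the window length differs between the two calls; the model and the rest of the training loop are outside the scope of this sketch.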
