RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable). So it combines the best of the RNN and the transformer: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.

RWKV-LM

The RWKV Language Model
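
To make the "RNN with parallelizable GPT-style training" claim concrete, here is a minimal sketch of the WKV recurrence in RNN (inference) mode, following the notation of the RWKV paper (`w`, `u`, `k`, `v`). This is an illustrative assumption-based sketch, not code from this repository, and it ignores the numerical-stability tricks the real kernels use:

```python
# Minimal sketch of the RWKV "WKV" recurrence in RNN (inference) mode.
# Hypothetical illustration; tensor names follow the RWKV paper's notation.
import numpy as np

def wkv_recurrent(w, u, k, v):
    """Compute WKV token by token with O(1) state per channel.

    w: (C,) per-channel decay (positive; applied as e^{-w} per step)
    u: (C,) per-channel "bonus" weight for the current token
    k: (T, C) keys, v: (T, C) values
    returns: (T, C) outputs
    """
    T, C = k.shape
    num = np.zeros(C)          # running sum of decayed e^{k_i} * v_i
    den = np.zeros(C)          # running sum of decayed e^{k_i}
    out = np.empty((T, C))
    decay = np.exp(-w)
    for t in range(T):
        # the current token gets an extra bonus weight e^{u + k_t}
        cur = np.exp(u + k[t])
        out[t] = (num + cur * v[t]) / (den + cur)
        # decay the state, then fold the current token into it
        ek = np.exp(k[t])
        num = decay * num + ek * v[t]
        den = decay * den + ek
    return out

# Toy usage: 5 tokens, 4 channels
rng = np.random.default_rng(0)
T, C = 5, 4
y = wkv_recurrent(w=np.ones(C), u=np.zeros(C),
                  k=rng.normal(size=(T, C)), v=rng.normal(size=(T, C)))
print(y.shape)  # (5, 4)
```

The same quantity can also be computed for all `t` at once (like attention) during training, which is what makes GPT-style parallel training possible, while inference only needs the tiny `(num, den)` state per channel instead of a growing KV cache. Production kernels additionally carry a running maximum exponent so the `exp` terms never overflow.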