RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable). So it combines the best of the RNN and the transformer: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.

RWKV-LM

The RWKV Language Model
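
To make the "RNN with parallelizable GPT-style training" claim concrete, here is a minimal sketch of the WKV recurrence in RNN (inference) mode, following the notation of the RWKV paper (`w`, `u`, `k`, `v`). This is an illustrative assumption-based sketch, not code from this repository, and it ignores the numerical-stability tricks the real kernels use:

```python
# Minimal sketch of the RWKV "WKV" recurrence in RNN (inference) mode.
# Hypothetical illustration; tensor names follow the RWKV paper's notation.
import numpy as np

def wkv_recurrent(w, u, k, v):
    """Compute WKV token by token with O(1) state per channel.

    w: (C,) per-channel decay (positive; applied as e^{-w} per step)
    u: (C,) per-channel "bonus" weight for the current token
    k: (T, C) keys, v: (T, C) values
    returns: (T, C) outputs
    """
    T, C = k.shape
    num = np.zeros(C)          # running sum of decayed e^{k_i} * v_i
    den = np.zeros(C)          # running sum of decayed e^{k_i}
    out = np.empty((T, C))
    decay = np.exp(-w)
    for t in range(T):
        # the current token gets an extra bonus weight e^{u + k_t}
        cur = np.exp(u + k[t])
        out[t] = (num + cur * v[t]) / (den + cur)
        # decay the state, then fold the current token into it
        ek = np.exp(k[t])
        num = decay * num + ek * v[t]
        den = decay * den + ek
    return out

# Toy usage: 5 tokens, 4 channels
rng = np.random.default_rng(0)
T, C = 5, 4
y = wkv_recurrent(w=np.ones(C), u=np.zeros(C),
                  k=rng.normal(size=(T, C)), v=rng.normal(size=(T, C)))
print(y.shape)  # (5, 4)
```

The same quantity can also be computed for all `t` at once (like attention) during training, which is what makes GPT-style parallel training possible, while inference only needs the tiny `(num, den)` state per channel instead of a growing KV cache. Production kernels additionally carry a running maximum exponent so the `exp` terms never overflow.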