From c2ad6e9d8df50e144145b13307b25a8f3dbf87ee Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Wed, 11 May 2022 03:48:12 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 14ea5ee..d86f34d 100644
--- a/README.md
+++ b/README.md
@@ -6,12 +6,12 @@ RWKV v2 is a RNN with Transformer-level performance, which can also be directly
 So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, and fast training. Moreover you get a free sentence embedding.
 
-I am training it on the Pile: https://github.com/BlinkDL/RWKV-v2-RNN-Pile
-
-It might reach GPT-Neo performance within 100B tokens:
+I am training it on the Pile (https://github.com/BlinkDL/RWKV-v2-RNN-Pile) and it might reach GPT-Neo performance within 100B tokens:
 
 ![RWKV-v2-430M-Pile](RWKV-v2-430M-Pile.png)
 
+All of the trained models will be open-source. Inference is very fast (only matrix-vector multiplications, no matrix-matrix multiplications) even on CPUs, and I believe you can run a 1B params RWKV-v2-RNN with reasonable speed on your phone.
+
 See the release for a 27M params model on enwik8 with 0.72 BPC(dev).
 
 ## How it works
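The added paragraph claims inference needs only matrix-vector multiplications. A minimal NumPy sketch of why any RNN-style decoder has this property (illustrative dimensions and a generic recurrent cell, not RWKV's actual architecture):

```python
import numpy as np

# Hypothetical toy RNN cell, not RWKV code: each decoding step consumes
# one token vector plus a fixed-size hidden state, so every weight
# application is a matrix-vector product (cost O(d^2) per token,
# independent of sequence length). A transformer decoding the same
# sequence would attend over all previous tokens instead.

d = 8  # hidden size (illustrative)
rng = np.random.default_rng(0)
W_in = rng.standard_normal((d, d))
W_state = rng.standard_normal((d, d))

def rnn_step(state, x):
    # Two matrix-vector multiplications per token; no matrix-matrix ops.
    return np.tanh(W_in @ x + W_state @ state)

state = np.zeros(d)
for _ in range(5):  # decode 5 tokens
    x = rng.standard_normal(d)
    state = rnn_step(state, x)

print(state.shape)
```

Because the per-token cost is constant and the working set is just the weights plus one state vector, this kind of loop runs acceptably even on CPUs, which is the basis of the claim in the patch.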