From 1f189a4034c7ff1d0c145ef556327de1fa90da51 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Tue, 22 Mar 2022 04:55:04 +0800
Subject: [PATCH] Update README.md

---
 README.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/README.md b/README.md
index cd0b010..765f78f 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,21 @@
 # The RWKV Language Model
 
+## v2
+
+RWKV v2 is an RNN that can also be trained directly, like a GPT transformer.
+
+You only need x_t, a_t, and b_t at position t to compute the vectors for position t+1.
+
+Hence it can be 100x faster than a GPT at inference, and 100x more VRAM-friendly.
+
+I AM STILL TRAINING A LM TO TEST ITS CONVERGENCE.
+
+The model:
+
+![RWKV-v2-RNN](RWKV-v2-RNN.png)
+
+## v1
+
 We propose the RWKV language model, with alternating time-mix and channel-mix layers:
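To illustrate the constant-state recurrence the patch describes, here is a minimal single-channel sketch. The names are assumptions for illustration: `w` is a decay rate, and `k`, `v` are a per-position key and value; the real model applies learned, per-channel versions of these quantities. The point is that the recurrent form needs only the previous `a`, `b` state, yet it computes the same quantity as a GPT-style parallel pass over the whole history.

```python
import numpy as np

def rwkv_step(a, b, k, v, w):
    """One RNN step: decay the running sums a and b, absorb the new
    (key, value) pair, and return the output at this position.
    Only the previous a, b are needed -- no attention over history."""
    a = np.exp(-w) * a + np.exp(k) * v  # decayed weighted sum of values
    b = np.exp(-w) * b + np.exp(k)      # decayed sum of weights
    return a / b, a, b

def rwkv_direct(ks, vs, w, t):
    """Parallel (GPT-style) form of the same quantity: the output at
    position t is a decay-weighted average over positions 0..t."""
    i = np.arange(t + 1)
    weights = np.exp(ks[:t + 1] - w * (t - i))
    return (weights * vs[:t + 1]).sum() / weights.sum()
```

Running `rwkv_step` over a sequence reproduces `rwkv_direct` exactly, which is the sense in which the same network can be trained in parallel like a GPT and then run as an RNN with O(1) state per position.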