From a6a14f1bf93a3b205740649f26116e68c76ea344 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Mon, 31 Oct 2022 15:30:35 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index ad4efa1..634490a 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # The RWKV Language Model (and my tricks for LMs)
 
-## RWKV: RNN with Transformer-level Performance
+## RWKV: RNN with Transformer-level LLM Performance
 
-RWKV is a RNN with Transformer-level performance, which can also be directly trained like a GPT transformer (parallelizable). And it's attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly computer the hidden state for the "RNN" mode.
+RWKV is a RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly computer the hidden state for the "RNN" mode.
 
 So it's combining the best of RNN and transformer - **great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding** (using the final hidden state).
 
@@ -20,7 +20,7 @@ You are welcome to join the RWKV discord https://discord.gg/bDSBUMeFpc to build
 
 **Download RWKV-4 0.1/0.4/1.5/3/7B weights**: https://huggingface.co/BlinkDL
 
-I am training RWKV-4 7/14/24B on the Pile: https://wandb.ai/blinkdl/RWKV-v4-Pile
+I am training RWKV-4 7/14B on the Pile: https://wandb.ai/blinkdl/RWKV-v4-Pile
 
 ![RWKV-eval](RWKV-eval.png)
 