From 3fc16a86edd336dbadba26c88bd515b942cced8d Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Mon, 9 Jan 2023 04:30:38 +0800
Subject: [PATCH] Update README.md

---
 README.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 333ef59..72393c4 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,12 @@ RWKV is a RNN with Transformer-level LLM performance, which can also be directly
 So it's combining the best of RNN and transformer - **great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding** (using the final hidden state).
 
+**Download RWKV-4 0.1/0.4/1.5/3/7/14B weights**: https://huggingface.co/BlinkDL
+
+I am training RWKV-4 14B on the Pile: https://wandb.ai/blinkdl/RWKV-v4-Pile
+
+![RWKV-eval2](RWKV-eval2.png)
+
 RWKV-3 1.5B on A40 (tf32) = always 0.015 sec/token, tested using simple pytorch code (no CUDA), GPU utilization 45%, VRAM 7823M
 
 GPT2-XL 1.3B on A40 (tf32) = 0.032 sec/token (for ctxlen 1000), tested using HF, GPU utilization 45% too (interesting), VRAM 9655M
@@ -20,10 +26,6 @@ You are welcome to join the RWKV discord https://discord.gg/bDSBUMeFpc to build
 
 Twitter: https://twitter.com/BlinkDL_AI
 
-**Download RWKV-4 0.1/0.4/1.5/3/7/14B weights**: https://huggingface.co/BlinkDL
-
-I am training RWKV-4 14B on the Pile: https://wandb.ai/blinkdl/RWKV-v4-Pile
-
 ![RWKV-eval](RWKV-eval.png)
 
 All of the trained models will be open-source. Inference is very fast (only matrix-vector multiplications, no matrix-matrix multiplications) even on CPUs, so you can even run a LLM on your phone.
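
Note on the final context line: the "only matrix-vector multiplications" claim is the standard RNN cost argument. Each new token updates a fixed-size hidden state, so per-token work stays constant instead of growing with the context, as attention does. Below is a minimal NumPy sketch of that idea; all names, shapes, and the tanh update are hypothetical placeholders, not RWKV's actual WKV recurrence.

```python
import numpy as np

# Hypothetical sizes and random weights, purely for illustration;
# RWKV's real recurrence (the WKV mechanism) is different, but the
# cost argument below is the same.
d, vocab = 1024, 50257
rng = np.random.default_rng(0)
W_x   = rng.standard_normal((d, d)) * 0.02      # input projection
W_h   = rng.standard_normal((d, d)) * 0.02      # recurrent (state) projection
W_out = rng.standard_normal((vocab, d)) * 0.02  # output head

def step(h, x_emb):
    # One decoding step: three matrix @ vector products, so per-token
    # cost is O(d^2) regardless of context length, whereas attention
    # revisits every previous token.
    h = np.tanh(W_x @ x_emb + W_h @ h)
    return h, W_out @ h

h = np.zeros(d)                     # fixed-size state carried across tokens
for _ in range(8):                  # a few steps with dummy token embeddings
    h, logits = step(h, rng.standard_normal(d))
```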