From a16c6c3a7507e3c6c421cf0b5e7ba35a397227de Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Sat, 10 Sep 2022 21:50:22 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 3c803fd..8d1115b 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ Training speed: RWKV-4 1.5B BF16 ctxlen1024 = 106K tokens/s on 8xA100 40G.
 
 You are welcome to join the RWKV discord https://discord.gg/bDSBUMeFpc to build upon it. We have plenty of potential compute (A100 40Gs) now (thanks to CoreWeave), so if you have interesting ideas I can run them.
 
-I am training RWKV-3 and RWKV-4 on the Pile (https://huggingface.co/BlinkDL):
+I am training RWKV-4 3B and 7B on the Pile (https://huggingface.co/BlinkDL).
 
 ![RWKV-v4-1.5B-Pile](RWKV-v4-1.5B-Pile.png)
@@ -28,10 +28,6 @@ How it works: RWKV gathers information to a number of channels, which are also d
 
 Here are some of my TODOs. Let's work together :)
 
-* Now we have RWKV-4 with DeepSpeedStage2 & FP16 & Better CUDA Kernel (100% faster training than tf32): https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v4. It will be great if someone can take a look to make it support multi-nodes and Stage3.
-
-* Scaling to 6B -> 20B -> 66B (there will be compute when we have the infrastructure). From the L12-D768 L24-D1024 L24-D2048 results, RWKV scales well.
-
 * HuggingFace integration, and optimized CPU & iOS & Android & WASM & WebGL inference. RWKV is a RNN and very friendly for edge devices. Let's make it possible to run a LLM on your phone.
 
 * Test it on bidirectional & MLM tasks, and image & audio & video tokens.
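
Editor's note: the second hunk header above touches the README's "how it works" line, which describes RWKV gathering information into channels that decay at different speeds per token. A minimal toy sketch of that decaying-channel idea follows; all names here are hypothetical illustrations, not the repo's actual implementation.

```python
import numpy as np

def decay_channel_mix(tokens, decay):
    """Toy sketch (hypothetical): each channel accumulates token information
    while decaying at its own per-channel speed as we move to the next token."""
    state = np.zeros(decay.shape[0])
    states = []
    for x in tokens:  # x: vector of per-channel inputs for one token
        state = state * np.exp(-decay) + x  # larger decay -> shorter memory
        states.append(state.copy())
    return np.stack(states)

# Channels with different decay speeds: channel 0 remembers long-range
# context, channel 2 forgets almost immediately.
decay = np.array([0.01, 0.5, 5.0])
tokens = [np.ones(3) for _ in range(4)]
out = decay_channel_mix(tokens, decay)
```

After a few identical tokens, the slow-decay channel has accumulated much more signal than the fast-decay one, which is the intuition behind mixing channels with different time scales.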