I need a better CUDA kernel to (1) pull off maxK so there's no need to clamp k to 60.
Removing the maxK limitation will also make it easy to clean the state of a KV / K channel, by using a huge K: the exp(K) of that token dominates both the KV accumulator and the K normalizer, effectively resetting the channel (see the sketch below).
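A minimal sketch of why a huge K behaves as a reset, using the RWKV-2 style exp(k)-weighted accumulators (the variable names and the decay constant below are my own, for illustration, not from the repo):

```
import torch

# Toy single-channel view of the RWKV-2 state: a accumulates exp(k)*v
# (the "KV" part), b accumulates exp(k) (the "K" part), and the channel
# reads a/b.
w = 0.9  # per-channel time-decay factor, chosen arbitrarily here

def step(a, b, k, v):
    # decay the old state, then add the new token's contribution
    a = w * a + torch.exp(k) * v
    b = w * b + torch.exp(k)
    return a, b

a = torch.tensor(0.0)
b = torch.tensor(0.0)
for k, v in [(1.0, 0.3), (2.0, -0.7), (0.5, 0.9)]:
    a, b = step(a, b, torch.tensor(k), torch.tensor(v))
print((a / b).item())  # a blend of all past values

# A huge K wipes the history: exp(50) dwarfs everything accumulated so
# far, so a/b collapses to the newest v and the channel is reset.
a, b = step(a, b, torch.tensor(50.0), torch.tensor(0.0))
print((a / b).item())  # ~0.0, i.e. the freshly injected value
```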
Namely, this is my plan:

## Explaining the code for RWKV v2+ GPT mode
Note: this is for the latest v2+ model.
```
return rkv
```
The self.value and self.receptance matrices are both initialized to zero, so rkv is exactly zero at initialization.
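For context, here is a minimal runnable sketch of the channel-mixing block that `return rkv` belongs to. The layer names follow the text above; the time-shift arrangement, hidden size, and squared-ReLU detail are my assumptions about the v2+ code, not a verbatim copy:

```
import torch
import torch.nn as nn

class ChannelMixSketch(nn.Module):
    def __init__(self, n_embd, hidden_sz):
        super().__init__()
        # shift the sequence one step back so each token mixes with its predecessor
        self.time_shift = nn.ZeroPad2d((0, 0, 1, -1))
        self.key = nn.Linear(n_embd, hidden_sz, bias=False)
        self.value = nn.Linear(hidden_sz, n_embd, bias=False)
        self.receptance = nn.Linear(n_embd, n_embd, bias=False)
        # zero init, as noted above: the whole block outputs zero at first
        nn.init.zeros_(self.value.weight)
        nn.init.zeros_(self.receptance.weight)

    def forward(self, x):                # x: (B, T, C)
        xx = self.time_shift(x)          # previous token's features
        k = self.key(xx)
        k = torch.square(torch.relu(k))  # squared-ReLU nonlinearity
        kv = self.value(k)
        # sigmoid(receptance) acts as a per-channel accept/forget gate on kv
        rkv = torch.sigmoid(self.receptance(x)) * kv
        return rkv
```

Because self.value and self.receptance start at zero, the module maps any input to zeros at initialization, so each residual branch begins as a no-op, which is a common trick for stabilizing deep residual stacks.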
## Towards RWKV-3
RWKV-3 will work under FP16.
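This connects back to the kernel work above: exp(k) overflows FP16 almost immediately, which is why the clamp existed in the first place and why an FP16 kernel has to keep its exponentials normalized. A quick demonstration (my own toy example, not repo code):

```
import torch

# Where exp(k) blows up in each precision:
for k in [10.0, 11.0, 12.0, 60.0, 89.0]:
    e32 = torch.exp(torch.tensor(k, dtype=torch.float32)).item()
    e16 = torch.exp(torch.tensor(k, dtype=torch.float16)).item()
    print(f"k={k:5.1f}  exp fp32={e32:.3e}  exp fp16={e16:.3e}")

# FP16 tops out at 65504, so exp(k) is already inf past k ~ 11.09.
# FP32 lasts until k ~ 88.7 (hence clamping k at 60 is safe there).
# The standard fix is to subtract a running max of k before
# exponentiating, which keeps every intermediate in range.
```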

## From GPT to RWKV-2 (the formulas)
Let F[t] be the system state at t.
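The derivation that follows this setup starts from a simplified form of GPT attention. As a reference point, it can be written as below, where x[t] is the new external input at t and Q, K, V are the attention projection matrices (my reconstruction of the simplified formula, not necessarily the exact original notation):

$$
F[t+1] = \frac{\sum_{i=0}^{t} \exp\big(Q x[t] \cdot K F[i]\big)\,\big(V F[i]\big)}{\sum_{i=0}^{t} \exp\big(Q x[t] \cdot K F[i]\big)}
$$

Predicting F[t+1] this way requires revisiting F[0], ..., F[t], so generating a length-T sequence costs O(T²); the RWKV-2 reformulation replaces this with a recurrent state.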