I need a better CUDA kernel (https://github.com/BlinkDL/RWKV-CUDA) to (1) remove maxK, so there's no need to clamp k to 60; (2) fix the divide-by-zero without using K_EPS; (3) support bf16/fp16. **Please let me know if you are a CUDA expert :)**
Removing the maxK limitation will also make it easy to clear the state of a KV-V channel by using a huge K.
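In the meantime, one standard way to get (1) and (2) at once is the log-sum-exp trick: keep a running per-channel maximum of the exponents and accumulate the numerator and denominator in that shifted domain. Below is a minimal non-CUDA sketch of the idea against the recurrent form of the time-mix; the function name, the shapes, and the omission of the time_first bonus are my assumptions, not the repo's API:

```
import torch

def wkv_stable(w, k, v):
    # Reference recurrence for:
    #   out[t] = sum_{s<=t} exp(w*(t-s) + k[s]) * v[s]  /  same sum without v
    # w: per-channel decay (negative, e.g. -exp(time_decay)); k, v: (T, C).
    # The running max m keeps every exponent <= 0, so exp() never overflows
    # and k needs no clamp to 60.
    T, C = k.shape
    num = torch.zeros(C)
    den = torch.zeros(C)
    m = torch.full((C,), float('-inf'))   # running max exponent per channel
    out = torch.empty(T, C)
    for t in range(T):
        mt = torch.maximum(m + w, k[t])   # new shift for this step
        e1 = torch.exp(m + w - mt)        # rescales the old state
        e2 = torch.exp(k[t] - mt)        # current token, always <= 1
        num = e1 * num + e2 * v[t]
        den = e1 * den + e2               # one of e1, e2 is exactly 1,
        m = mt                            # so den >= 1: no K_EPS needed
        out[t] = num / den
    return out
```

Because every stored exponent stays <= 0 and the accumulators can be kept in fp32, the same scheme is also a natural starting point for (3), a bf16/fp16 kernel.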
## Explaining the code for RWKV-3 GPT mode
### The GPT mode - overview
```
rkv = torch.sigmoid(self.receptance(xr)) * kv
return rkv
```
The self.value and self.receptance matrices are both initialized to zero.
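As a concrete illustration of why the zero init matters, a minimal sketch (layer names and sizes are illustrative, not the repo's config): with self.value zeroed, kv is all zeros, so the block returns zero and each residual branch starts out as an identity map.

```
import torch.nn as nn

n_embd, hidden_sz = 512, 2048  # illustrative sizes, not the repo's config

value = nn.Linear(hidden_sz, n_embd, bias=False)
receptance = nn.Linear(n_embd, n_embd, bias=False)

# With value.weight == 0, kv = value(k) is all zeros, so
# rkv = sigmoid(r) * kv == 0 and the residual connection passes x
# through unchanged at the start of training.
nn.init.zeros_(value.weight)
nn.init.zeros_(receptance.weight)
```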