From 3f158ac7361bdd0424538bda83a6095e0ad1c7a2 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Sun, 17 Jul 2022 07:00:23 +0800
Subject: [PATCH] Update README.md

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 33a6743..35c4f5f 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ How it works: RWKV gathers information to a number of channels, which are also d
 
 ## Join our Discord: https://discord.gg/bDSBUMeFpc :)
 
-You are welcome to join the RWKV discord https://discord.gg/bDSBUMeFpc to build upon it. We have plenty of potential compute (A100 40Gs) now (thanks to CoreWeave), so if you have interesting ideas I can run them. I am also looking for CUDA gurus to optimize the kernel. Thank you.
+You are welcome to join the RWKV discord https://discord.gg/bDSBUMeFpc to build upon it. We have plenty of potential compute (A100 40Gs) now (thanks to CoreWeave), so if you have interesting ideas I can run them. I am also looking for CUDA gurus to optimize the kernel (https://github.com/BlinkDL/RWKV-CUDA). Thank you.
 
 Here are some of my TODOs. Let's work together :)
 
@@ -113,7 +113,7 @@ x = x + self.att(self.ln1(x))
 x = x + self.ffn(self.ln2(x))
 ```
 
-I need a better CUDA kernel to (1) pull off maxK so there's need to clamp k to 60. (2) fix divide-by-zero without using K_EPS. (3) support bf16/fp16. **Please let me know if you are a CUDA expert :)**
+I need a better CUDA kernel (https://github.com/BlinkDL/RWKV-CUDA) to (1) pull off maxK so there's no need to clamp k to 60; (2) fix divide-by-zero without using K_EPS; (3) support bf16/fp16. **Please let me know if you are a CUDA expert :)**
 
 Removing the maxK limitation will also make it easy to clean the state of a KV-V channel, by using a huge K.
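
Side note on TODO items (1) and (2) in the second hunk: the clamp of k to 60 and the K_EPS term paper over the same numerical issue, and both disappear once the largest exponent is factored out of the sums, which is what "pull off maxK" means. Below is a minimal single-channel Python/NumPy sketch of that trick, assuming a simplified exponential time-decay per channel; `wkv_stable` and its parameter names are illustrative, not the repository's API, and a production version would live in the CUDA kernel at https://github.com/BlinkDL/RWKV-CUDA.

```python
import numpy as np

def wkv_stable(w, k, v):
    """Streaming exp-weighted average with the max exponent factored out.

    Hypothetical single-channel sketch: computes
        y[t] = sum_i exp(w*(t-i) + k[i]) * v[i] / sum_i exp(w*(t-i) + k[i])
    without ever evaluating exp() on a positive argument, so no clamp on k
    is needed and the denominator is always strictly positive (no K_EPS).

    w: per-step time decay (a negative float); k, v: [T] arrays for one channel.
    """
    T = len(k)
    y = np.empty(T)
    a, b = 0.0, 0.0   # numerator / denominator, stored as (true value) * exp(-p)
    p = -np.inf       # running max exponent: the "maxK" pulled out of the sums
    for t in range(T):
        q = max(p + w, k[t])        # new max exponent after one decay step
        e_old = np.exp(p + w - q)   # rescales the old state; argument <= 0
        e_new = np.exp(k[t] - q)    # current token's weight; argument <= 0
        a = e_old * a + e_new * v[t]
        b = e_old * b + e_new       # b >= e_new > 0, so the division is safe
        p = q
        y[t] = a / b                # the shared exp(p) factor cancels
    return y

# Example: agrees with the naive exp(k) formula wherever the naive one
# doesn't overflow, but also handles large k (e.g. k = 80) without clamping.
# y = wkv_stable(w=-0.5, k=np.array([0.1, 80.0, -3.0]), v=np.array([1.0, 2.0, 3.0]))
```

The same shift carries over to a CUDA kernel by keeping `a`, `b`, and `p` in per-thread registers. It also illustrates the state-cleaning behavior mentioned in the last context line of the patch: a huge incoming `k[t]` dominates the running max `p`, so `e_old` underflows to zero and the channel's state effectively resets to the current token.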