From 3fc16a86edd336dbadba26c88bd515b942cced8d Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Mon, 9 Jan 2023 04:30:38 +0800
Subject: [PATCH] Update README.md

---
 README.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 333ef59..72393c4 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,12 @@ RWKV is a RNN with Transformer-level LLM performance, which can also be directly
 So it's combining the best of RNN and transformer - **great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding** (using the final hidden state).
 
+**Download RWKV-4 0.1/0.4/1.5/3/7/14B weights**: https://huggingface.co/BlinkDL
+
+I am training RWKV-4 14B on the Pile: https://wandb.ai/blinkdl/RWKV-v4-Pile
+
+![RWKV-eval2](RWKV-eval2.png)
+
 RWKV-3 1.5B on A40 (tf32) = always 0.015 sec/token, tested using simple pytorch code (no CUDA), GPU utilization 45%, VRAM 7823M
 
 GPT2-XL 1.3B on A40 (tf32) = 0.032 sec/token (for ctxlen 1000), tested using HF, GPU utilization 45% too (interesting), VRAM 9655M
@@ -20,10 +26,6 @@ You are welcome to join the RWKV discord https://discord.gg/bDSBUMeFpc to build
 
 Twitter: https://twitter.com/BlinkDL_AI
 
-**Download RWKV-4 0.1/0.4/1.5/3/7/14B weights**: https://huggingface.co/BlinkDL
-
-I am training RWKV-4 14B on the Pile: https://wandb.ai/blinkdl/RWKV-v4-Pile
-
 ![RWKV-eval](RWKV-eval.png)
 
 All of the trained models will be open-source. Inference is very fast (only matrix-vector multiplications, no matrix-matrix multiplications) even on CPUs, so you can even run a LLM on your phone.
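
Note on the final context line: the "only matrix-vector multiplications" claim is the standard RNN cost argument. Each new token updates a fixed-size hidden state, so per-token work stays constant instead of growing with the context, as attention does. Below is a minimal NumPy sketch of that idea; all names, shapes, and the tanh update are hypothetical placeholders, not RWKV's actual WKV recurrence.

```python
import numpy as np

# Hypothetical sizes and random weights, purely for illustration;
# RWKV's real recurrence (the WKV mechanism) is different, but the
# cost argument below is the same.
d, vocab = 1024, 50257
rng = np.random.default_rng(0)
W_x   = rng.standard_normal((d, d)) * 0.02      # input projection
W_h   = rng.standard_normal((d, d)) * 0.02      # recurrent (state) projection
W_out = rng.standard_normal((vocab, d)) * 0.02  # output head

def step(h, x_emb):
    # One decoding step: three matrix @ vector products, so per-token
    # cost is O(d^2) regardless of context length, whereas attention
    # revisits every previous token.
    h = np.tanh(W_x @ x_emb + W_h @ h)
    return h, W_out @ h

h = np.zeros(d)                     # fixed-size state carried across tokens
for _ in range(8):                  # a few steps with dummy token embeddings
    h, logits = step(h, rng.standard_normal(d))
```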