From 9e959d0b8a3d12721af178adb73e35afd8e68702 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Tue, 17 Aug 2021 22:46:46 +0800
Subject: [PATCH] Update README.md

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 4e97c47..c1ad5e2 100644
--- a/README.md
+++ b/README.md
@@ -68,13 +68,13 @@ Character-level loss on simplebooks-92 dataset https://dldata-public.s3.us-east-
 
 ![RWKV-vs-MHA](RWKV-vs-MHA.png)
 
-Gray: usual MHA+Rotary+GeGLU - performance not as good.
+Gray: usual MHA+Rotary+GeGLU - performance not as good. 17.2M params.
 
-Red: RWKV ("linear" attention) - VRAM friendly - quite faster when ctx window is long - good performance.
+Red: RWKV ("linear" attention) - VRAM friendly - quite faster when ctx window is long - good performance. 16.6M params.
 
-Black: MHA_pro (MHA with various tweaks & RWKV-type-FFN) - slow - needs more VRAM - good performance.
+Green: MHA+Rotary+GeGLU+Token_shift. 17.2M params.
 
-parameters count: 17.2 vs 18.5 vs 18.5.
+Blue: MHA_pro (MHA with various tweaks & RWKV-type-FFN) - slow - needs more VRAM - good performance. 16.6M params.
 
 ```
 @software{peng_bo_2021_5196578,