From 5837ee32c4224f194fa1d19e6e7592680ce90f19 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Sun, 22 Jan 2023 02:35:35 +0800
Subject: [PATCH] Update README.md

---
 README.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index f8d8803..7c5baa8 100644
--- a/README.md
+++ b/README.md
@@ -93,9 +93,9 @@ https://github.com/josephrocca/rwkv-v4-web
 
 For the old RWKV-2: see the release here for a 27M params model on enwik8 with 0.72 BPC(dev). Run run.py in https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v2-RNN. You can even run it in your browser: https://github.com/BlinkDL/AI-Writer/tree/main/docs/eng https://blinkdl.github.io/AI-Writer/eng/ (this is using tf.js WASM single-thread mode).
 
-I have an idea to quantize a matrix with outliers:
+I'd like to build an almost-INT8 version of RWKV. A simple method to quantize a matrix with outliers:
 ```python
-import numpy as np
+import numpy as np
 
 # the original M, with outliers
 M = np.array([[1, 2, 1, 2],[2, 100, 2, 10],[1, 2, 1, 2],[2, 1, 20, 1]])
@@ -110,6 +110,9 @@ b = np.array([1, 5, 1, 1])
 v = np.array([1.23, 5.44, 9.75, 2.98])
 print(M.dot(v))
 print(Q.dot(v * a) * b)
+
+# even better: decompose M.dot(v) as Q.dot(v * a + aa) * b + bb where aa & bb are vectors too
+# and can apply more scaling to achieve W8A8 (example: https://arxiv.org/pdf/2211.10438.pdf)
 ```
 
 ### Training / Fine-tuning
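
For reference, below is a minimal runnable sketch of the factorization idea this patch describes: pull a per-row scale `b` and a per-column scale `a` out of `M` so that the remaining small-integer matrix `Q` has no outliers, then compute `M.dot(v)` as `Q.dot(v * a) * b`. The helper name `quantize_outlier_matrix` and the abs-max scaling heuristic are illustrative assumptions, not code from the RWKV repository; the patch itself only hand-picks `Q`, `a`, `b` for a toy matrix.

```python
import numpy as np

def quantize_outlier_matrix(M, nbits=8):
    # Hypothetical helper (not from the RWKV repo): factor M ~= diag(b) @ Q @ diag(a)
    # with Q stored as small integers, so that M.dot(v) ~= Q.dot(v * a) * b.
    qmax = 2 ** (nbits - 1) - 1                  # 127 for int8
    b = np.abs(M).max(axis=1).astype(float)      # per-row scale absorbs row outliers
    b[b == 0] = 1.0
    R = M / b[:, None]                           # rows now lie in [-1, 1]
    a = np.abs(R).max(axis=0)                    # per-column scale absorbs column outliers
    a[a == 0] = 1.0
    Q = np.round(R / a[None, :] * qmax).astype(np.int8)
    return Q, a, b / qmax                        # fold 1/qmax into b so M ~= outer(b, a) * Q

# same toy matrix with outliers as in the README snippet
M = np.array([[1, 2, 1, 2], [2, 100, 2, 10], [1, 2, 1, 2], [2, 1, 20, 1]], dtype=float)
v = np.array([1.23, 5.44, 9.75, 2.98])

Q, a, b = quantize_outlier_matrix(M)
print(M.dot(v))          # exact float result
print(Q.dot(v * a) * b)  # approximation from the int8 factorization
```

The affine form mentioned in the added comments, `Q.dot(v * a + aa) * b + bb`, adds an input offset `aa` and an output offset `bb` on top of the pure scaling; combined with activation scaling in the style of the linked SmoothQuant paper, that is the direction toward full W8A8.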