Update README.md

main
PENG Bo 4 years ago committed by GitHub
parent 1035a7438e
commit bcd4adb781
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@@ -10,7 +10,7 @@ alt="\begin{align*}
\end{align*}
">
-* Here R, K, V is generated by linear transforms of input.
+* Here R, K, V is generated by linear transforms of input. Basically RWKV decomposes attention into R(target) * W(src -> target) * K(src). So I call R "receptance", and sigmoid means it's in 0~1 range.
* The Time-mix is similar to AFT (https://arxiv.org/abs/2105.14103). There are two differences.
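
The decomposition described in the added line, R(target) * W(src -> target) * K(src), can be illustrated with a rough single-channel sketch. This is not the repository's implementation. The function names (`factored_weight`, `time_mix`), the use of `exp` to keep the key term positive, the causal restriction `u <= t`, and the final weighting of `V` are all assumptions made for illustration; only the three-factor product and the sigmoid on R come from the commit text.

```python
import math


def sigmoid(x):
    """Logistic function; squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def factored_weight(R, K, W, t, u):
    """Hypothetical attention-like weight from source position u to target position t.

    It is the product of three factors:
      sigmoid(R[t])  -- "receptance" of the target, in the 0~1 range
      W[t][u]        -- a position-dependent weight (src -> target)
      exp(K[u])      -- a positive term that depends only on the source (assumption)
    """
    return sigmoid(R[t]) * W[t][u] * math.exp(K[u])


def time_mix(R, K, V, W):
    """Mix values V over source positions u <= t (causal masking is an assumption)."""
    T = len(R)
    out = []
    for t in range(T):
        acc = 0.0
        for u in range(t + 1):
            acc += factored_weight(R, K, W, t, u) * V[u]
        out.append(acc)
    return out
```

Because the target-side factor and the source-side factor never interact directly, no query-key dot product is needed in this sketch; the only term that couples a source to a target is `W[t][u]`.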
