diff --git a/README.md b/README.md
index 415fcef..b4cfe95 100644
--- a/README.md
+++ b/README.md
@@ -192,9 +192,11 @@ out.write(ss + "\n")
 
 ### List of some ideas
 
-1. Use complex number (so, rotation instead of decay) in some channels.
+1. Time decay changed to something like (0.9999^T + 0.1) where 0.9999 and 0.1 are learnable. The 0.1 part will be kept forever.
 
-2. Inject some trainable and interpolable positional encoding?
+2. Use complex number (so, rotation instead of decay) in some channels.
+
+3. Inject some trainable and interpolable positional encoding?
 
 ### Misc
 
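
Note on the new idea 1: a rough numeric sketch of what a decay of the form (0.9999^T + 0.1) would do, compared with a plain 0.9999^T decay. The function names `decay_with_floor` and `plain_decay` and the parameter names `base` and `floor` are hypothetical and do not come from the repository. In a trained model both values would be learnable parameters; here they are fixed floats purely for illustration.

```python
def plain_decay(t, base=0.9999):
    """Weight of a token t steps in the past under pure exponential decay."""
    return base ** t


def decay_with_floor(t, base=0.9999, floor=0.1):
    """Weight of a token t steps in the past: base**t + floor.

    As t grows, base**t goes to 0, so the weight approaches `floor`
    instead of vanishing (the part that is "kept forever").
    """
    return base ** t + floor


if __name__ == "__main__":
    # Compare the two schedules at increasing distances.
    for t in (0, 1_000, 10_000, 100_000):
        print(t, plain_decay(t), decay_with_floor(t))
```

With these illustrative values, the plain decay shrinks toward zero at large T, while the variant with the additive term levels off near 0.1, so very old tokens keep a constant share of the weight.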