212 Commits (46eebd98cabf80de44d9c34f007a8d0612e31973)
 

Author SHA1 Message Date
BlinkDL b48aa1d430 no message 4 years ago
BlinkDL a19be54bf5 no message 4 years ago
BlinkDL fcd01f8851 no message 4 years ago
BlinkDL 76e241b71e saves vocab.json, and the model every X epoch 4 years ago
PENG Bo 689a6a924d Update train.py 4 years ago
PENG Bo 34fa2ec81b Update README.md 4 years ago
PENG Bo 58bdb908f9 Update README.md 4 years ago
PENG Bo 3d8d0373b4 Update README.md 4 years ago
BlinkDL 710d3e34b7 better init for RWKV 4 years ago
BlinkDL 619ed00e4b misc improvement 4 years ago
PENG Bo a36fc09fea Update README.md 4 years ago
PENG Bo a91084efa9 Update README.md 4 years ago
BlinkDL 3329161ed7 rapid convergence using ZERO initialization 4 years ago
BlinkDL 7f391c5758 + RWKV tiny-attn and now it's great for ctx 1024 or 2048 4 years ago
PENG Bo a9f39c112c Update README.md 4 years ago
PENG Bo 8fd4601dea Update README.md 4 years ago
BlinkDL 9b903db103 Merge branch 'main' of https://github.com/BlinkDL/RWKV-LM into main 4 years ago
BlinkDL 8aec414db2 no message 4 years ago
PENG Bo 9e959d0b8a Update README.md 4 years ago
BlinkDL 4ffd8f1b76 + new comparison 4 years ago
PENG Bo 04852faf04 Update README.md 4 years ago
BlinkDL ad627311f4 clean init code 4 years ago
BlinkDL c675b47705 misc improvements 4 years ago
BlinkDL ef29f4b9e8 fixed nan loss 4 years ago
BlinkDL 4fd8716976 improve RWKV time_w initialization 4 years ago
PENG Bo 1ea53a2f03 Update README.md 4 years ago
BlinkDL a31a3b2e92 + MHA_shift 4 years ago
PENG Bo 4096fff9ee Update README.md 4 years ago
PENG Bo 12ba06216d Update README.md 4 years ago
PENG Bo 639de69256 Create CITATION.cff 4 years ago
PENG Bo 994170685b Update README.md 4 years ago
BlinkDL 3b9005ea11 RWKV: now faster and less params 4 years ago
BlinkDL 546114c6a5 still use layernorm for everything 4 years ago
PENG Bo c68ea168b1 Update README.md 4 years ago
PENG Bo 73a63e175f Update README.md 4 years ago
PENG Bo 2df321d3f4 Update README.md 4 years ago
PENG Bo 6e2ba61d95 Update README.md 4 years ago
PENG Bo cd9b352b45 Update README.md 4 years ago
PENG Bo d2b100c2ac Update README.md 4 years ago
PENG Bo 8af6289d0c Update README.md 4 years ago
BlinkDL fd098b1d2e small update 4 years ago
PENG Bo 3b01c8c3cf Update README.md 4 years ago
BlinkDL 65eda0f915 no message 4 years ago
BlinkDL 3b60c5b266 add wandb, and rename variables 4 years ago
BlinkDL 440bebff1a fixed nan in large models 4 years ago
PENG Bo f80ff53595 Update README.md 4 years ago
BlinkDL 62e2cb06d6 fixing nan in large models 4 years ago
BlinkDL d699a69169 misc improvements 4 years ago
BlinkDL 6266f481da minor changes 4 years ago
PENG Bo 88297e7949 Update README.md 4 years ago