BlinkDL | 3b9005ea11 | RWKV: now faster and less params | 4 years ago |
BlinkDL | 546114c6a5 | still use layernorm for everything | 4 years ago |
BlinkDL | fd098b1d2e | small update | 4 years ago |
BlinkDL | 3b60c5b266 | add wandb, and rename variables | 4 years ago |
BlinkDL | 440bebff1a | fixed nan in large models | 4 years ago |
BlinkDL | 62e2cb06d6 | fixing nan in large models | 4 years ago |
BlinkDL | d699a69169 | misc improvements | 4 years ago |
BlinkDL | 6266f481da | minor changes | 4 years ago |
BlinkDL | 89eab46e60 | + info | 4 years ago |
BlinkDL | e9fbd9bf70 | remove layernorm -> better RWKV | 4 years ago |
BlinkDL | 447eae5841 | add MHA-plus model | 4 years ago |
BlinkDL | aa4e2a68f4 | first commit | 4 years ago |