BlinkDL | 94f618c52a | Merge branch 'main' of https://github.com/BlinkDL/RWKV-LM | 3 years ago
BlinkDL | a1bf15ac40 | no message | 3 years ago
PENG Bo | e05c69452d | Update README.md | 3 years ago
PENG Bo | 5f1a473845 | Update README.md | 3 years ago
BlinkDL | 6299c087a4 | fixed VRAM consumption | 3 years ago
PENG Bo | cb520e0f15 | Update README.md | 3 years ago
PENG Bo | a01d915fcc | Update README.md | 3 years ago
PENG Bo | 4c2080aadd | Add files via upload | 3 years ago
BlinkDL | c1f7a72724 | saves some VRAM for 1 GPU training | 3 years ago
PENG Bo | 69e7cfbf39 | Update README.md | 3 years ago
PENG Bo | f757518277 | Update README.md | 3 years ago
PENG Bo | 2bddd576cd | Update README.md | 3 years ago
PENG Bo | 1949ed8619 | Update README.md | 3 years ago
BlinkDL | 68c486ad10 | supports RWKV-4 pile models | 3 years ago
BlinkDL | 61b7c429df | no message | 3 years ago
PENG Bo | 8d4fed7128 | Update README.md | 3 years ago
BlinkDL | 7cdc8d3164 | no message | 3 years ago
BlinkDL | 13bb641007 | no message | 3 years ago
BlinkDL | f79137b524 | supports megatron bin+idx format | 3 years ago
PENG Bo | aa67870849 | Update README.md | 3 years ago
BlinkDL | 4c8a1a467b | no message | 3 years ago
BlinkDL | 083f9504c6 | + bf16 mode (more stable) | 3 years ago
PENG Bo | 46eebd98ca | Add files via upload | 3 years ago
PENG Bo | c0c4ffc7b4 | Update README.md | 3 years ago
BlinkDL | 6667ad18c2 | more training tips | 3 years ago
PENG Bo | 5f6e9356a2 | Update README.md | 3 years ago
BlinkDL | 165dfd1b9e | RWKV-4 with DeepSpeed & FP16 & Better CUDA Kernel | 3 years ago
PENG Bo | dfb75dd89d | Update README.md | 3 years ago
PENG Bo | e9488edafd | Update README.md | 3 years ago
PENG Bo | 3f158ac736 | Update README.md | 3 years ago
PENG Bo | ec91ff1857 | Update README.md | 3 years ago
PENG Bo | 276e322abf | Update README.md | 3 years ago
PENG Bo | a4b0759bf6 | Update README.md | 3 years ago
PENG Bo | 5bd56f1f2d | Add files via upload | 3 years ago
PENG Bo | 3f699e6d2f | Update README.md | 4 years ago
PENG Bo | f4cd0a5a58 | Update README.md | 4 years ago
PENG Bo | 0ef922ebfc | Update README.md | 4 years ago
PENG Bo | 19005e8067 | Update README.md | 4 years ago
PENG Bo | 14abfebc94 | Update README.md | 4 years ago
PENG Bo | 3ef305da6e | Update README.md | 4 years ago
PENG Bo | 3de62b92c3 | Update README.md | 4 years ago
PENG Bo | 456437021f | Update README.md | 4 years ago
PENG Bo | b91cca965f | Add files via upload | 4 years ago
PENG Bo | 4cb363e5aa | Update README.md | 4 years ago
BlinkDL | 8d780208f2 | typo fix | 4 years ago
BlinkDL | 8556d0fd3f | no message | 4 years ago
BlinkDL | b6b5f4628f | tips for training, with exponential lr decay | 4 years ago
PENG Bo | f28be63cd8 | Update README.md | 4 years ago
PENG Bo | b6403a8aef | RWKV-3 (test deeper models (n_layer >= 12) to see the advantage) | 4 years ago
PENG Bo | 1f6461b90b | Update README.md | 4 years ago