novarobot/RWKV-LM
55 Commits
1 Branch
4 Tags
11 MiB

22 Commits (3d8d0373b494229cfa1d212f4151edde5ec89e21)

Author SHA1 Message Date
BlinkDL 710d3e34b7 better init for RWKV 4 years ago
BlinkDL 619ed00e4b misc improvement 4 years ago
BlinkDL 3329161ed7 rapid convergence using ZERO initialization 4 years ago
BlinkDL 7f391c5758 + RWKV tiny-attn and now it's great for ctx 1024 or 2048 4 years ago
BlinkDL 4ffd8f1b76 + new comparison 4 years ago
BlinkDL ad627311f4 clean init code 4 years ago
BlinkDL c675b47705 misc improvements 4 years ago
BlinkDL ef29f4b9e8 fixed nan loss 4 years ago
BlinkDL 4fd8716976 improve RWKV time_w initialization 4 years ago
BlinkDL a31a3b2e92 + MHA_shift 4 years ago
BlinkDL 3b9005ea11 RWKV: now faster and less params 4 years ago
BlinkDL 546114c6a5 still use layernorm for everything 4 years ago
BlinkDL fd098b1d2e small update 4 years ago
BlinkDL 3b60c5b266 add wandb, and rename variables 4 years ago
BlinkDL 440bebff1a fixed nan in large models 4 years ago
BlinkDL 62e2cb06d6 fixing nan in large models 4 years ago
BlinkDL d699a69169 misc improvements 4 years ago
BlinkDL 6266f481da minor changes 4 years ago
BlinkDL 89eab46e60 + info 4 years ago
BlinkDL e9fbd9bf70 remove layernorm -> better RWKV 4 years ago
BlinkDL 447eae5841 add MHA-plus model 4 years ago
BlinkDL aa4e2a68f4 first commit 4 years ago