7 Commits (f80ff535956f8290ed74379972cba6d692fec04e)

Author   SHA1        Message                          Date
BlinkDL  62e2cb06d6  fixing nan in large models       4 years ago
BlinkDL  d699a69169  misc improvements                4 years ago
BlinkDL  6266f481da  minor changes                    4 years ago
BlinkDL  89eab46e60  + info                           4 years ago
BlinkDL  e9fbd9bf70  remove layernorm -> better RWKV  4 years ago
BlinkDL  447eae5841  add MHA-plus model               4 years ago
BlinkDL  aa4e2a68f4  first commit                     4 years ago