9 Commits (62e2cb06d677ab8fd4c865a7c6b860bd1349c7eb)

Author   SHA1        Message                          Date
BlinkDL  62e2cb06d6  fixing nan in large models       4 years ago
BlinkDL  d699a69169  misc improvements                4 years ago
BlinkDL  6266f481da  minor changes                    4 years ago
BlinkDL  89eab46e60  + info                           4 years ago
BlinkDL  e9fbd9bf70  remove layernorm -> better RWKV  4 years ago
BlinkDL  55405c57d0  better splitting of words        4 years ago
BlinkDL  01d6972f4f  now works for word-level LM      4 years ago
BlinkDL  447eae5841  add MHA-plus model               4 years ago
BlinkDL  aa4e2a68f4  first commit                     4 years ago