22 Commits (8e63b75f2c1c2f824d4b2e1004ec2cf9d0bdf975)

Author   SHA1        Message                                             Date
BlinkDL  50587bd65f  fix for jit modules                                 3 years ago
BlinkDL  2b4539cd08  faster (use torch 1.12.1+cu116 or newer)            3 years ago
BlinkDL  2f33901c10  no message                                          3 years ago
BlinkDL  6ff859db80  no message                                          3 years ago
BlinkDL  09c76b185a  no message                                          3 years ago
BlinkDL  8cced0383e  no message                                          3 years ago
BlinkDL  2815260d83  better                                              3 years ago
BlinkDL  dc7e0802d0  faster                                              3 years ago
BlinkDL  c43a17cfb3  10% faster training                                 3 years ago
BlinkDL  c84e8fd952  bugfix                                              3 years ago
BlinkDL  73b96705d7  + fp32 mode (slow but good for verification)        3 years ago
BlinkDL  a1bf15ac40  no message                                          3 years ago
BlinkDL  6299c087a4  fixed VRAM consumption                              3 years ago
BlinkDL  c1f7a72724  saves some VRAM for 1 GPU training                  3 years ago
BlinkDL  68c486ad10  supports RWKV-4 pile models                         3 years ago
BlinkDL  61b7c429df  no message                                          3 years ago
BlinkDL  7cdc8d3164  no message                                          3 years ago
BlinkDL  13bb641007  no message                                          3 years ago
BlinkDL  f79137b524  supports megatron bin+idx format                    3 years ago
BlinkDL  083f9504c6  + bf16 mode (more stable)                           3 years ago
BlinkDL  6667ad18c2  more training tips                                  3 years ago
BlinkDL  165dfd1b9e  RWKV-4 with DeepSpeed & FP16 & Better CUDA Kernel   3 years ago