rllama/src
Mikko Juola 91dee4f114 Add --quiet flag, make colors respect --quiet so you just get the output and nothing else. 3 years ago
benches Make matrix multiplication multithreaded. 3 years ago
protomodels First commit. LLaMA works now. It is not pretty but it does generate text from prompts. Yay. 3 years ago
embedding.rs Add support for bigger models. 3 years ago
lib.rs Add some beginnings of OpenCL implementation. 3 years ago
main.rs First commit. LLaMA works now. It is not pretty but it does generate text from prompts. Yay. 3 years ago
rllama_main.rs Add --quiet flag, make colors respect --quiet so you just get the output and nothing else. 3 years ago
tensor.rs Make number of threads configurable and obtained by default from the system rather than hardcoding to 32. 3 years ago
tensor_opencl_support.rs Some code cleanup in OpenCL. 3 years ago
token_sampler.rs Improve matrix multiplication transposed further, this gives around ~10%-20% further increase by improving memory load to instruction ratio. 3 years ago
tokenizer.rs Improve matrix multiplication transposed further, this gives around ~10%-20% further increase by improving memory load to instruction ratio. 3 years ago
transformer.rs Make matrix multiplication multithreaded. 3 years ago
unpickler.rs Improve matrix multiplication transposed further, this gives around ~10%-20% further increase by improving memory load to instruction ratio. 3 years ago