Mikko Juola
846759b277
Optimize conversions to and from f16<->32.
x86 cannot do f16 arithmetic natively, but it does have an instruction
to convert f16 to f32. I optimized those conversions to use SIMD instructions.
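A minimal sketch of what this can look like, assuming x86-64: the F16C extension's vcvtph2ps instruction (exposed in Rust as `_mm_cvtph_ps`) converts four packed halves to singles at once, with a scalar bit-manipulation fallback for CPUs without it. Function names here are illustrative, not rllama's actual API.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_cvtph_ps, _mm_loadl_epi64, _mm_storeu_ps};

/// Scalar IEEE 754 half -> single conversion (normals, subnormals, inf, NaN).
fn f16_to_f32_scalar(h: u16) -> f32 {
    let sign = ((h as u32) & 0x8000) << 16;
    let exp = ((h >> 10) & 0x1f) as u32;
    let mant = (h & 0x03ff) as u32;
    let bits = match exp {
        0 if mant == 0 => sign, // signed zero
        0 => {
            // Subnormal half: renormalize into a single-precision normal.
            let n = 31 - mant.leading_zeros(); // highest set bit of mant
            sign | ((103 + n) << 23) | ((mant << (23 - n)) & 0x007f_ffff)
        }
        0x1f => sign | 0x7f80_0000 | (mant << 13), // infinity / NaN
        _ => sign | ((exp + 112) << 23) | (mant << 13), // normal number
    };
    f32::from_bits(bits)
}

/// Convert four halves at once, using vcvtph2ps when the CPU has F16C.
#[cfg(target_arch = "x86_64")]
fn f16x4_to_f32(src: [u16; 4]) -> [f32; 4] {
    if is_x86_feature_detected!("f16c") {
        let mut out = [0.0f32; 4];
        unsafe {
            // Load the four 16-bit halves into the low 64 bits of an XMM
            // register, convert, and store the four resulting f32 values.
            let halves = _mm_loadl_epi64(src.as_ptr() as *const __m128i);
            _mm_storeu_ps(out.as_mut_ptr(), _mm_cvtph_ps(halves));
        }
        out
    } else {
        src.map(f16_to_f32_scalar)
    }
}
```

Runtime detection via `is_x86_feature_detected!` keeps the binary portable while still using the hardware path on CPUs that support it.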
3 years ago
Mikko Juola
8acb9f32b8
Update README.md for new discoveries.
3 years ago
Mikko Juola
26d5309cf7
Add support for bigger models.
I've tested with the 13B LLaMA model and it seems to work.
There was a bug in the unpickler that skipped over tuples of size 1. I had
written a bunch of code that assumed the bug wasn't there; I fixed the bug
and removed some of that unpickling code.
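For context, Python's pickle protocol 2 has dedicated one-byte opcodes for tuples of size 1 through 3 (TUPLE1 = 0x85, TUPLE2 = 0x86, TUPLE3 = 0x87), and the size-1 case is easy to miss. A minimal sketch of how an unpickler handles them, with illustrative names rather than rllama's actual code:

```rust
// A toy value type and opcode dispatcher for the small-tuple opcodes.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Tuple(Vec<Value>),
}

const TUPLE1: u8 = 0x85; // pop 1 item, push a 1-tuple
const TUPLE2: u8 = 0x86; // pop 2 items, push a 2-tuple
const TUPLE3: u8 = 0x87; // pop 3 items, push a 3-tuple

/// Apply a small-tuple opcode to the unpickler's value stack.
/// Panics if the stack has fewer items than the opcode pops.
fn apply_opcode(stack: &mut Vec<Value>, op: u8) {
    let n = match op {
        TUPLE1 => 1,
        TUPLE2 => 2,
        TUPLE3 => 3,
        _ => return, // other opcodes not sketched here
    };
    let items = stack.split_off(stack.len() - n);
    stack.push(Value::Tuple(items));
}
```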
I added functions to tensor.rs to be able to construct tensors out of
multiple files.
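The larger LLaMA checkpoints split their weights across several consolidated files, so the shards have to be stitched back into one tensor. A minimal sketch of that idea, assuming row-major f32 shards concatenated along the row dimension; the struct and function names are illustrative, not tensor.rs's actual API:

```rust
/// A toy row-major 2D tensor.
struct Tensor {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

/// Build one tensor by concatenating per-file shards along the row axis.
/// Each shard must contain a whole number of rows of `cols` elements.
fn from_shards(cols: usize, shards: &[Vec<f32>]) -> Result<Tensor, String> {
    let mut data = Vec::new();
    for (i, shard) in shards.iter().enumerate() {
        if shard.len() % cols != 0 {
            return Err(format!(
                "shard {i}: length {} is not a multiple of cols {cols}",
                shard.len()
            ));
        }
        data.extend_from_slice(shard);
    }
    let rows = data.len() / cols;
    Ok(Tensor { rows, cols, data })
}
```

In the real loader the shards would come from memory-mapped files, and some weight matrices are split along columns instead, but the bookkeeping is the same shape-checking exercise.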
3 years ago
Mikko Juola
8a427bcb21
The project is actually called rllama, put that in readme.md.
3 years ago
Mikko Juola
18ef805458
Read parameters from model's JSON file instead of hard-coding them, make max sequence length configurable.
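For reference, the released LLaMA checkpoints ship a params.json next to the weights; the 7B model's file looks roughly like this (exact values differ per model size, and vocab_size is -1 in the release, filled in from the tokenizer):

```json
{
    "dim": 4096,
    "multiple_of": 256,
    "n_heads": 32,
    "n_layers": 32,
    "norm_eps": 1e-06,
    "vocab_size": -1
}
```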
3 years ago
Mikko Juola
f103871bc0
Make the output colored. This is essential to be taken seriously.
Also did some clippy happiness changes.
3 years ago
Mikko Juola
cd28aba5e2
Make the output look nicer.
3 years ago
Mikko Juola
d7a3f57510
Update README.md, add multithreading and optimizations to some operations, allow loading prompt from a file.
3 years ago
Mikko Juola
8bb9404168
Update README to clarify this is a Rust project and to show how to change temperature, top_k, top_p stuff.
3 years ago
Mikko Juola
f6217e0036
Add readme, make clippy happy.
3 years ago
Mikko Juola
3b8f904f13
First commit. LLaMA works now. It is not pretty but it does generate text from prompts. Yay.
3 years ago