Add records of my benchmarks to README.md so I can compare them later.

broken-opencl-code
Mikko Juola 3 years ago
parent c9c861d199
commit df079bceb0

@@ -74,3 +74,21 @@ This is a hobby thing for me so don't expect updates or help.
initial prompt. I don't know if this start-up time can be eliminated
completely, but it could be cached on disk. Use cases include having a standard
prompt, reused many times, to prime the text generation.
# Benchmarks
I'm recording these numbers to check that changes make this faster, not slower.
For generating a 50-token sequence:
```
cargo run --release -- \
--model-path /LLaMA/13B \
--param-path /LLaMA/13B/params.json \
--tokenizer-path /LLaMA/tokenizer.model \
--prompt "Computers are pretty complica" --max-seq-len 50
# commit c9c861d199bd2d87d7e883e3087661c1e287f6c4 (13 March 2023)
LLaMA-7B: AMD Ryzen 3950X: 1058ms / token
LLaMA-13B: AMD Ryzen 3950X: 2005ms / token
```
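The ms/token figures above come from dividing the total generation time by the number of tokens generated. As a minimal sketch (this helper is hypothetical, not part of the repo; the timing values are illustrative):

```shell
#!/bin/sh
# Hypothetical helper: turn a total generation time in milliseconds into
# the ms/token figure recorded above (integer division is close enough
# for tracking regressions).
ms_per_token() {
  total_ms=$1
  tokens=$2
  echo $(( total_ms / tokens ))
}

# e.g. a 50-token run that took 52900 ms in total:
ms_per_token 52900 50   # → 1058
```

Note that wall-clock `time cargo run ...` would also include model loading, which inflates the per-token number; the figures above should be measured over the generation loop only.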
