diff --git a/README.md b/README.md
index bbccbeb..3fb5c09 100644
--- a/README.md
+++ b/README.md
@@ -74,3 +74,21 @@ This is a hobby thing for me so don't expect updates or help.
 initial prompt. I don't know if this start-up time can be eliminated
 completely but it could be cached on disk. Use cases like having a standard
 prompt to prime the text generation that you reuse many times.
+
+# Benchmarks
+
+I'm trying to track that I'm making this faster and not slower.
+
+For 50-length sequence generation:
+
+```
+cargo run --release --
+    --model-path /LLaMA/13B \
+    --param-path /LLaMA/13B/params.json \
+    --tokenizer-path /LLaMA/tokenizer.model \
+    --prompt "Computers are pretty complica" --max-seq-len 50
+
+# commit c9c861d199bd2d87d7e883e3087661c1e287f6c4 (13 March 2023)
+LLaMA-7B: AMD Ryzen 3950X: 1058ms / token
+LLaMA-13B: AMD Ryzen 3950X: 2005ms / token
+```
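
The benchmark figures added in this diff are per-token latencies. As a rough illustration of how such a number could be produced and compared across commits, here is a minimal Rust sketch. It is not taken from this repository: `ms_per_token`, `tokens_per_second`, `bench`, and the sleeping closure are hypothetical names and stand-ins, and a real measurement would call the model's actual decoding step instead.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Mean milliseconds per token when `tokens` tokens took `elapsed` in total.
fn ms_per_token(elapsed: Duration, tokens: u32) -> f64 {
    elapsed.as_secs_f64() * 1000.0 / f64::from(tokens)
}

/// Converts a ms/token latency into throughput in tokens per second.
fn tokens_per_second(ms_per_token: f64) -> f64 {
    1000.0 / ms_per_token
}

/// Calls `generate_one` once per token and returns the mean ms/token.
/// `generate_one` is a hypothetical stand-in for a single decoding step.
fn bench<F: FnMut()>(tokens: u32, mut generate_one: F) -> f64 {
    let start = Instant::now();
    for _ in 0..tokens {
        generate_one();
    }
    ms_per_token(start.elapsed(), tokens)
}

fn main() {
    // Assumption: a 1 ms sleep stands in for real token generation.
    let measured = bench(50, || sleep(Duration::from_millis(1)));
    println!("stand-in workload: {measured:.2} ms/token");

    // The latencies reported in the diff, expressed as throughput.
    for (name, ms) in [("LLaMA-7B", 1058.0), ("LLaMA-13B", 2005.0)] {
        println!("{name}: {ms} ms/token = {:.2} tokens/s", tokens_per_second(ms));
    }
}
```

Read as throughput, 1058 ms/token is about 0.95 tokens per second and 2005 ms/token is about 0.50, so on this CPU the 13B model runs at roughly half the speed of the 7B model. Keeping the token count fixed at 50, as the README does, keeps the per-commit numbers comparable.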