Add records of my benchmarks to README.md so I can compare them later.

broken-opencl-code
Mikko Juola 3 years ago
parent c9c861d199
commit df079bceb0

@@ -74,3 +74,21 @@ This is a hobby thing for me so don't expect updates or help.
initial prompt. I don't know if this start-up time can be eliminated
completely, but it could be cached on disk. Use cases include having a standard
prompt, reused many times, to prime the text generation.
# Benchmarks
I'm recording these numbers to check that changes make this faster, not slower.
For generating a 50-token sequence:
```
cargo run --release -- \
--model-path /LLaMA/13B \
--param-path /LLaMA/13B/params.json \
--tokenizer-path /LLaMA/tokenizer.model \
--prompt "Computers are pretty complica" --max-seq-len 50
# commit c9c861d199bd2d87d7e883e3087661c1e287f6c4 (13 March 2023)
LLaMA-7B: AMD Ryzen 3950X: 1058ms / token
LLaMA-13B: AMD Ryzen 3950X: 2005ms / token
```
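The ms/token figures above come from dividing the total generation time by the number of tokens generated. As a minimal sketch (this helper is hypothetical, not part of the repo; the timing values are illustrative):

```shell
#!/bin/sh
# Hypothetical helper: turn a total generation time in milliseconds into
# the ms/token figure recorded above (integer division is close enough
# for tracking regressions).
ms_per_token() {
  total_ms=$1
  tokens=$2
  echo $(( total_ms / tokens ))
}

# e.g. a 50-token run that took 52900 ms in total:
ms_per_token 52900 50   # → 1058
```

Note that wall-clock `time cargo run ...` would also include model loading, which inflates the per-token number; the figures above should be measured over the generation loop only.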
