From df079bceb0e91ccca46d5d8893851801252900d7 Mon Sep 17 00:00:00 2001
From: Mikko Juola
Date: Mon, 13 Mar 2023 13:05:32 -0700
Subject: [PATCH] Add records of my benchmarks to README.md so I can compare it later.

---
 README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/README.md b/README.md
index bbccbeb..3fb5c09 100644
--- a/README.md
+++ b/README.md
@@ -74,3 +74,21 @@ This is a hobby thing for me so don't expect updates or help.
 initial prompt. I don't know if this start-up time can be eliminated
 completely but it could be cached on disk. Use cases like having a standard
 prompt to prime the text generation that you reuse many times.
+
+# Benchmarks
+
+I'm trying to track that I'm making this faster and not slower.
+
+For 50-length sequence generation:
+
+```
+cargo run --release --
+ --model-path /LLaMA/13B \
+ --param-path /LLaMA/13B/params.json \
+ --tokenizer-path /LLaMA/tokenizer.model \
+ --prompt "Computers are pretty complica" --max-seq-len 50
+
+# commit c9c861d199bd2d87d7e883e3087661c1e287f6c4 (13 March 2023)
+LLaMA-7B: AMD Ryzen 3950X: 1058ms / token
+LLaMA-13B: AMD Ryzen 3950X: 2005ms / token
+```