From df079bceb0e91ccca46d5d8893851801252900d7 Mon Sep 17 00:00:00 2001
From: Mikko Juola <mikjuo@gmail.com>
Date: Mon, 13 Mar 2023 13:05:32 -0700
Subject: [PATCH] Add records of my benchmarks to README.md so I can compare it
 later.

---
 README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/README.md b/README.md
index bbccbeb..3fb5c09 100644
--- a/README.md
+++ b/README.md
@@ -74,3 +74,21 @@ This is a hobby thing for me so don't expect updates or help.
   initial prompt. I don't know if this start-up time can be eliminated
   completely but it could be cached on disk. Use cases like having a standard
   prompt to prime the text generation that you reuse many times.
+
+# Benchmarks
+
+I'm recording these numbers to check that my changes make this faster, not slower.
+
+For 50-length sequence generation:
+
+```
+cargo run --release -- \
+          --model-path /LLaMA/13B \
+          --param-path /LLaMA/13B/params.json \
+          --tokenizer-path /LLaMA/tokenizer.model \
+          --prompt "Computers are pretty complica" --max-seq-len 50
+
+# commit c9c861d199bd2d87d7e883e3087661c1e287f6c4  (13 March 2023)
+LLaMA-7B:  AMD Ryzen 3950X: 1058ms / token
+LLaMA-13B: AMD Ryzen 3950X: 2005ms / token
+```