From 22792b26cce04ef6309c326c73c7106561c63ea7 Mon Sep 17 00:00:00 2001
From: Mikko Juola
Date: Mon, 13 Mar 2023 12:45:16 -0700
Subject: [PATCH] Add an idea about on-disk cache for initial prompt processing (not for weights).

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 80e1f9c..bbccbeb 100644
--- a/README.md
+++ b/README.md
@@ -70,3 +70,7 @@ This is a hobby thing for me so don't expect updates or help.
 * More sophisticated token sampling. I saw on Hackernews some comments how the
   samplers are kinda garbage and you can get much better results with good
   defaults and things like repetition penalty.
+* There is an initial start-up time as the program has to pass through the
+  initial prompt. I don't know if this start-up time can be eliminated
+  completely but it could be cached on disk. Use cases like having a standard
+  prompt to prime the text generation that you reuse many times.
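
The on-disk cache proposed in the added bullet could look roughly like the sketch below. It is a minimal Python illustration, not code from this repository. The names `prompt_cache_path`, `load_or_process_prompt`, `process_prompt`, and `model_id` are all hypothetical, and the sketch assumes the state left after the prompt pass can be serialized. It keys the cache on the model identifier plus the prompt tokens, on the assumption that the state depends on both.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def prompt_cache_path(cache_dir: Path, model_id: str, token_ids: list[int]) -> Path:
    """Derive a cache file name from the model identifier and the prompt tokens."""
    digest = hashlib.sha256()
    digest.update(model_id.encode("utf-8"))
    digest.update(b"\0")  # separator so model id and tokens cannot run together
    digest.update(",".join(map(str, token_ids)).encode("ascii"))
    return cache_dir / f"{digest.hexdigest()}.prompt-state"


def load_or_process_prompt(cache_dir, model_id, token_ids, process_prompt):
    """Return (state, cache_hit), reusing an on-disk copy of the state if present.

    `process_prompt` is a hypothetical callable standing in for the expensive
    pass through the initial prompt; it returns whatever state generation
    would resume from.
    """
    path = prompt_cache_path(cache_dir, model_id, token_ids)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f), True

    state = process_prompt(token_ids)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first, then rename, so an interrupted run
    # does not leave a truncated cache entry behind.
    tmp = path.with_suffix(".tmp")
    with tmp.open("wb") as f:
        pickle.dump(state, f)
    tmp.replace(path)
    return state, False


if __name__ == "__main__":
    def fake_process(tokens):
        # Stand-in for the real model; only records how far the prompt went.
        return {"position": len(tokens)}

    with tempfile.TemporaryDirectory() as d:
        for _ in range(2):
            state, hit = load_or_process_prompt(Path(d), "demo-model", [1, 2, 3], fake_process)
            print(state, "cache hit" if hit else "cache miss")
```

Only the second call with the same standard prompt skips the prompt pass, which matches the reuse case the bullet describes. `pickle` is used purely for brevity and should only be used to load files the program wrote itself. A real implementation would likely store the state in a purpose-built binary format.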