From b4d5cf91a772d213ac5c7a7c6d75828392ccedd6 Mon Sep 17 00:00:00 2001
From: Mikko Juola
Date: Mon, 13 Mar 2023 17:45:23 -0700
Subject: [PATCH] Mention in README.md that using OpenCL does not cast weights
 to 32-bit floats.

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d2231ca..f399f56 100644
--- a/README.md
+++ b/README.md
@@ -50,7 +50,8 @@ cargo run --release -- --tokenizer-model /path/to/tokenizer.model --model-path /
 ```
 
 Right now it seems to use around ~25 gigabytes of memory for 7B and around ~50
-gigabytes for 13B. Internally all weights are cast to 32-bit floats.
+gigabytes for 13B. If you don't use OpenCL, then all parameters are cast to
+32-bit floats.
 
 You can use `--temperature`, `--top-p` and `--top-k` to adjust token sampler
 settings.