From 8bb940416869a821d4e320ef3dd2b8b11024f083 Mon Sep 17 00:00:00 2001
From: Mikko Juola
Date: Sat, 11 Mar 2023 00:47:32 -0800
Subject: [PATCH] Update README to clarify this is a Rust project and to show
 how to change temperature, top_k, top_p stuff.

---
 README.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 768401e..06134fa 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,9 @@ Well sort of, it doesn't unzip them automatically (see below).
 
 # How to run
 
-You will need the LLaMA-7B weights first. Refer to https://github.com/facebookresearch/llama/
+You will need Rust. Make sure you can run `cargo` from a command line.
+
+You will need to download LLaMA-7B weights. Refer to https://github.com/facebookresearch/llama/
 
 Once you have 7B weights, and the `tokenizer.model` it comes with, you need to
 decompress it.
@@ -36,6 +38,9 @@ cargo run --release -- --tokenizer-model /path/to/tokenizer.model --model-path /
 Right now it seems to use around ~25 gigabytes of memory. Internally all
 weights are cast to 32-bit floats.
 
+You can use `--temperature`, `--top-p` and `--top-k` to adjust token sampler
+settings.
+
 # Future plans
 
 This is a hobby thing for me so don't expect updates or help.
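
Usage note (not part of the patch): a minimal sketch of how the run command from the README might look once the sampler flags added here are combined with it. Only the flag names come from the README and this patch; the paths and the numeric values below are illustrative placeholders, not documented defaults.

```sh
# Hypothetical invocation: /path/to/... are placeholders for wherever you
# decompressed the LLaMA-7B weights and tokenizer.model, and the sampler
# values (0.8, 0.95, 40) are example settings, not project defaults.
cargo run --release -- \
    --tokenizer-model /path/to/tokenizer.model \
    --model-path /path/to/LLaMA-7B \
    --temperature 0.8 \
    --top-p 0.95 \
    --top-k 40
```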