From 0e2612680f865dd09e7e2e41d907f5ed2fc575bf Mon Sep 17 00:00:00 2001
From: randaller
Date: Sun, 5 Mar 2023 17:56:22 +0300
Subject: [PATCH] Update README.md

---
 README.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index c7d02fb..d7aa417 100755
--- a/README.md
+++ b/README.md
@@ -2,8 +2,6 @@
 This repository is intended as a minimal, hackable and readable example to load [LLaMA](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/) ([arXiv](https://arxiv.org/abs/2302.13971v1)) models and run inference by using only CPU. Thus requires no videocard, but 64 (better 128 Gb) of RAM and modern processor is required.
 
-At the moment only 7B model inference supported.
-
 ### Conda Environment Setup Example for Windows 10+
 Download and install Anaconda Python https://www.anaconda.com and run Anaconda Prompt
 ```
@@ -25,16 +23,18 @@ pip install -e .
 ### Download tokenizer and models
 magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA
 
-### CPU Inference
-Place tokenizer.model and tokenizer_checklist.chk into [/tokenizer] folder
+### CPU Inference of 7B model
+Place tokenizer.model and tokenizer_checklist.chk into repo's [/tokenizer] folder.
 
-Place three files of 7B model into [/model] folder
+Place consolidated.00.pth and params.json from 7B torrent folder into repo's [/model] folder.
 
 Run it:
 ```
 python example-cpu.py
 ```
+### CPU Inference of 13B 30B 65B models
+
 ### Model Card
 See [MODEL_CARD.md](MODEL_CARD.md)