Chat with Meta's LLaMA models at home made easy
This repository is a chat example for LLaMA (arXiv) models running on a typical home PC. You just need an NVIDIA video card and some RAM to chat with the model.
Conda Environment Setup Example for Windows 10+
Download and install Anaconda Python from https://www.anaconda.com, then open Anaconda Prompt and run:
conda create -n llama python=3.10
conda activate llama
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
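Once the environment is created, it can be worth confirming that PyTorch actually sees the GPU before downloading any weights. The snippet below is a minimal sketch, not part of this repo; the helper name `cuda_status` is made up for illustration, and it only uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
import importlib.util


def cuda_status():
    """Report whether PyTorch is installed and whether it can see a CUDA GPU.

    Hypothetical helper for illustration; it is not shipped with this repo.
    """
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False, "cuda_available": False, "device": None}

    import torch

    available = torch.cuda.is_available()
    # Only ask for the device name when a CUDA device is actually visible.
    device = torch.cuda.get_device_name(0) if available else None
    return {"torch_installed": True, "cuda_available": available, "device": device}


if __name__ == "__main__":
    print(cuda_status())
```

If `cuda_available` comes back as `False`, the conda install step above most likely pulled a CPU-only build or the NVIDIA driver is missing.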
Setup
In a conda environment with PyTorch and CUDA available, run:
pip install -r requirements.txt
Then, in this repository, run:
pip install -e .
Download tokenizer and models
magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA
or
magnet:?xt=urn:btih:b8287ebfa04f879b048d4d4404108cf3e8014352&dn=LLaMA&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce
Prepare model
First, you need to unshard the model checkpoints into a single file. Let's do this for the 30B model.
python merge-weights.py --input_dir D:\Downloads\LLaMA --model_size 30B
In this example, D:\Downloads\LLaMA is the root folder of the downloaded torrent containing the weights.
The script creates a merged.pth file in the root folder of this repo.
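Before starting the merge, a quick look at the download folder can catch an incomplete torrent early. The sketch below is an illustration only. It assumes the layout of the original LLaMA release, where each model folder holds shards named `consolidated.NN.pth`, and the shard counts in `EXPECTED_SHARDS` are likewise an assumption about that release. The function name `check_download` is hypothetical.

```python
from pathlib import Path

# Assumption: shard counts of the original LLaMA release (consolidated.NN.pth files).
EXPECTED_SHARDS = {"7B": 1, "13B": 2, "30B": 4, "65B": 8}


def check_download(input_dir, model_size):
    """Return a list of problems found in the download folder (empty if none)."""
    root = Path(input_dir)
    model_dir = root / model_size
    problems = []

    if not (root / "tokenizer.model").is_file():
        problems.append("missing tokenizer.model")
    if not (model_dir / "params.json").is_file():
        problems.append(f"missing {model_size}/params.json")

    # Count checkpoint shards only if the model folder exists at all.
    shards = sorted(model_dir.glob("consolidated.*.pth")) if model_dir.is_dir() else []
    expected = EXPECTED_SHARDS.get(model_size)
    if expected is not None and len(shards) != expected:
        problems.append(f"expected {expected} shards, found {len(shards)}")

    return problems


if __name__ == "__main__":
    issues = check_download(r"D:\Downloads\LLaMA", "30B")
    print(issues or "download folder looks complete")
```

An empty result means the folder matches the assumed layout; it does not verify that the files themselves are intact.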
Place merged.pth and the corresponding (torrentroot)/30B/params.json into the /model folder of this repo.
Place the (torrentroot)/tokenizer.model file into the /tokenizer folder of this repo. Now you are ready to go.
python example-chat.py ./model ./tokenizer/tokenizer.model
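If the chat script fails on startup, the usual cause is a file in the wrong place. The following is a small sketch, assuming the layout described above (merged.pth and params.json in ./model, tokenizer.model in ./tokenizer). The helper `missing_files` is hypothetical and not part of this repo.

```python
from pathlib import Path


def missing_files(model_dir="./model", tokenizer_path="./tokenizer/tokenizer.model"):
    """Return the paths the chat example expects but that are not on disk."""
    model_dir = Path(model_dir)
    required = [
        model_dir / "merged.pth",   # merged checkpoint produced by the merge step
        model_dir / "params.json",  # model parameters copied from the torrent
        Path(tokenizer_path),       # tokenizer copied from the torrent root
    ]
    return [str(p) for p in required if not p.is_file()]


if __name__ == "__main__":
    missing = missing_files()
    print("ready to chat" if not missing else f"missing: {missing}")
```

Run it from the root of this repo so the relative paths match the arguments passed to example-chat.py.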