Update README.md

randaller 3 years ago committed by GitHub

Share your best prompts, chats or generations here in this issue: https://github
One may run it with 32 GB of RAM, but inference will be slow (limited by the speed of reading your swap file).
I am running this on a 12700k / 128 GB RAM / NVIDIA 3070 Ti 8 GB / fast, large NVMe (with 256 GB of swap for the 65B model) and getting one token from the 30B model every few seconds.
For example, the **30B model uses around 70 GB of RAM**, the 7B model fits into 18 GB, and the 13B model uses 48 GB.
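As a rough sanity check on these figures, a back-of-envelope sketch (my own illustration, not from the repo) estimates the RAM needed just to hold the weights, assuming fp16 storage at 2 bytes per parameter; the quoted numbers above are higher because of activations, KV cache, and loader overhead:

```python
# Back-of-envelope RAM estimate for a LLaMA-style model's weights.
# Assumption: weights stored as fp16 (2 bytes per parameter).
# Real usage is higher (activations, KV cache, loader overhead).

def estimate_weight_ram_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Return the RAM needed for the weights alone, in GB."""
    return n_params * bytes_per_param / 1024**3

for name, n in [("7B", 7e9), ("13B", 13e9), ("30B", 30e9), ("65B", 65e9)]:
    print(f"{name}: ~{estimate_weight_ram_gb(n):.0f} GB for weights alone")
```

The gap between these estimates and the measured numbers (e.g. ~56 GB of weights vs. ~70 GB observed for 30B) is the runtime overhead.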
If you do not have an NVIDIA video card, you may use another repo for CPU-only inference: https://github.com/randaller/llama-cpu
### Conda Environment Setup Example for Windows 10+
Download and install Anaconda Python (https://www.anaconda.com) and run the Anaconda Prompt.
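A minimal sketch of the environment setup that follows, typed into the Anaconda Prompt. The environment name, Python version, and PyTorch install command are assumptions, not taken from this repo; check https://pytorch.org for the command matching your CUDA version:

```shell
# Create and activate a fresh environment (the name "llama" is arbitrary)
conda create -n llama python=3.10 -y
conda activate llama

# Install PyTorch with CUDA support (channels/versions are assumptions;
# pick the exact command for your setup from the PyTorch install selector)
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia -y
```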
