From 485c8dbba31427c1829b1a3a21b6c46c55f42b1b Mon Sep 17 00:00:00 2001
From: randaller
Date: Sun, 5 Mar 2023 22:24:57 +0300
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 151ddc9..ba06c82 100755
--- a/README.md
+++ b/README.md
@@ -69,7 +69,7 @@ Running model with single prompt on Windows computer equipped with 12700k, fast
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
 | 7B | 44 Gb, peak 56 Gb | 22 Gb | 170 seconds | 850 seconds | 23 seconds |
 | 13B | 77 Gb, peak 100 Gb | 38 Gb | 340 seconds | | 61 seconds |
-| 30B | 180 Gb, peak 258 Gb | | 48 minutes | | 372 seconds |
+| 30B | 180 Gb, peak 258 Gb | 89 Gb | 48 minutes | | 372 seconds |
 
 ### RAM usage optimization
 By default, torch uses Float32 precision while running on CPU, which leads, for example, to use 44 GB of RAM for 7B model. We may use Bfloat16 precision on CPU too, which decreases RAM consumption/2, down to 22 GB for 7B model, but inference processing much slower.