From 724a2fd1c8a3cf7c939f68ee6248461e9747c405 Mon Sep 17 00:00:00 2001
From: randaller
Date: Sun, 19 Mar 2023 16:04:16 +0300
Subject: [PATCH] Update README.md

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9cc23b9..009671b 100644
--- a/README.md
+++ b/README.md
@@ -195,7 +195,9 @@ torch.set_default_dtype(torch.bfloat16)
 device_map = infer_auto_device_map(model, max_memory={0: "6GiB", "cpu": "128GiB"})
 ```
 
-One with A100 might try to set 38Gb to a GPU and try to inference the model completely in the GPU VRAM.
+One with an A100 might set 38GiB for GPU 0 and run inference entirely in GPU VRAM.
+
+One with 4 x A100 might wish to use: 0: "38GiB", 1: "38GiB", etc.
 
 For me, with 6Gb for 3070ti, this works three times slower against pure CPU inference.
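
For reference, a minimal sketch of the `max_memory` layouts the added lines describe, assuming `model` is already instantiated and that `infer_auto_device_map` comes from the `accelerate` library used in the surrounding README code:

```python
from accelerate import infer_auto_device_map

# Single A100 (~40 GiB): give GPU 0 enough room to hold the whole model in VRAM.
device_map = infer_auto_device_map(
    model,
    max_memory={0: "38GiB", "cpu": "128GiB"},
)

# Four A100s: spread the weights across GPUs 0-3, spilling to CPU RAM only if needed.
device_map = infer_auto_device_map(
    model,
    max_memory={0: "38GiB", 1: "38GiB", 2: "38GiB", 3: "38GiB", "cpu": "128GiB"},
)
```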