https://github.com/ggerganov/ggml that could run GPT-J 6B models.
The current performance is as follows:
```
Pure Rust implementations:

LLaMA-7B:  AMD Ryzen 3950X: 552ms / token   f16  (pure Rust)
LLaMA-7B:  AMD Ryzen 3950X: 1008ms / token  f32  (pure Rust)
LLaMA-13B: AMD Ryzen 3950X: 1029ms / token  f16  (pure Rust)
LLaMA-13B: AMD Ryzen 3950X: 1930ms / token  f32  (pure Rust)
LLaMA-30B: AMD Ryzen 5950X: 2112ms / token  f16  (pure Rust)

OpenCL (all use f16):

LLaMA-7B:  AMD Ryzen 3950X + OpenCL RTX 3090 Ti: 247ms / token  (OpenCL on GPU)
LLaMA-7B:  AMD Ryzen 3950X + OpenCL Ryzen 3950X: 680ms / token  (OpenCL on CPU)
LLaMA-13B: AMD Ryzen 3950X + OpenCL RTX 3090 Ti: <I ran out of GPU memory :(>
```
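Two back-of-the-envelope calculations help put these numbers in context (this sketch is not from the repository; the parameter counts and the 24 GB card capacity are assumptions based on the model names and the GPU in the table): f16 weights take 2 bytes per parameter, so LLaMA-13B needs roughly 26 GB for the weights alone, which would explain running out of memory on a 24 GB RTX 3090 Ti; and a ms/token figure inverts directly into tokens per second.

```rust
// Rough estimate of f16 weight memory in GB: 2 bytes per parameter.
fn f16_weight_gb(params_billion: f64) -> f64 {
    params_billion * 1e9 * 2.0 / 1e9
}

// Convert a benchmark's ms/token figure into tokens per second.
fn tokens_per_sec(ms_per_token: f64) -> f64 {
    1000.0 / ms_per_token
}

fn main() {
    // ~26 GB of f16 weights for 13B parameters, more than a 24 GB card holds.
    println!("LLaMA-13B f16 weights: ~{:.0} GB", f16_weight_gb(13.0));
    // 247 ms/token on the GPU works out to about 4 tokens per second.
    println!("LLaMA-7B on GPU: ~{:.1} tok/s", tokens_per_sec(247.0));
}
```

This ignores activation memory and the KV cache, so the real footprint during inference is somewhat higher than the weight estimate alone.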