66 Commits (mmap)
 

Author SHA1 Message Date
Justine Tunney 0b5448a3a4
Implement system polyfill for win32 / posix.1
I don't have access to Microsoft Visual Studio right now (aside from
the GitHub Actions CI system) but I think this code should come close to
what we want in terms of polyfilling UNIX functionality.
3 years ago
Justine Tunney 5b8023d935
Implement prototype for instant mmap() loading
This change uses a custom malloc() implementation to transactionally
capture to a file dynamic memory created during the loading process.
That includes (1) the malloc() allocation for mem_buffer and (2) all
the C++ STL objects. On my $1000 personal computer, this change lets
me run ./main to generate a single token (-n 1) using the float16 7B
model (~12gb size) in one second. In order to do that, there's a one
time cost where a 13gb file needs to be generated. This change rocks
but it shouldn't be necessary to do something this heroic. We should
instead change the file format, so that tensors don't need reshaping
and realignment in order to be loaded.
3 years ago
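The commit above relies on mapping a previously generated file into memory instead of re-reading and reshaping it on every run. The following is only a minimal POSIX sketch of that general idea, not the custom malloc() capture the commit implements; `map_file_readonly` and `unmap_file` are hypothetical names introduced here for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the commit): map a whole file read-only.
 * Returns the base address, or NULL on failure. On success, *size_out
 * holds the size of the mapping in bytes. */
static void *map_file_readonly(const char *path, size_t *size_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (addr == MAP_FAILED) {
        return NULL;
    }

    *size_out = (size_t)st.st_size;
    return addr;
}

/* Hypothetical helper: release a mapping created by map_file_readonly(). */
static int unmap_file(void *addr, size_t size) {
    return munmap(addr, size);
}
```

Because pages are faulted in lazily by the kernel, the call itself returns quickly regardless of file size; the data is only read from disk when it is first touched.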
Justine Tunney 2788f373be
Get the build working 3 years ago
Ronsor 47857e564c
Don't use vdotq_s32 if it's not available (#139)
* Don't use vdotq_s32 if it's not available

`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.

Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.

* Update ggml.c

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
3 years ago
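The availability check described in the commit above can be sketched as a compile-time gate on `__ARM_FEATURE_DOTPROD`. This is an illustrative sketch only, not the code in ggml.c: `dot_i8` is a hypothetical helper, and the fallback here is a plain scalar loop rather than the NEON code the commit reintroduces.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from ggml.c): dot product of two int8 vectors.
 * Uses vdotq_s32 only when the compiler reports the dotprod extension;
 * otherwise every element goes through the portable scalar loop. */
static int32_t dot_i8(const int8_t *a, const int8_t *b, int n) {
    int32_t sum = 0;
    int i = 0;

#if defined(__ARM_FEATURE_DOTPROD)
    /* Process 16 int8 lanes per iteration, accumulating into 4 int32 lanes. */
    int32x4_t acc = vdupq_n_s32(0);
    for (; i + 16 <= n; i += 16) {
        acc = vdotq_s32(acc, vld1q_s8(a + i), vld1q_s8(b + i));
    }
    sum = vgetq_lane_s32(acc, 0) + vgetq_lane_s32(acc, 1) +
          vgetq_lane_s32(acc, 2) + vgetq_lane_s32(acc, 3);
#endif

    /* Scalar path: handles the tail, or the whole input without dotprod. */
    for (; i < n; ++i) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

A compile-time gate like this only reflects the flags the binary was built with; it does not detect the CPU at run time, so a binary built with dotprod enabled would still fail on a CPU that lacks it.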
Radoslav Gerganov 60f819a2b1
Add section to README on how to run the project on Android (#130) 3 years ago
Georgi Gerganov 97ab2b2578
Add Misc section + update hot topics + minor fixes 3 years ago
Sebastián A 2f700a2738
Add windows to the CI (#98) 3 years ago
Georgi Gerganov c09a9cfb06
CMake build in Release by default (#75) 3 years ago
Georgi Gerganov 7ec903d3c1
Update contribution section, hot topics, limitations, etc. 3 years ago
Georgi Gerganov 4497ad819c
Print system information 3 years ago
Sebastián A ed6849cc07
Initial support for CMake (#75) 3 years ago
Thomas Klausner 41be0a3b3d
Add NetBSD support. (#90) 3 years ago
Pavol Rusnak 671d5cac15
Use fprintf for diagnostic output (#48)
keep printf only for printing model output

one can now use ./main ... 2>/dev/null to suppress any diagnostic output
3 years ago
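The stream split described in the commit above can be sketched as follows. `log_diag` and `emit_token` are hypothetical helpers introduced for illustration; they are not functions from the repository.

```c
#include <stdio.h>

/* Hypothetical helper: diagnostics go to stderr, so they can be discarded
 * with a shell redirect such as 2>/dev/null. */
static int log_diag(const char *msg) {
    return fprintf(stderr, "%s\n", msg);
}

/* Hypothetical helper: model output goes to stdout, so it can be piped or
 * captured without diagnostic lines mixed in. */
static int emit_token(const char *tok) {
    return printf("%s", tok);
}
```

Keeping the two streams separate means a caller can capture generated text with a pipe while still seeing, or silencing, load-time and timing messages independently.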
Georgi Gerganov 84d9015c4a
Use vdotq_s32 to improve performance (#67)
* 10% performance boost on ARM

* Back to original change
3 years ago
uint256_t 63fd76fbb0
Reduce model loading time (#43)
* Use buffering

* Use vector

* Minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
3 years ago
Val Kharitonov 2a20f48efa
Fix UTF-8 handling (including colors) (#79) 3 years ago
Pavol Rusnak d1f224712d
Add quantize script for batch quantization (#92)
* Add quantize script for batch quantization

* Indentation

* README for new quantize.sh

* Fix script name

* Fix file list on Mac OS

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
3 years ago
Georgi Gerganov 1808ee0500
Add initial contribution guidelines 3 years ago
Matvey Soloviev a169bb889c
Gate signal support on being on a unixoid system. (#74) 3 years ago
Matvey Soloviev 460c482540
Fix token count accounting 3 years ago
Georgi Gerganov c80e2a8f2a
Revert "10% performance boost on ARM"
This reverts commit 113a9e83eb.

There are some reports for illegal instruction.
Moved this stuff to the vdotq_s32 branch until resolved
3 years ago
Georgi Gerganov 54a0e66ea0
Check for vdotq_s32 availability 3 years ago
Georgi Gerganov 543c57e991
Amend to previous commit - forgot to update non-QRDMX branch 3 years ago
Georgi Gerganov 113a9e83eb
10% performance boost on ARM 3 years ago
Matvey Soloviev 404fac0d62
Fix color getting reset before prompt output done (#65)
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
3 years ago
Georgi Gerganov 1a0a74300f
Update README.md 3 years ago
Matvey Soloviev 96ea727f47
Add interactive mode (#61)
* Initial work on interactive mode.

* Improve interactive mode. Make rev. prompt optional.

* Update README to explain interactive mode.

* Fix OS X build
3 years ago
Marc Köhlbrugge 9661954835
Fix typo in README (#45) 3 years ago
Ben Garney f385f8dee8
Allow using prompt files (#59) 3 years ago
beiller 02f0c6fe7f
Add back top_k (#56)
* Add back top_k

* Update utils.cpp

* Update utils.h

---------

Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
3 years ago
Sebastián A eb062bb012
Windows fixes (#31)
* Apply fixes suggested to build on windows

Issue: https://github.com/ggerganov/llama.cpp/issues/22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Change the use of vector for stack allocations.
3 years ago
Georgi Gerganov 7027a97837
Update README.md 3 years ago
Georgi Gerganov 2d555e5b42
Add CI (#60) 3 years ago
Georgi Gerganov 7c9e54e55e
Revert "weights_only" arg - this is causing more trouble than help 3 years ago
Oleksandr Nikitin b9bd1d0141
python/pytorch compat notes (#44) 3 years ago
beiller 129c7d1ea8
Add repetition penalty (#20)
* Adding repeat penalization

* Update utils.h

* Update utils.cpp

* Numeric fix

Should probably still scale by temp even if penalized

* Update comments, more proper application

I see that numbers can go negative so a fix from a referenced commit

* Minor formatting

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
3 years ago
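The repetition penalty from the commit above, including the note about negative values, can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in utils.cpp: `token_in_window` and `apply_repeat_penalty` are hypothetical names, and temperature scaling is left out.

```c
#include <stddef.h>

/* Hypothetical helper: is token `id` among the recently generated tokens? */
static int token_in_window(int id, const int *last_tokens, size_t n_last) {
    for (size_t i = 0; i < n_last; ++i) {
        if (last_tokens[i] == id) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: make recently seen tokens less likely.
 * Dividing a negative logit by penalty > 1 would raise it toward zero and
 * make the token MORE likely, so negative logits are multiplied instead. */
static void apply_repeat_penalty(float *logits, size_t n_vocab,
                                 const int *last_tokens, size_t n_last,
                                 float penalty) {
    for (size_t id = 0; id < n_vocab; ++id) {
        if (!token_in_window((int)id, last_tokens, n_last)) {
            continue;
        }
        if (logits[id] < 0.0f) {
            logits[id] *= penalty;
        } else {
            logits[id] /= penalty;
        }
    }
}
```

Iterating over the vocabulary rather than over the window means a token that appears several times in the recent history is still penalized only once.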
Georgi Gerganov 702fddf5c5
Clarify meaning of hacking 3 years ago
Georgi Gerganov 7d86e25bf6
README: add "Supported platforms" + update hot topics 3 years ago
deepdiffuser a93120236f
use weights_only in conversion script (#32)
this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
3 years ago
Pavol Rusnak 6a9a67f0be
Add LICENSE (#21) 3 years ago
Georgi Gerganov da1a4ff01f
Update README.md 3 years ago
Juraj Bednar 6b2cb6302f
Fix a typo in model name (#16) 3 years ago
Georgi Gerganov 4235e3d5b3
Update README.md 3 years ago
Georgi Gerganov f1eaff4721
Add AVX2 support for x86 architectures thanks to @Const-me ! 3 years ago
Georgi Gerganov a9e58529ea
Fix un-initialized FP16 tables on x86 (#15, #2) 3 years ago
Georgi Gerganov 7d9ed7b25f
Bump memory buffer 3 years ago
Georgi Gerganov 0c6803321c
Update README.md 3 years ago
Georgi Gerganov f60fa9e50a
.gitignore models/ 3 years ago
Georgi Gerganov 7211862c94
Update Makefile var + add comment 3 years ago
Georgi Gerganov a5c5ae2f54
Update README.md 3 years ago