Update README.md

PENG Bo 3 years ago committed by GitHub

Here is a great prompt for testing Q&A of LLMs. Works for any model:
prompt = f'\nQ & A\n\nQuestion:\n{qq}\n\nDetailed Expert Answer:\n' # let the model generate after this
```
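For a concrete idea of what the model sees, here is the prompt rendered with a sample question (`qq` holds the question string; the sample text is illustrative):

```python
qq = "Why is the sky blue?"  # illustrative question
prompt = f'\nQ & A\n\nQuestion:\n{qq}\n\nDetailed Expert Answer:\n'
print(prompt)
```

The model then continues generating after the `Detailed Expert Answer:` line.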
**Cool Community RWKV Projects (check them!)**:
https://pypi.org/project/rwkvstic/
https://github.com/harrisonvanderbyl/rwkv_chatbot
https://github.com/mrsteyk/RWKV-LM-deepspeed
https://github.com/huggingface/transformers/issues/17230
https://github.com/ArEnSc/Production-RWKV
https://github.com/nlpodyssey/verbaflow (in Go)
https://github.com/nlpodyssey/rwkv (in Go)
https://github.com/resloved/RWKV-notebooks
https://github.com/Pathos14489/RWKVDistributedInference
https://github.com/AXKuhta/rwkv-onnx-dml
https://github.com/josephrocca/rwkv-v4-web
### Inference
**Run RWKV-4 Pile models:** Download models from https://huggingface.co/BlinkDL. Set TOKEN_MODE = 'pile' in run.py and run it. It's fast even on CPU (the default mode).
**Colab for RWKV-4 Pile 1.5B**: https://colab.research.google.com/drive/1F7tZoPZaWJf1fsCmZ5tjw6sYHiFOYVWM
Run RWKV-4 Pile models in your browser (and onnx version): see this issue https://github.com/BlinkDL/RWKV-LM/issues/7
RWKV-4 Web Demo: https://josephrocca.github.io/rwkv-v4-web/demo/ (note: only greedy sampling for now)
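The "greedy sampling" in the web demo above just means taking the argmax of the logits at every step, with no temperature or top-p; a minimal sketch (the function name is mine, and the toy logits are illustrative):

```python
import numpy as np

def greedy_pick(logits):
    # greedy decoding: deterministically pick the highest-logit token
    return int(np.argmax(logits))

# with these toy logits, token 2 has the highest score
print(greedy_pick(np.array([-1.0, 0.5, 3.2, 0.0])))  # -> 2
```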
For the old RWKV-2: see the release for a 27M-parameter model trained on enwik8, reaching 0.72 BPC (dev). Run run.py in https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v2-RNN. You can even run it in your browser: https://github.com/BlinkDL/AI-Writer/tree/main/docs/eng https://blinkdl.github.io/AI-Writer/eng/ (this uses tf.js in WASM single-thread mode).
I'd like to build an almost-INT8 version of RWKV. A simple method to quantize a matrix with outliers:
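The description of the method is truncated here. Purely as an illustration of one outlier-aware int8 scheme (my assumption, not necessarily the author's: the function names and the 1% outlier fraction are invented), you can keep the largest-magnitude entries in full precision and absmax-quantize the rest:

```python
import numpy as np

def quantize_with_outliers(W, outlier_frac=0.01):
    """Illustrative sketch, not the RWKV method: int8-quantize W,
    keeping the top-magnitude entries ("outliers") in full precision."""
    k = max(1, int(W.size * outlier_frac))
    flat = W.ravel()
    # indices of the k largest-magnitude entries: these are the outliers
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    outliers = flat[idx].copy()        # stored in full precision
    rest = flat.copy()
    rest[idx] = 0.0                    # remove outliers before scaling
    scale = float(np.abs(rest).max()) / 127.0 or 1.0  # absmax -> [-127, 127]
    q = np.round(rest / scale).astype(np.int8).reshape(W.shape)
    return q, scale, idx, outliers

def dequantize(q, scale, idx, outliers):
    W = q.astype(np.float32) * scale
    W.ravel()[idx] = outliers          # restore outliers exactly
    return W
```

With the outliers removed before computing the absmax, the int8 scale is no longer dominated by a few extreme entries, so the quantization error on the bulk of the matrix stays at most half a quantization step.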
