From 3fbdffb6c67a163c7bbcce429e3ee19d811a75c2 Mon Sep 17 00:00:00 2001
From: PENG Bo <33809201+BlinkDL@users.noreply.github.com>
Date: Wed, 13 Apr 2022 14:39:34 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 9314642..435355c 100644
--- a/README.md
+++ b/README.md
@@ -28,6 +28,12 @@ The pseudocode (execution from top to bottom):
 
 ![RWKV-v2-RNN](RWKV-v2-RNN.png)
 
+# Better Learning Rate Schedule via Variational Method of Loss Curve
+
+I propose a simple new method to find better LR schedules. The method is cost-efficient and practical for large LMs. The takeaway is that we can model the loss curve dynamics (phenomenology) w.r.t. the LR, and a nice closed-form LR curve can be computed directly from this model using a variational method. Moreover, we can predict the final loss with reasonable accuracy.
+
+![better_lr_schedule](Research/better_lr_schedule.png)
+
 # The top-p-x sampling method
 
 We propose a new sampling method called top-p-x: