diff --git a/README.md b/README.md
index 37e6b35..3d22b0a 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ alt="\begin{align*}
 * The Time-mix is similar to AFT (https://arxiv.org/abs/2105.14103). There are two differences.
-(1) We changed the softmax normalization. For masked language models, we define:
+(1) We changed the normalization (denominator). For masked language models, we define: