<th>Number of parameters</th><th>dimension</th><th>n heads</th><th>n layers</th><th>Learn rate</th><th>Batch size</th><th>n tokens</th>
@@ -100,7 +100,7 @@ We present our results on eight standard common sense reasoning benchmarks in th
<table>
<thead>
<tr>
-<th>LLaMa</th><th colspan=9>Reasoning tasks </th>
+<th>LLaMA</th><th colspan=9>Reasoning tasks </th>
</tr>
<tr>
<th>Number of parameters</th><th>BoolQ</th><th>PIQA</th><th>SIQA</th><th>HellaSwag</th><th>WinoGrande</th><th>ARC-e</th><th>ARC-c</th><th>OBQA</th><th>COPA</th>