Fix activation docstring math in log_softmax and gelu_approx #3590

Open
adityasingh2400 wants to merge 1 commit into ml-explore:main from adityasingh2400:fix-activation-docstring-math
Conversation

@adityasingh2400
Contributor

A couple of the activation docstrings in mlx.nn have incorrect math.

log_softmax documents the formula as $x + \log \sum_i e^{x_i}$, but the implementation subtracts the log-sum-exp (x - mx.logsumexp(...)). The documented sign is wrong. This corrects it to $x - \log \sum_i e^{x_i}$, which matches both the code and the definition of $\log(\text{softmax}(x))$.
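The sign fix can be checked numerically. The following is a minimal sketch in plain Python rather than MLX; the helper names `log_softmax_ref` and `log_of_softmax` are hypothetical and exist only for this illustration. It compares the corrected formula against a direct evaluation of $\log(\text{softmax}(x))$ and shows that the previously documented sign does not agree.

```python
import math


def log_softmax_ref(xs):
    """Corrected formula: x - log(sum_i exp(x_i)), on a list of floats."""
    m = max(xs)  # shift by the max so the exponentials cannot overflow
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def log_of_softmax(xs):
    """log(softmax(x)) computed directly from the definition."""
    total = sum(math.exp(x) for x in xs)
    return [math.log(math.exp(x) / total) for x in xs]


xs = [0.5, -1.0, 2.0]  # arbitrary sample input
corrected = log_softmax_ref(xs)
direct = log_of_softmax(xs)

# The formula as previously documented, with the wrong sign.
lse_plain = math.log(sum(math.exp(x) for x in xs))
old_doc = [x + lse_plain for x in xs]

print(all(abs(a - b) < 1e-12 for a, b in zip(corrected, direct)))
print(any(abs(a - b) > 1e-6 for a, b in zip(old_doc, direct)))
```

Exponentiating the corrected values gives probabilities that sum to 1, which the `+` variant cannot do because every one of its entries is larger than the corresponding input.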

gelu_approx has an extra opening parenthesis immediately after \text{Tanh} in its math block, so the LaTeX delimiters do not balance: the block contains three \left( and three \right) delimiters plus one stray bare (, and the expression renders incorrectly. The same block is duplicated in the GELU class docstring. Removing the stray parenthesis makes the grouping match the implementation 0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x**3))).
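To make the intended grouping concrete, here is a small sketch that evaluates the implementation expression quoted above in plain Python and compares it with the exact erf-based GELU. The function names `gelu_tanh_ref` and `gelu_exact_ref` are hypothetical stand-ins, not the MLX functions, and the tolerance used in the check is an assumption chosen to be loose.

```python
import math


def gelu_tanh_ref(x):
    """Tanh approximation, grouped exactly as in the quoted implementation."""
    inner = math.sqrt(2 / math.pi) * (x + 0.044715 * x**3)
    return 0.5 * x * (1 + math.tanh(inner))


def gelu_exact_ref(x):
    """Exact GELU, x * Phi(x), written with the error function."""
    return 0.5 * x * (1 + math.erf(x / math.sqrt(2)))


samples = [-3.0, -1.0, 0.0, 0.5, 2.0]  # arbitrary sample points
max_gap = max(abs(gelu_tanh_ref(x) - gelu_exact_ref(x)) for x in samples)

# Assumed loose tolerance; the two curves should be close at every sample.
print(max_gap < 1e-3)
```

With the grouping read this way, the cubic term sits inside the argument of tanh and the factor 0.5 * x multiplies the whole (1 + tanh(...)) term, which is what the balanced LaTeX is meant to express.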

These are documentation-only changes, so there is no effect on runtime behavior. I verified the corrected log_softmax formula against log(softmax(x)) numerically, confirmed the GELU LaTeX now balances and matches the code, and ran black/isort plus the activation tests in python/tests/test_nn.py.

The log_softmax docstring rendered the formula as x + log sum exp(x_i),
but the function subtracts the log-sum-exp, so the sign was wrong. Correct
it to x - log sum exp(x_i), which matches both the implementation and the
definition of log(softmax(x)).

The gelu_approx math block (and the duplicate in the GELU class docstring)
had an extra opening parenthesis right after Tanh, leaving the LaTeX
delimiters unbalanced so the expression rendered incorrectly. Remove the
stray parenthesis so the grouping matches the implementation.

These are documentation-only changes with no effect on runtime behavior.

Signed-off-by: Aditya Singh <[email protected]>