
fix: numerically unstable log-odds in ORPO loss #6407

Open

Mr-Neutr0n wants to merge 1 commit into hpcaitech:main from Mr-Neutr0n:fix/orpo-log-odds-numerical-stability

Conversation

@Mr-Neutr0n

Bug

The ORPO loss (OddsRatioLoss in applications/ColossalChat/coati/models/loss.py) computes log-odds using a numerically fragile pattern:

chosen_odds = chosen_logp - torch.log(-torch.exp(chosen_logp) + 1.0001)

This has two problems:

  1. Biased constant: The magic value 1.0001 shifts the result away from the mathematically correct value, introducing a systematic bias into the loss.
  2. NaN risk: When exp(logp) > 1.0001 (which can happen due to floating-point imprecision, especially in mixed-precision training), the argument to torch.log becomes negative, producing NaN and poisoning the training run.

The mathematically correct log-odds formula is log(p / (1 - p)) = log(p) - log(1 - p) = logp - log(1 - exp(logp)).
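A minimal standalone sketch of both failure modes (the helper name and the sample values are illustrative, not repository code):

```python
import torch

# Simplified stand-in for the current pattern in OddsRatioLoss.
def log_odds_original(logp: torch.Tensor) -> torch.Tensor:
    return logp - torch.log(-torch.exp(logp) + 1.0001)

# Bias: for p = 0.5 the true log-odds is exactly 0,
# but the 1.0001 offset shifts the result to roughly -2e-4.
print(log_odds_original(torch.log(torch.tensor(0.5))))

# NaN: if rounding (e.g. under fp16/bf16) pushes logp slightly above
# log(1.0001), the log argument goes negative and the result is nan.
print(log_odds_original(torch.tensor(1e-3)))
```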

Fix

  • Clamp logp to (-inf, -eps] so that exp(logp) is strictly less than 1, preventing both a zero and a negative argument to the log.
  • Replace torch.log(-torch.exp(logp) + 1.0001) with torch.log1p(-torch.exp(logp)), which is the numerically correct and unbiased way to compute log(1 - exp(logp)).

This eliminates both the NaN risk and the systematic bias from the hardcoded offset.
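
A minimal sketch of the proposed computation (the helper name and the eps value are illustrative; the actual change is in applications/ColossalChat/coati/models/loss.py):

```python
import torch

def log_odds_stable(logp: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # Clamp to (-inf, -eps] so exp(logp) stays strictly below 1 and the
    # argument of log1p stays in (-1, 0).
    logp = torch.clamp(logp, max=-eps)
    # log1p(-exp(logp)) == log(1 - exp(logp)) with no biased constant.
    return logp - torch.log1p(-torch.exp(logp))

# e.g. chosen_odds = log_odds_stable(chosen_logp)
#      reject_odds = log_odds_stable(reject_logp)
```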

@Mr-Neutr0n Mr-Neutr0n requested a review from a team as a code owner February 11, 2026 18:20