-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Description
Hi,
I’d like to report a potential logic issue in the "Generate Signals" section of 02_vectorized_backtest.ipynb, which is part of Chapter 8: The ML4T Workflow - From Model to Strategy Backtesting.
Location
Notebook: 08_ml4t_workflow/02_vectorized_backtest.ipynb
Section: Generate Signals
Current Code
long_signals = ((predictions .where(predictions > 0) .rank(axis=1, ascending=False) > N_LONG) .astype(int))
short_signals = ((predictions .where(predictions < 0) .rank(axis=1) > N_SHORT) .astype(int))
Issue
The current logic generates a signal of 1 for assets ranked below the top N, rather than for the top N predictions. However, based on the context and subsequent backtest logic, a 1 in the signal matrix should indicate a trade signal (i.e., to long/short that asset).
In descending ranking (ascending=False), smaller rank values correspond to higher predictions. Therefore, the condition > N_LONG incorrectly selects the lower-ranked (i.e., less bullish) assets.
Suggested Fix
To select the top N long signals (i.e., assets with highest predicted returns), the logic should be:
long_signals = ((predictions .where(predictions > 0) .rank(axis=1, ascending=False) <= N_LONG) .astype(int))
Similarly, for short signals (lowest predictions):
short_signals = ((predictions .where(predictions < 0) .rank(axis=1, ascending=True) <= N_SHORT) .astype(int))
Why It Matters
Downstream code such as:
long_returns = long_signals.mul(fwd_returns).mean(axis=1)
short_returns = short_signals.mul(-fwd_returns).mean(axis=1)
assumes that a 1 in the signal matrix corresponds to an active long/short position. The current logic would instead assign 1 to lower-ranked predictions, which may significantly degrade the backtest result or produce misleading performance statistics.
The same logic appears in the paperback book as well.
Thanks again for this excellent open-source resource! This clarification could help future readers better align signal generation with the intended trading logic.
Please let me know your thoughts on this.
Best,
Saining Zhang