10 Backtesting Mistakes That Will Wreck Your Portfolio

📖 ~1,800 words · Intermediate to Advanced · Updated 2026-05-20

Backtesting is the discipline of testing a trading strategy on historical data to see how it would have performed. Done correctly, it's the single most powerful tool a quantitative trader has. Done incorrectly — which is how most retail traders do it — it's a recipe for catastrophic real-money losses.

Almost any strategy can be made to look good in a backtest. That's the trap. The hard work is figuring out whether the historical performance reflects a real edge or a statistical artifact. Below are the ten most expensive mistakes, ranked by frequency in retail trading systems.

1. Lookahead Bias

The most common, most damaging mistake. You accidentally use information in your decision logic that wasn't available at the time of the trade.

Example: Your strategy buys stocks that "closed above the 200-day SMA." But you compute the SMA using today's close and then "decide" to buy at today's open. You used information that only exists at the end of the day (today's close) to make a decision at the start of it.

✅ Fix: Lag your signal by one bar. Compute SMA using yesterday's close. Decide to buy. Execute at today's open. Walking through a strict timeline is essential.
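The one-bar lag can be sketched in a few lines of plain Python. This is a minimal illustration, not a full backtester: the function names `sma` and `lagged_signals`, the 3-bar window, and the price list are all made up for the example.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def lagged_signals(closes, window):
    """signals[t] is True if the close on bar t-1 was above the SMA through bar t-1.

    Nothing from bar t itself is used, so the signal can be acted on at bar t's open.
    """
    averages = sma(closes, window)
    signals = [False] * len(closes)
    for t in range(1, len(closes)):
        prev_avg = averages[t - 1]
        signals[t] = prev_avg is not None and closes[t - 1] > prev_avg
    return signals


closes = [10, 11, 12, 11, 13, 14]  # made-up daily closes
signals = lagged_signals(closes, window=3)
print(signals)
```

A quick sanity check for any signal function: change the last close to an absurd value and confirm that the signal for the last bar does not move. If it does, the signal is reading data it could not have had.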

2. Survivorship Bias

Your historical dataset only contains companies that still exist today. All the bankruptcies, delistings, and acquisitions are missing.

Example: You backtest "buy low P/E stocks in the S&P 500" using today's S&P 500 constituents. But Lehman Brothers, Bear Stearns, Enron, and Sears aren't in there. The system looks fantastic in 2008-2009 because every catastrophic failure is invisible.

✅ Fix: Use a point-in-time universe: the actual stocks that were in the index on each historical date. CRSP and Compustat databases provide this. If you can't afford that, at least haircut the returns of long-biased backtests by 10-30% as a rough rule of thumb.
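As a rough sketch of what a point-in-time universe lookup involves, assume you have membership records with an add date and an optional removal date. The tickers, dates, and the `members_on` helper below are hypothetical and do not reflect any vendor's schema.

```python
from datetime import date

# Hypothetical membership records: (ticker, date_added, date_removed or None)
MEMBERSHIP = [
    ("AAA", date(2000, 1, 3), None),
    ("BBB", date(2001, 6, 1), date(2008, 9, 15)),  # dropped after failing
    ("CCC", date(2010, 3, 1), None),
]


def members_on(membership, as_of):
    """Return the tickers that were in the index on `as_of`."""
    return sorted(
        ticker
        for ticker, added, removed in membership
        if added <= as_of and (removed is None or as_of < removed)
    )


print(members_on(MEMBERSHIP, date(2007, 1, 2)))  # includes the later failure
print(members_on(MEMBERSHIP, date(2015, 1, 2)))  # today's survivors only
```

A backtest that calls a lookup like this on every rebalance date trades the failed company while it was still a member, which is exactly what a survivors-only dataset hides.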

3. Overfitting (Curve-Fitting)

You tune your strategy parameters until they produce great historical results. The numbers are perfect for the past. The strategy has zero predictive power for the future.

Example: You test RSI(14, 30, 70) — okay results. So you try RSI(11, 28, 72) — better. RSI(13, 27, 73) — even better. After 50 tweaks, you find RSI(9, 24.5, 71.2) gives 40% annual return. It's overfit nonsense.

✅ Fix: (1) Use walk-forward analysis: train on years 1-3, test on year 4 (out-of-sample). Move forward. (2) Reduce parameter count — fewer knobs to twiddle. (3) If results are highly sensitive to tiny parameter changes, you're overfitting.
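The walk-forward schedule from point (1) can be generated mechanically. The sketch below assumes fixed-length rolling windows; `walk_forward_splits` is a name invented for this example, and expanding windows are an equally valid choice.

```python
def walk_forward_splits(periods, train_size, test_size):
    """Yield (train, test) slices of `periods` that roll forward in time.

    Every test slice starts right after its training slice ends, so the
    parameters are always evaluated on data they were not fitted to.
    """
    start = 0
    while start + train_size + test_size <= len(periods):
        split = start + train_size
        yield periods[start:split], periods[split : split + test_size]
        start += test_size


years = [1, 2, 3, 4, 5, 6]
splits = list(walk_forward_splits(years, train_size=3, test_size=1))
for train, test in splits:
    print("train on", train, "-> test on", test)
```

Only the test slices count toward the reported performance. Stitching them together gives an out-of-sample track record for the whole period.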

4. Ignoring Transaction Costs

Real trading costs money: commissions, the bid-ask spread, and slippage on large orders. If each trade turns over the full account, a strategy that trades 50 times a month at 0.05% per trade gives up 2.5% per month to costs, or roughly 30% a year before compounding.

Example: You backtest a fast mean-reversion strategy that makes 100 round trips a year. The backtest shows +15% annually. After a realistic 0.1% cost per round trip, you're at +5%. After slippage in volatile markets, perhaps +1%. That is probably below what SPY returned.

✅ Fix: Always model transaction costs. Add at minimum 5-10 basis points per trade for liquid US large caps; more for small caps or international. Test how sensitive your edge is to cost assumptions.
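A cost-sensitivity check can be as simple as the sketch below. It assumes a linear cost model in which every round trip turns over the full account and compounding is ignored; the function names are illustrative, and the 15% and 100-round-trip inputs mirror the example above.

```python
def net_annual_return(gross_return, round_trips_per_year, cost_per_round_trip):
    """Gross annual return minus a simple linear cost drag."""
    return gross_return - round_trips_per_year * cost_per_round_trip


def breakeven_cost(gross_return, round_trips_per_year):
    """Cost per round trip at which the whole edge is consumed."""
    return gross_return / round_trips_per_year


for cost_bps in (0, 5, 10, 20):
    net = net_annual_return(0.15, 100, cost_bps / 10_000)
    print(f"{cost_bps:>3} bps per round trip -> net {net:+.1%}")

print(f"breakeven: {breakeven_cost(0.15, 100) * 10_000:.0f} bps per round trip")
```

If the breakeven cost sits close to your realistic cost estimate, the edge is too thin to survive a bad fill.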

5. Look-Ahead in Fundamentals (Restated Data)

Quarterly earnings reports are often restated months later as auditors catch errors. Databases store the final restated number. Your backtest "sees" the corrected figure on the original earnings date.

✅ Fix: Use "point-in-time" (PIT) fundamentals, which record what was actually known on each date; data providers usually label these datasets PIT. If unavailable, lag earnings data by at least 3 months to be safe.
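One conservative way to apply the fallback lag is sketched below. Two assumptions are baked in and are not from any provider's documentation: the lag is counted from the fiscal period end, and three months is approximated as 90 days. The helper names and the earnings figures are made up.

```python
from datetime import date, timedelta


def usable_from(period_end, lag_days=90):
    """First date on which a figure for `period_end` may be used."""
    return period_end + timedelta(days=lag_days)


def latest_usable(records, as_of, lag_days=90):
    """Most recent value the backtest is allowed to see on `as_of`.

    `records` is a list of (period_end, value) pairs. Returns None when
    no figure has cleared the lag yet.
    """
    usable = [rec for rec in records if usable_from(rec[0], lag_days) <= as_of]
    if not usable:
        return None
    return max(usable, key=lambda rec: rec[0])[1]


# Hypothetical quarterly EPS figures keyed by fiscal period end
eps = [(date(2023, 3, 31), 1.0), (date(2023, 6, 30), 2.0)]
print(latest_usable(eps, date(2023, 7, 15)))  # Q2 has not cleared the lag yet
```

The lag costs some timeliness, but it guarantees the backtest never trades on a number before it plausibly existed.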

6. Ignoring Position Sizing and Capital

Your backtest assumes you always trade the same dollar amount. In reality, returns compound: a 50% drawdown in year 1 halves the capital that every later return is earned on, and it takes a 100% gain just to get back to even.

Example: A strategy returns +100%, -50%, +100%, -50%. The arithmetic average return is +25% per period. The actual compounded result: $100 × 2 × 0.5 × 2 × 0.5 = $100. Zero return.

✅ Fix: Report geometric (compound) returns, not arithmetic averages. Report max drawdown alongside total return. Use Kelly Criterion or fractional Kelly for position sizing.
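The gap between the two averages is easy to verify. The sketch below reuses the +100%, -50%, +100%, -50% sequence from the example; the function names are illustrative.

```python
def arithmetic_mean(returns):
    """Plain average of per-period returns."""
    return sum(returns) / len(returns)


def geometric_mean(returns):
    """Constant per-period return that produces the same final equity."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(returns)) - 1.0


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve (as a negative number)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


returns = [1.0, -0.5, 1.0, -0.5]
print(arithmetic_mean(returns), geometric_mean(returns), max_drawdown(returns))
```

The arithmetic mean says +25% per period, the geometric mean says 0%, and the max drawdown says you sat through a 50% loss twice to get there.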

7. Selection Bias in Your Universe

You backtest on "tech stocks" or "high beta names" because they had great historical runs. The selection of the universe itself contains hindsight.

Example: "AI stocks" defined retrospectively as winners (NVDA, MSFT, GOOGL). Of course they performed well — they're the survivors of a much larger group of "AI plays."

✅ Fix: Define your universe by rule, not by name. E.g., "all stocks with market cap > $1B in the technology sector as of date X." Apply the same rule consistently across time.

8. Ignoring Regime Changes

Markets behave fundamentally differently in different regimes. A strategy that worked in 2015-2019 (low-vol bull market) may fail in 2022 (high-vol bear) or 2020 (regime transition).

✅ Fix: Test your strategy in multiple distinct regimes: bull (2017), bear (2008, 2022), high-vol (2020), low-vol (2017). If it only works in one regime, it's brittle. Hidden Markov Models can help detect regime shifts.

9. Data Snooping / P-Hacking

You test 1,000 strategies. At a 5% false-positive rate, about 50 will look great by random chance alone, even if none has any real edge. You pick the best one and claim to have "discovered alpha."

Example: Anyone who backtests enough random rule combinations will find a "strategy" that returned 80% historically, by sheer coincidence.

✅ Fix: Use multiple-comparison adjustments (Bonferroni). Insist on out-of-sample validation. Better: have an economic thesis before testing, not after.
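To see the effect of the Bonferroni adjustment, the sketch below simulates 1,000 strategies with no real edge. It assumes a well-calibrated test, so that p-values under the null are uniform on [0, 1]; the function names and the random seed are arbitrary choices for this example.

```python
import random


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate near alpha."""
    return alpha / n_tests


def significant(p_values, threshold):
    """Indices of the tests whose p-value falls below the threshold."""
    return [i for i, p in enumerate(p_values) if p < threshold]


rng = random.Random(42)
p_values = [rng.random() for _ in range(1000)]  # 1,000 strategies, none with an edge

naive = significant(p_values, 0.05)
corrected = significant(p_values, bonferroni_threshold(0.05, len(p_values)))
print(len(naive), "pass at 0.05;", len(corrected), "pass after Bonferroni")
```

Bonferroni is deliberately strict and can reject real but modest edges, which is one more reason to pair it with out-of-sample validation rather than rely on it alone.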

10. Trusting a Single Backtest

One backtest run is one realization of what could have happened. Markets are stochastic, so the specific historical path you tested on was partly luck.

✅ Fix: Use bootstrap resampling or Monte Carlo simulation. Generate 1,000 alternative histories that share the same statistical properties. Look at the distribution of outcomes: not just the median, but also the 5th and 95th percentiles. If a strategy returns 30% on average but its 5th percentile is -40%, that's risky.
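A bare-bones bootstrap looks like the sketch below. It resamples per-period returns independently with replacement, which preserves their overall distribution but discards any autocorrelation or volatility clustering; a block bootstrap is the usual remedy when that matters. The monthly returns are made up, and `bootstrap_total_returns` and `percentile` are names invented for this example.

```python
import random


def bootstrap_total_returns(period_returns, n_paths=1000, seed=0):
    """Total compounded return of each resampled history, sorted ascending."""
    rng = random.Random(seed)
    totals = []
    for _path in range(n_paths):
        growth = 1.0
        for _step in range(len(period_returns)):
            growth *= 1.0 + rng.choice(period_returns)  # draw with replacement
        totals.append(growth - 1.0)
    return sorted(totals)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    index = round(pct / 100 * (len(sorted_values) - 1))
    return sorted_values[index]


# Made-up monthly returns from a single backtest
monthly = [0.04, -0.02, 0.03, -0.05, 0.06, 0.01, -0.01, 0.02, -0.03, 0.05, 0.00, 0.02]
totals = bootstrap_total_returns(monthly)
p5, p50, p95 = (percentile(totals, p) for p in (5, 50, 95))
print(f"5th: {p5:+.1%}  median: {p50:+.1%}  95th: {p95:+.1%}")
```

The width of the gap between the 5th and 95th percentiles shows how much of the single backtest result could be down to the ordering and luck of that one path.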

The Backtest Checklist Before Going Live

Before risking real money on any backtested strategy:

Has it been tested with walk-forward (not just in-sample) data?
Are realistic transaction costs included?
Is the universe selection rule-based, applied point-in-time?
Does it work across at least 2 distinct market regimes?
Is max drawdown manageable for your psychology?
Have you tested with different starting dates (path dependency)?
Have you adjusted for the number of strategies tested (multiple-comparisons)?
Do you have an economic reason the edge exists? (Behavioral, structural, informational, etc.)
Have you paper-traded it for at least 3 months matching live conditions?
Have you sized positions so a 50% drawdown wouldn't destroy you mentally or financially?

If you can't check off at least 8 of 10, don't risk significant capital.

Try Robust Backtesting

10X Rock's Backtester implements SMA Cross and RSI Mean Reversion strategies against historical data with standardized transaction-cost assumptions. Use it as a starting point — and then run the strategy on out-of-sample data before committing real money.
