Evaluating an Expert Advisor properly requires moving through four distinct phases, each designed to answer a specific question. Skipping phases — which most traders do — is how you end up deploying strategies that looked great on paper and fail in practice.
This framework applies whether you're evaluating an EA you built yourself, one you purchased, or one you're considering from a third-party vendor.
Phase 1: Backtest Quality Assessment
Before you evaluate the strategy's performance, you need to verify that the backtest results are credible. A high-quality backtest on a poor strategy tells you something useful. A low-quality backtest on any strategy tells you nothing.
- FAIL = STOP: Modeling quality ≥ 90%. MT4/MT5 reports this figure in the backtest report. Below 90%, tick data reconstruction is poor and results are unreliable.
- FAIL = STOP: Minimum 200 trades. With fewer trades, and especially below 100, statistical significance is too low to draw meaningful conclusions.
- WARN: Test period covers at least 2–3 years. Short test periods may capture only one market regime. Multiple years ensure exposure to varying conditions.
- WARN: Realistic spread settings used. Verify that the spread used in the backtest matches your broker's actual typical spread, not an optimistic fixed value.
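As a rough illustration, the Phase 1 gates can be written down as a small check. This is only a sketch: the `BacktestSummary` fields and the `phase1_check` function are hypothetical names, and the values would have to be copied by hand from your own MT4/MT5 report. The thresholds are the ones from the checklist above, with the lower bound of the 2–3 year range used for the test period.

```python
from dataclasses import dataclass


@dataclass
class BacktestSummary:
    """Hypothetical container; fill it in from your own backtest report."""
    modeling_quality_pct: float        # modeling quality as reported by MT4/MT5
    trade_count: int                   # total closed trades in the backtest
    test_years: float                  # length of the tested period in years
    backtest_spread_points: float      # spread assumed in the backtest
    typical_live_spread_points: float  # your broker's typical spread


def phase1_check(s: BacktestSummary) -> dict:
    """Apply the Phase 1 gates: any FAIL means stop, WARN means investigate."""
    fails, warns = [], []
    if s.modeling_quality_pct < 90.0:
        fails.append("modeling quality below 90%")
    if s.trade_count < 200:
        fails.append("fewer than 200 trades")
    if s.test_years < 2.0:
        warns.append("test period shorter than 2 years")
    if s.backtest_spread_points < s.typical_live_spread_points:
        warns.append("backtest spread tighter than typical live spread")
    return {"stop": bool(fails), "fails": fails, "warns": warns}


report = BacktestSummary(
    modeling_quality_pct=99.0,
    trade_count=350,
    test_years=3.0,
    backtest_spread_points=12.0,
    typical_live_spread_points=12.0,
)
print(phase1_check(report))
```

If `stop` comes back `True`, there is little point in looking at performance numbers until the backtest itself is rerun with better data or a longer period.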
Phase 2: Performance Metrics Review
With a credible backtest in hand, evaluate the core performance metrics. These are the numbers that predict whether the strategy has a repeatable edge.
- KEY METRIC: Profit Factor between 1.4 and 2.5. Below 1.4 = thin margin of safety. Above 2.5 = likely overfitting.
- FAIL = STOP: Positive expectancy per trade. Average expected gain per trade after costs. Must be positive. If negative, no position sizing can fix the strategy.
- FAIL = STOP: Max drawdown × 2.5 is survivable. Live drawdown typically exceeds backtest by 1.5–3×. Can you psychologically and financially handle 2.5× the backtest max DD?
- WARN IF LOW: Recovery Factor above 2.0. Net profit divided by max drawdown. Measures how efficiently the strategy generates returns relative to risk taken.
- WARN: Performance is consistent across months. Break down results by month. If most of the profit came from 1–2 exceptional months, the edge may not be repeatable.
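The metrics above are simple to compute yourself if you can export the trade list. The following is a minimal sketch, assuming `trades` is a list of per-trade net profit or loss in account currency, in chronological order and already net of spread, commission, and swap. The function names and the sample trades are illustrative only. Drawdown is measured here in absolute currency terms from the running equity peak, which may differ from how your platform reports it.

```python
def profit_factor(trades):
    """Gross profit divided by gross loss."""
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def expectancy(trades):
    """Average net result per trade."""
    return sum(trades) / len(trades)


def max_drawdown(trades):
    """Largest peak-to-trough decline of cumulative profit."""
    equity = peak = worst = 0.0
    for t in trades:
        equity += t
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def recovery_factor(trades):
    """Net profit divided by max drawdown."""
    dd = max_drawdown(trades)
    return float("inf") if dd == 0 else sum(trades) / dd


# Made-up sample, far too small for a real evaluation.
trades = [100, -50, 80, -60, -40, 120, 90, -30]

pf = profit_factor(trades)
dd = max_drawdown(trades)
print(f"profit factor:   {pf:.2f} (target band 1.4 to 2.5)")
print(f"expectancy:      {expectancy(trades):.2f} per trade (must be > 0)")
print(f"max drawdown:    {dd:.2f}")
print(f"stress drawdown: {dd * 2.5:.2f} (2.5x backtest max DD)")
print(f"recovery factor: {recovery_factor(trades):.2f} (warn below 2.0)")
```

For this sample the profit factor is about 2.17, expectancy is 26.25 per trade, max drawdown is 100, and the recovery factor is 2.1, so the drawdown you would need to be able to survive is 250.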
Phase 3: Robustness Testing
A strategy that passes the performance review needs to prove it isn't just curve-fitted to its test data. Robustness testing applies stress to the strategy to see how it holds up.
- CRITICAL: Out-of-sample test passes. OOS profit factor should be at least 70% of in-sample PF. Test on data the strategy never saw during optimization.
- IMPORTANT: Spread stress test: PF still above 1.0 at 2× spread. Re-run with double the original spread assumption. A strategy that breaks on minor spread increases has a thin edge.
- IMPORTANT: Parameter stability: small changes don't collapse results. Vary key parameters by ±10–20%. Results should degrade gradually, not catastrophically.
- RECOMMENDED: Monte Carlo worst-case drawdown is acceptable. Run a Monte Carlo simulation. If the drawdown reached in the worst 5% of simulated runs (the 95th percentile of drawdown) exceeds your tolerance, reduce position size accordingly.
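Two of these checks are easy to sketch in code. The example below assumes the same kind of input as a trade export: a chronological list of per-trade net results. It uses trade-order shuffling as the Monte Carlo method, which is only one of several common approaches (resampling with replacement is another), and a nearest-rank percentile. All function names are hypothetical, and the sample trades are invented.

```python
import math
import random


def max_drawdown(trades):
    """Largest peak-to-trough decline of cumulative profit."""
    equity = peak = worst = 0.0
    for t in trades:
        equity += t
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(trades, runs=1000, seed=42):
    """Shuffle the trade order many times; return sorted max drawdowns."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(runs):
        shuffled = list(trades)
        rng.shuffle(shuffled)
        results.append(max_drawdown(shuffled))
    return sorted(results)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def oos_retention(in_sample_pf, out_of_sample_pf):
    """Share of in-sample profit factor retained out of sample."""
    return out_of_sample_pf / in_sample_pf


trades = [100, -50, 80, -60, -40, 120, 90, -30]
drawdowns = monte_carlo_drawdowns(trades)
print("95th percentile drawdown:", percentile(drawdowns, 95))
print("OOS check passes:", oos_retention(2.0, 1.5) >= 0.70)
```

The 95th percentile here is the drawdown that only the worst 5% of shuffled sequences exceed. Comparing that figure with the single backtest drawdown shows how much of the historical result depended on the order in which wins and losses happened to arrive.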
Phase 4: Forward Testing & Live Deployment
A strategy that passes all three previous phases has earned the right to be forward tested. This phase is not about validation anymore — it's about calibration to real execution conditions before full capital is committed.
- STEP 1: Forward test on demo for 4–8 weeks minimum. Demo testing confirms the EA executes correctly and trades in line with backtest behavior. Minimum 30–50 trades if possible.
- STEP 2: Go live at 25% of intended position size. The first 2–3 months live is a calibration phase. Verify live performance matches forward test and backtest expectations before scaling.
- ESSENTIAL: Set a defined drawdown stop rule in advance. Decide before going live: "If drawdown exceeds X%, I stop and reassess for Y days." Write it down. The decision should not be made in the moment.
- DISCIPLINE: Scale up only after 3+ months of consistent live results. Don't scale position size until live performance has confirmed the backtest expectations across multiple market conditions.
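Writing the stop rule and the scaling rule down as code is one way to make sure they are decided in advance. The sketch below is an assumption-laden illustration: the function names are hypothetical, the 6% threshold is an arbitrary stand-in for your own X, and jumping straight from 25% to full size after three months is a simplification (scaling up in steps would be equally consistent with the checklist).

```python
def current_drawdown_pct(equity_curve):
    """Percentage decline of the latest equity value from the running peak."""
    peak = max(equity_curve)
    return (peak - equity_curve[-1]) / peak * 100


def should_stop(equity_curve, max_drawdown_pct):
    """True when the pre-agreed drawdown limit (your X%) has been exceeded."""
    return current_drawdown_pct(equity_curve) > max_drawdown_pct


def live_position_size(intended_size, months_live, results_consistent):
    """Trade at 25% of intended size until 3+ months of consistent results."""
    if months_live >= 3 and results_consistent:
        return intended_size
    return intended_size * 0.25


# Invented account balances recorded at the end of each week.
equity = [10000, 10400, 10100, 9700]

print(f"current drawdown: {current_drawdown_pct(equity):.2f}%")
print("stop and reassess:", should_stop(equity, max_drawdown_pct=6.0))
print("size in month 1:", live_position_size(1.0, 1, True))
print("size in month 4:", live_position_size(1.0, 4, True))
```

With these sample balances the account is about 6.73% below its peak, so a 6% rule would trigger the stop.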
Phase 1 filters bad data. Phase 2 filters bad strategies. Phase 3 filters lucky strategies. Phase 4 calibrates good strategies to reality. Most traders skip directly to Phase 4 after Phase 2 — which is why most deployments underperform their backtests.
One Final Check: Can You Explain the Edge?
Before committing any capital, ask yourself: in two sentences, what market behavior or inefficiency does this strategy exploit? If the answer is "the parameters that optimized best happened to be these values," that's not an explanation — it's a description of data mining.
A strategy with a genuine edge has a logical reason to work: it captures trend continuation after breakouts, it exploits mean reversion during low-volatility sessions, or it takes advantage of consistent price behavior around key news events. The logic doesn't guarantee the strategy will work, but the absence of logic is a genuine red flag.
The right way to evaluate an EA is thorough, sometimes tedious, and frequently results in rejecting strategies you wanted to like. That's exactly the point. The process is designed to be harder to pass than most strategies deserve to pass — because the cost of deploying a genuinely bad strategy with real money is always higher than the cost of extra caution.
Start With Phase 1 Right Now
EA Analyzer Pro extracts modeling quality, trade count, drawdown, profit factor, and expectancy from your MT4/MT5 backtest report in seconds — free, no install needed.