| Sample window | Start date, end date, market, timeframe, and rule version are recorded. | Old and new results get mixed together. | Split the sample or restart the benchmark. |
| Trade inventory | Entries, exits, skips, missed trades, and exclusions are counted. | The sample overweights active trades and hides restraint. | Add missing records before judging the result. |
| Rule fit | Each paper entry maps to a written setup and invalidation. | Outcome bias makes weak decisions look planned. | Tag rule breaks and review them separately. |
| Risk behavior | Paper size, max drawdown, exposure, and stop distance stay inside the plan. | Positive paper PnL hides unsafe process drift. | Reduce simulated size or tighten controls. |
| Outlier review | The largest winner and loser are isolated and explained. | One unusual result controls the conclusion. | Report normalized and raw views. |
| Market context | Trend, volatility, news, liquidity, and broad regime are labeled. | The rule looks stronger than it is because the sample covers only one favorable regime. | Collect more context or segment the benchmark. |
| Journal completeness | Another reviewer can understand thesis, trigger, invalidation, and exit reason. | The review depends on memory instead of evidence. | Improve journal fields before changing rules. |
| One next action | The review chooses one change, continuation, or stop decision. | The next sample cannot be compared fairly. | Document one owner and one follow-up date. |
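
The outlier review row can be sketched in code. The example below is a minimal illustration, not a prescribed method. It assumes that "normalized" means the result with the single largest winner and single largest loser removed, and that PnL is recorded as one number per paper trade. The `PaperTrade` class, the `outlier_views` function, and the sample values are all hypothetical and made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class PaperTrade:
    label: str   # hypothetical journal identifier
    pnl: float   # simulated PnL per trade (unit is an assumption, e.g. R-multiples)


def outlier_views(trades):
    """Return raw and trimmed summaries, isolating the largest winner and loser.

    "Trimmed" here is one possible reading of "normalized": the same sample
    with the single best and single worst trade removed.
    """
    if len(trades) < 3:
        raise ValueError("need at least three trades to isolate both outliers")

    ordered = sorted(trades, key=lambda t: t.pnl)
    largest_loser, largest_winner = ordered[0], ordered[-1]
    core = ordered[1:-1]  # everything except the two extremes

    raw_total = sum(t.pnl for t in trades)
    trimmed_total = sum(t.pnl for t in core)
    return {
        "largest_winner": largest_winner.label,
        "largest_loser": largest_loser.label,
        "raw_total": raw_total,
        "raw_mean": raw_total / len(trades),
        "trimmed_total": trimmed_total,
        "trimmed_mean": trimmed_total / len(core),
    }


# Made-up sample, for illustration only.
sample = [
    PaperTrade("T1", 0.5),
    PaperTrade("T2", -1.0),
    PaperTrade("T3", 4.0),
    PaperTrade("T4", 0.5),
    PaperTrade("T5", -0.5),
]
views = outlier_views(sample)
print(views)
```

If the raw and trimmed totals point in different directions, or the trimmed total is a small fraction of the raw one, that suggests one unusual result is controlling the conclusion and both views belong in the review.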