| Sample boundary | Was the sample closed by a planned date or decision count? | Start date, end date, version, and inclusion rule are visible. | Close the sample and label exclusions before review. |
| Prompt version | Can every decision be tied to one prompt and rule set? | Rows include the same version label. | Split mixed rows into separate samples. |
| Entry evidence | Do entries name setup, thesis, invalidation, and paper risk? | Each entry can be reviewed on its stated reasoning, before the outcome biases the judgment. | Tighten the output format before another sample. |
| Skip evidence | Do skipped trades name the blocked condition? | Skips cite unclear data, risk, invalidation, exposure, or setup mismatch. | Add a required skip-reason field. |
| Risk behavior | Did the agent respect paper size, exposure, drawdown, and pause limits? | Every entry and every skip stays within the risk limits. | Run the risk control workflow. |
| Outcome review | Are winners and losers judged by rule fit before PnL? | The review separates process quality from simulated result. | Rewrite the review note before changing rules. |
| Next action | Is the next action narrow, paper-first, and testable? | The action is exactly one of: a single rule change, more data collection, a lower paper size, or retirement. | Reject broad prompt rewrites from thin evidence. |
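
Some of these checks are mechanical enough to run before a human reads the sample. The sketch below covers three rows of the table (prompt version, entry evidence, skip evidence). It assumes a hypothetical `Decision` record; the field names, the `"enter"`/`"skip"` action labels, and the skip-reason vocabulary are illustrative assumptions, not a schema defined by this checklist.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical field names and reason labels, chosen to mirror the table rows.
ENTRY_FIELDS = ("setup", "thesis", "invalidation", "paper_risk")
SKIP_REASONS = {"unclear data", "risk", "invalidation", "exposure", "setup mismatch"}


@dataclass
class Decision:
    prompt_version: str
    action: str  # assumed to be "enter" or "skip"
    setup: Optional[str] = None
    thesis: Optional[str] = None
    invalidation: Optional[str] = None
    paper_risk: Optional[float] = None
    skip_reason: Optional[str] = None


def review_sample(decisions):
    """Return (row_index, problem) tuples; an empty list means these three checks pass."""
    problems = []

    # Prompt version: every row should carry the same version label.
    versions = {d.prompt_version for d in decisions}
    if len(versions) > 1:
        problems.append((None, f"mixed prompt versions: {sorted(versions)}"))

    for i, d in enumerate(decisions):
        if d.action == "enter":
            # Entry evidence: setup, thesis, invalidation, and paper risk must be present.
            missing = [name for name in ENTRY_FIELDS if getattr(d, name) is None]
            if missing:
                problems.append((i, "entry missing: " + ", ".join(missing)))
        elif d.action == "skip":
            # Skip evidence: the skip must name a recognized blocked condition.
            if d.skip_reason not in SKIP_REASONS:
                problems.append((i, "skip lacks a recognized reason"))
        else:
            problems.append((i, f"unknown action: {d.action}"))
    return problems


sample = [
    Decision("v1", "enter", "breakout", "trend continuation", "close below range", 1.0),
    Decision("v2", "skip", skip_reason="exposure"),
    Decision("v1", "skip"),
]
for row, problem in review_sample(sample):
    print(row, problem)
```

A row index of `None` marks a sample-level problem, such as mixed versions, which the table resolves by splitting the rows into separate samples. Risk behavior, outcome review, and next action are left out because they depend on limits and judgment that this sketch does not model.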