| Version evidence | Can the sample be tied to one prompt, persona, rule, and market frame? | Every decision has a version label and review window. | The sample mixes unlabelled prompt or rule changes. |
| Rule fit | Did entries and skips follow the written setup and invalidation rules? | Journal notes cite the rule before the outcome is known. | The agent explains decisions after seeing the result. |
| Risk behavior | Did paper position size, drawdown, and exposure stay inside the written limits? | Risk checks pass on both winners and losers. | Paper gains depend on breaking the risk rule. |
| Skip discipline | Did the agent document when it did nothing? | Each skip records the condition that blocked entry and the next review question. | Only entries are logged, so discipline is invisible. |
| Journal quality | Can a reviewer audit thesis, invalidation, result, mistake tag, and next action? | The journal contains structured fields for each decision. | The record contains confident summaries without evidence. |
| Sample quality | Is the sample large enough and complete enough to inspect? | Entries, exits, skips, misses, dates, and exclusions are visible. | Only selected winners or exciting trades are included. |
| Change decision | Is the next action narrow and testable? | The review chooses either one change or more paper-trade collection. | The review rewrites many variables from one small sample. |
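
The journal and version rows above lend themselves to a mechanical check. The following is a minimal sketch in Python, assuming a hypothetical `JournalEntry` record; every field name, the `missing_fields` and `audit` helpers, and the sample values are illustrative assumptions, not a schema defined by this document.

```python
from dataclasses import dataclass, fields


@dataclass
class JournalEntry:
    # Hypothetical schema: field names are assumptions chosen to mirror the rubric rows.
    version_label: str  # ties the decision to one prompt, persona, rule, and market frame
    decision: str       # "entry" or "skip"
    rule_cited: str     # rule named before the outcome is known
    thesis: str
    invalidation: str
    result: str
    mistake_tag: str    # use "none" when no mistake was tagged
    next_action: str


def missing_fields(entry: JournalEntry) -> list[str]:
    """Return the names of fields left blank in one journal entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


def audit(entries: list[JournalEntry]) -> dict:
    """Summarize a few rubric signals for a sample of journal entries."""
    return {
        # Entries a reviewer could not fully audit.
        "incomplete": sum(1 for e in entries if missing_fields(e)),
        # Zero here would mean skip discipline is invisible.
        "skips_logged": sum(1 for e in entries if e.decision == "skip"),
        # More than one label means the sample mixes versions.
        "versions": sorted({e.version_label for e in entries if e.version_label.strip()}),
    }


# Made-up sample data, for illustration only.
sample = [
    JournalEntry("v1", "entry", "setup-A", "breakout above range",
                 "close back inside range", "stopped out", "none", "keep collecting"),
    JournalEntry("", "skip", "setup-A", "no valid setup",
                 "n/a", "no trade", "none", "recheck at next review"),
]

print(missing_fields(sample[1]))
print(audit(sample))
```

A check like this only confirms that the fields are present and labelled. Whether a cited rule was actually followed, or a thesis was written before the result was known, still needs a human reviewer.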