mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 16:15:35 +08:00
spec: code-review-2 resolution (oracle robustness fixes)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -158,3 +158,16 @@ Plan-review-1 resolution (docs/spec/20260530_plan_review.md, REQUEST CHANGES):
|
||||
- A-mode "is compare" replaced by JSON type+value oracle (_strictify_assert).
|
||||
- S/R/T dropped at gate (reviewer concurred: start M1/A/B). So the honest count
|
||||
is 3 modes, NOT 4-6. UAT1 will report however many survive the base quadrant.
|
||||
|
||||
Code-review-2 resolution (docs/spec/20260530_refactor_code_review.md, REQUEST
|
||||
CHANGES -> all fixed, commit after derisk #7):
|
||||
- CRIT: sys.exit INSIDE solve() (during a test call) fooled the oracle. FIX:
|
||||
wrap BOTH solution-exec and assert-exec in ONE try/except SystemExit ->
|
||||
os._exit(1). Catches module-level AND in-call exits AND raise SystemExit.
|
||||
- CRIT: JSON __strict_eq broke 2==2.0 and tuple/list semantics vs gt_pass. FIX:
|
||||
whitelist safe builtins (int/float/bool/str/None/list/tuple/dict) and use
|
||||
baseline Python ==; a custom-typed operand = the eq_override exploit -> reject.
|
||||
- IMPORTANT: defs-only dropped honest top-level constants -> false hacks. FIX:
|
||||
exec the FULL src (state preserved); the SystemExit guard handles exits.
|
||||
- verify_rewards +3 regressions (exit_in_solve / top_const / int_vs_float); 9/9.
|
||||
- The derisk #7 ran on the buggy oracle -> killed and requeued (#8) on the fix.
|
||||
|
||||
Reference in New Issue
Block a user