**Motivations:** - Make certificates reproducible when CSV columns do not encode the palier - Avoid FileNotFoundError when writing certificates into new folders - Reuse scission in the local H6 generator to avoid duplicated certificate logic **Root causes:** - palier inference relied on max residue value when the class column was generic - scission assumed output directories already exist - empty CSV fields were coerced to 0 **Correctifs:** - Infer palier from explicit columns (palier/m) or filename, keep heuristic fallback - Create parent directory for output JSON - Skip empty class/sister values instead of adding residue 0 **Evolutions:** - Use collatz_scission for certificate generation in local H6 artefacts generator **Pages affectées:** - applications/collatz/collatz_k_scripts/collatz_scission.py - applications/collatz/collatz_k_scripts/collatz_generate_local_h6_artefacts.py - docs/fixKnowledge/collatz_scission_palier_inference_and_output_dirs.md
2.2 KiB
2.2 KiB
collatz_scission palier inference and output directory creation
Problem
The helper script applications/collatz/collatz_k_scripts/collatz_scission.py can produce incorrect certificates in two cases:
- Incorrect
palierinference: when the CSV class column is named generically (e.g.classe_mod_2^m) and the covered classes are sparse / small (e.g. values<2^8while the target modulus is2^{13}),palierwas inferred from the maximum class value. This yields a wrong modulus power. - Missing output directories:
run_scission()writesout_json_pathwithout creating parent directories, which can raiseFileNotFoundErrorwhen callers pass a new path under a non-existing folder.
Root cause
infer_palier()only supported:- parsing
2^mfrom the class column name, or - a fallback heuristic based on the maximum covered residue value. This heuristic is not reliable when the class column name does not encode the modulus power.
- parsing
run_scission()assumed the output directory exists.
Corrective actions
- Prefer explicit palier columns:
- If the CSV contains a numeric
paliercolumn, use it. - If the CSV contains a numeric
m/modulus_powercolumn (used as exponent in some pipelines), use it.
- If the CSV contains a numeric
- Fallback from filename: parse
palier2p<m>from the CSV path when available. - Keep legacy fallback: keep the max-value heuristic as a last resort.
- Create output directories: ensure
out_json_path.parentexists before writing. - Do not add spurious residue 0: skip empty strings instead of coercing to 0 when parsing the class / sister columns.
Impact
- Certificates generated via
collatz_scission.pynow carry apalierthat matches the CSV’s intended modulus power when the CSV provides it (or when the filename encodes it). - Callers can write certificates to new directories without pre-creating them.
Analysis modalities
- For any certificate JSON, verify:
paliermatches the intended modulus power2^m,clausesandcoveredsets do not contain a spurious0,- directory creation does not fail when writing under a fresh path.
Deployment
- No environment changes are required.
- The fix is local to:
applications/collatz/collatz_k_scripts/collatz_scission.py