Nicolas Cantu 2b99e8ff02 document-improvement skill and Collatz scripts

**Motivations:**
- Add a skill for improving documents in the background
- Collatz scripts and documentation

**Changes:**
- .cursor/skills/document-improvement/ (SKILL, reference, examples)
- v0/collatz_k_scripts/ (core, fusion, pipeline, utils, reproduce)
- v0/journal.md, v0/log.md, v0/README collatz

**Affected pages:**
- .cursor/skills/document-improvement/
- v0/collatz_k_scripts/
- v0/journal.md, v0/log.md

2026-02-27 16:23:25 +01:00

3.1 KiB


name	description
document-improvement	Improves and corrects long text documents in background by processing them in chunks. Applies scientific writing rules, neutral style, and formatting corrections. Use when the user wants to improve, correct, or format large markdown/text documents, or when launching a background document processing task.

Document Improvement (Background)

Improves and corrects long documents by processing them in chunks. Designed for scientific and technical texts (e.g. mathematical proofs, research notes).

Invocation

Background execution (recommended for large files):

Use mcp_task with subagent_type="generalPurpose" and a prompt that:
1. References this skill
2. Specifies the document path
3. Optionally specifies chunk size (default: 800–1200 lines) and scope (full document or line range)

Direct invocation: When the user asks to improve or correct a document, apply this workflow.

Workflow

1. Analyze document structure

Read the document to identify sections (headers ##, ###)
Note total line count
Identify natural break points (section boundaries)
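
The structure analysis above can be sketched in Python. This is a minimal illustration, assuming markdown `##` and `###` headers; the function name `find_sections` is hypothetical and not part of the skill.

```python
import re

# Matches markdown headers of level 2 or 3 ("## Title" or "### Title").
HEADER_RE = re.compile(r"^(#{2,3})\s+(.*\S)\s*$")


def find_sections(text):
    """Return (total_line_count, [(line_number, level, title), ...]).

    Line numbers are 1-indexed. Lines inside fenced code blocks are skipped
    so that a "## ..." line inside a code sample is not taken for a header.
    """
    lines = text.splitlines()
    sections = []
    in_fence = False
    for number, line in enumerate(lines, start=1):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADER_RE.match(line)
        if match:
            sections.append((number, len(match.group(1)), match.group(2)))
    return len(lines), sections
```

The header line numbers returned here are the natural break points used when planning chunks.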

2. Chunk strategy

For documents > 1500 lines:

Chunk size: 800–1200 lines per pass (adjust to fit section boundaries)
Overlap: Include 2–3 lines of context at chunk boundaries
Order: Process from start to end; preserve section continuity

For documents ≤ 1500 lines: process in one pass.
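
One possible way to turn these rules into line ranges is sketched below. The function `plan_chunks` and its parameter names are hypothetical; the defaults mirror the numbers stated above (800–1200 lines, 1500-line threshold, 3 lines of overlap). The choice to cut just before the last section header that fits in the window is an assumption, not something the skill prescribes.

```python
def plan_chunks(total_lines, boundaries, min_size=800, max_size=1200,
                single_pass_limit=1500, overlap=3):
    """Return [(start, end, context_start, context_end), ...], 1-indexed inclusive.

    boundaries: line numbers at which a section header starts.
    (start, end) ranges are contiguous and non-overlapping; the context range
    widens each chunk by `overlap` lines on both sides for reading only.
    """
    if total_lines <= 0:
        return []
    if total_lines <= single_pass_limit:
        return [(1, total_lines, 1, total_lines)]
    chunks = []
    start = 1
    while start <= total_lines:
        hard_end = min(start + max_size - 1, total_lines)
        if hard_end == total_lines:
            end = total_lines
        else:
            # Candidate cut points: the line just before a section header,
            # provided the chunk stays between min_size and max_size lines.
            candidates = [b - 1 for b in boundaries
                          if start + min_size - 1 <= b - 1 <= hard_end]
            end = max(candidates) if candidates else hard_end
        chunks.append((start, end,
                       max(1, start - overlap),
                       min(total_lines, end + overlap)))
        start = end + 1
    return chunks
```

For example, `plan_chunks(3000, [1, 1001, 2101])` yields three chunks that end at lines 1000, 2100, and 3000, each cut on a section boundary. The final chunk may be shorter than `min_size`; merging it into the previous chunk is left out of this sketch.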

3. Per-chunk processing

For each chunk:

Read the chunk with surrounding context
Apply corrections from reference.md
Write corrections using search_replace (exact match, minimal edits)
Preserve LaTeX, code blocks, and structural markup
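
The "exact match, minimal edits" requirement can be illustrated with a small helper. This is a sketch, not the behavior of the `search_replace` tool itself: the function `apply_exact_edit` is hypothetical, and the rule that the searched text must occur exactly once is an assumption about how to keep an edit unambiguous.

```python
def apply_exact_edit(text, old, new):
    """Replace `old` with `new` only if `old` occurs exactly once in `text`.

    Raises ValueError when the match is missing or ambiguous, so that a
    correction never lands in the wrong place.
    """
    if not old:
        raise ValueError("search text must not be empty")
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return text.replace(old, new, 1)
```

Keeping `old` short but unique leaves LaTeX, code blocks, and markup outside the matched span untouched.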

4. Corrections to apply

See reference.md for the full checklist. Main categories:

Titles: "Introduction" → "Introduction de …", "Conclusion" → "Conclusion de …"
Neutrality: Remove self-praise, direct address to the reader, and introspection
Transitions: Replace "La continuation ainsi…" with content-driven transitions
Hypotheses: State hypotheses explicitly before each result
References: Exact citations, no vague "il est bien connu que"
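
Some of these categories can be pre-screened mechanically before editing. The sketch below only flags candidate lines; it covers just the three literal examples quoted in the list above, and the names `CHECKS` and `flag_lines` are hypothetical. The full checklist remains in reference.md.

```python
import re

# Patterns derived only from the examples quoted in the category list.
CHECKS = [
    # A header whose whole title is "Introduction" or "Conclusion".
    ("title", re.compile(r"^#+\s*(Introduction|Conclusion)\s*$")),
    # The stock transition phrase cited as an example.
    ("transition", re.compile(r"La continuation ainsi")),
    # The vague appeal to common knowledge cited as an example.
    ("reference", re.compile(r"il est bien connu que", re.IGNORECASE)),
]


def flag_lines(text):
    """Return [(line_number, category), ...] for lines that match a check."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for category, pattern in CHECKS:
            if pattern.search(line):
                findings.append((number, category))
    return findings
```

Flagged lines still need a human or agent decision; the scan does not rewrite anything.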

5. Output

Apply edits directly to the file
Do not produce a separate report unless requested
Preserve git history (one logical change per chunk if possible)

Constraints

No content invention: Only correct and reformulate; do not add new mathematical claims
Preserve structure: Keep section hierarchy and numbering
Minimal edits: Prefer targeted search_replace over full rewrites
Consistency: Use the same terminology and conventions across chunks

Chunk processing template

When processing chunk N (lines X–Y):

Chunk N/M: lines X–Y
- Sections in scope: [list]
- Corrections applied: [brief list]
- Next chunk: lines Y+1–Z
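
A small formatter can produce this entry consistently across chunks. The function `format_chunk_log` and its parameters are hypothetical names used for illustration only.

```python
def format_chunk_log(index, total, start, end, sections, corrections,
                     next_range=None):
    """Build the per-chunk log entry as a multi-line string.

    next_range is a (start, end) pair, or None for the last chunk, in which
    case the "Next chunk" line is omitted.
    """
    lines = [
        f"Chunk {index}/{total}: lines {start}–{end}",
        f"- Sections in scope: {', '.join(sections)}",
        f"- Corrections applied: {', '.join(corrections)}",
    ]
    if next_range is not None:
        lines.append(f"- Next chunk: lines {next_range[0]}–{next_range[1]}")
    return "\n".join(lines)
```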

Error handling

If a chunk fails: log the line range and error, continue with the next chunk
If LaTeX or structure is ambiguous: skip and leave a comment for manual review
Do not guess mathematical notation; preserve it exactly
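
The "log and continue" rule for failed chunks can be sketched as a simple loop. The names `process_all` and `process_chunk` are hypothetical; `process_chunk` stands for whatever routine applies the corrections to one line range.

```python
import logging

logger = logging.getLogger("document_improvement")


def process_all(chunks, process_chunk):
    """Run process_chunk(start, end) on every chunk.

    A failure in one chunk is logged with its line range and does not stop
    the remaining chunks. Returns [(start, end, error_message), ...] for the
    chunks that failed, so they can be reviewed manually afterwards.
    """
    failed = []
    for start, end in chunks:
        try:
            process_chunk(start, end)
        except Exception as exc:
            logger.error("chunk lines %d-%d failed: %s", start, end, exc)
            failed.append((start, end, str(exc)))
    return failed
```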
