**Motivations:**
- Add a skill for improving documents in the background
- Collatz scripts and documentation

**Changes:**
- .cursor/skills/document-improvement/ (SKILL, reference, examples)
- v0/collatz_k_scripts/ (core, fusion, pipeline, utils, reproduce)
- v0/journal.md, v0/log.md, v0/README collatz

**Affected pages:**
- .cursor/skills/document-improvement/
- v0/collatz_k_scripts/
- v0/journal.md, v0/log.md
3.1 KiB
| name | description |
|---|---|
| document-improvement | Improves and corrects long text documents in background by processing them in chunks. Applies scientific writing rules, neutral style, and formatting corrections. Use when the user wants to improve, correct, or format large markdown/text documents, or when launching a background document processing task. |
Document Improvement (Background)
Improves and corrects long documents by processing them in chunks. Designed for scientific and technical texts (e.g. mathematical proofs, research notes).
Invocation
Background execution (recommended for large files):
Use mcp_task with subagent_type="generalPurpose" and a prompt that:
1. References this skill
2. Specifies the document path
3. Optionally specifies chunk size (default: 800–1200 lines) and scope (full document or line range)
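The three prompt elements above can be assembled mechanically. The following is a minimal sketch only: `build_task_prompt`, its parameters, and the prompt wording are hypothetical and are not part of the skill or of `mcp_task`. The resulting string is what would be passed as the prompt.

```python
def build_task_prompt(document_path, chunk_size=(800, 1200), line_range=None):
    """Build a prompt that references the skill, the document, and the options.

    Hypothetical helper: the wording of the prompt is an assumption.
    """
    low, high = chunk_size
    if line_range is None:
        scope = "the full document"
    else:
        scope = f"lines {line_range[0]}-{line_range[1]}"
    return (
        "Apply the document-improvement skill.\n"
        f"Document: {document_path}\n"
        f"Chunk size: {low}-{high} lines\n"
        f"Scope: {scope}\n"
    )


if __name__ == "__main__":
    # "notes/proof.md" is a placeholder path used for illustration.
    print(build_task_prompt("notes/proof.md", line_range=(1, 2400)))
```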
Direct invocation: When the user asks to improve or correct a document, apply this workflow.
Workflow
1. Analyze document structure
- Read the document to identify sections (headers ##, ###)
- Note total line count
- Identify natural break points (section boundaries)
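The structure analysis can be sketched as a single scan over the file. This is an illustrative sketch, not the skill's own implementation: the function name `analyze_structure` is hypothetical, and skipping fenced code blocks is an added assumption so that a `##` inside code is not mistaken for a header.

```python
import re

# Matches level-2 and level-3 headers only ("## Title", "### Title").
HEADER_RE = re.compile(r"^(#{2,3})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def analyze_structure(text):
    """Return (total_line_count, sections).

    sections is a list of (line_number, level, title), with 1-indexed lines.
    Lines inside fenced code blocks are ignored (assumption).
    """
    lines = text.splitlines()
    sections = []
    in_fence = False
    for number, line in enumerate(lines, start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADER_RE.match(line)
        if match:
            sections.append((number, len(match.group(1)), match.group(2)))
    return len(lines), sections
```

The section line numbers returned here are the natural break points used when planning chunks.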
2. Chunk strategy
For documents > 1500 lines:
- Chunk size: 800–1200 lines per pass (adjust to fit section boundaries)
- Overlap: Include 2–3 lines of context at chunk boundaries
- Order: Process from start to end; preserve section continuity
For documents ≤ 1500 lines: process in one pass.
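One possible way to turn these rules into line ranges is sketched below. The function `plan_chunks`, its parameter names, the 1000-line target, and the fallback to a fixed-size cut when no section boundary is in range are all assumptions made for illustration.

```python
def plan_chunks(total_lines, boundaries, min_size=800, max_size=1200,
                target=1000, single_pass_limit=1500, overlap=3):
    """Plan chunks as (start, end, context_start, context_end), 1-indexed inclusive.

    boundaries: line numbers where a section starts.
    The context range extends each chunk by `overlap` lines on both sides.
    """
    if total_lines <= 0:
        return []
    if total_lines <= single_pass_limit:
        return [(1, total_lines, 1, total_lines)]
    chunks = []
    start = 1
    while start <= total_lines:
        if total_lines - start + 1 <= max_size:
            # The remainder fits in one chunk.
            end = total_lines
        else:
            # Section starts that would give a chunk of min_size..max_size lines.
            candidates = [b for b in boundaries
                          if start + min_size <= b <= start + max_size]
            if candidates:
                cut = min(candidates, key=lambda b: abs((b - start) - target))
                end = cut - 1
            else:
                # No boundary in range: fall back to a fixed-size cut (assumption).
                end = start + target - 1
        chunks.append((start, end,
                       max(1, start - overlap),
                       min(total_lines, end + overlap)))
        start = end + 1
    return chunks
```

The chunks themselves never overlap, so each line is edited exactly once; only the read-only context window extends past the chunk boundaries.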
3. Per-chunk processing
For each chunk:
- Read the chunk with surrounding context
- Apply corrections from reference.md
- Write corrections using search_replace (exact match, minimal edits)
- Preserve LaTeX, code blocks, and structural markup
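The "exact match, minimal edits" requirement can be illustrated with a small stand-in. This is not the editor's search_replace tool: the Python function below is a hypothetical sketch, and the rule that the target must occur exactly once is an assumption chosen to make ambiguous edits fail loudly.

```python
def search_replace(text, old, new):
    """Replace the single exact occurrence of `old` with `new`.

    Raises ValueError when `old` is empty, missing, or ambiguous, so that
    no edit is applied unless the match is unique.
    """
    if not old:
        raise ValueError("search string must not be empty")
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return text.replace(old, new, 1)


if __name__ == "__main__":
    chunk = "In this section we will see that $f(x) > 0$ for all $x$.\n"
    # Only the targeted phrase changes; the LaTeX is left untouched.
    print(search_replace(chunk, "In this section we will see that", "Here,"))
```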
4. Corrections to apply
See reference.md for the full checklist. Main categories:
- Titles: "Introduction" → "Introduction de …", "Conclusion" → "Conclusion de …"
- Neutrality: Remove self-praise, direct address to the reader, and introspection
- Transitions (enchaînements): Replace "La continuation ainsi…" with content-driven transitions
- Hypotheses: State hypotheses explicitly before each result
- References: Use exact citations; avoid vague appeals such as "il est bien connu que" ("it is well known that")
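Some of these categories can be pre-screened automatically before any rewriting. The sketch below only flags candidate lines and leaves the rewording to the reviewer. The `CHECKS` table and the `flag_lines` function are hypothetical, and the two patterns cover only the examples quoted in the list above, not the full checklist in reference.md.

```python
import re

# Illustrative patterns only: a bare "Introduction"/"Conclusion" header,
# and the vague phrase quoted in the References item.
CHECKS = [
    ("title", re.compile(r"^#{1,6}\s+(Introduction|Conclusion)\s*$")),
    ("reference", re.compile(r"il est bien connu que", re.IGNORECASE)),
]


def flag_lines(text):
    """Return a list of (line_number, category) for lines that need review."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for category, pattern in CHECKS:
            if pattern.search(line):
                findings.append((number, category))
    return findings
```

Flagging rather than rewriting keeps the check compatible with the rule against inventing content: the tool points at the line, and the correction is still written by hand.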
5. Output
- Apply edits directly to the file
- Do not produce a separate report unless requested
- Preserve git history (one logical change per chunk if possible)
Constraints
- No content invention: Only correct and reformulate; do not add new mathematical claims
- Preserve structure: Keep section hierarchy and numbering
- Minimal edits: Prefer targeted search_replace over full rewrites
- Consistency: Use the same terminology and conventions across chunks
Chunk processing template
When processing chunk N (lines X–Y):
Chunk N/M: lines X–Y
- Sections in scope: [list]
- Corrections applied: [brief list]
- Next chunk: lines Y+1–Z
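The template can be filled in programmatically, for example when logging progress between chunks. A minimal sketch, assuming a hypothetical `format_chunk_report` helper whose parameter names are not part of the skill:

```python
def format_chunk_report(index, total, start, end, sections, corrections,
                        next_end=None):
    """Render the per-chunk summary in the layout of the template above.

    next_end is the last line of the following chunk, or None for the
    final chunk, in which case the "Next chunk" line is omitted.
    """
    lines = [
        f"Chunk {index}/{total}: lines {start}–{end}",
        "- Sections in scope: " + ", ".join(sections),
        "- Corrections applied: " + ", ".join(corrections),
    ]
    if next_end is not None:
        lines.append(f"- Next chunk: lines {end + 1}–{next_end}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Section names and corrections here are placeholder values.
    print(format_chunk_report(1, 3, 1, 949, ["1 Setup", "2 Lemmas"],
                              ["titles", "neutrality"], next_end=1899))
```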
Error handling
- If a chunk fails: log the line range and error, continue with the next chunk
- If LaTeX or structure is ambiguous: skip and leave a comment for manual review
- Do not guess mathematical notation; preserve it exactly
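The first rule (log the failure, then continue) corresponds to a loop of the following shape. This is a sketch under assumptions: `process_all`, the `process_chunk` callback, and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("document_improvement")


def process_all(chunks, process_chunk):
    """Call process_chunk(start, end) for every chunk, in order.

    A failing chunk is logged with its line range and error, then skipped.
    Returns the list of failures as (start, end, message) for manual review.
    """
    failures = []
    for start, end in chunks:
        try:
            process_chunk(start, end)
        except Exception as exc:  # keep going whatever the chunk raised
            logger.error("chunk lines %d-%d failed: %s", start, end, exc)
            failures.append((start, end, str(exc)))
    return failures
```

Returning the failures instead of raising lets the caller report the affected line ranges at the end of the run, once all other chunks have been processed.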