smart_ide/docs/repo/script-anythingllm-pull-sync.md
Nicolas Cantu 58cc2493e5 chore: consolidate ia_dev module, sync tooling, and harden gateways (0.0.5)
Initial state:
- ia_dev was historically referenced as ./ia_dev in docs and integrations, while the vendored module lives under services/ia_dev.
- AnythingLLM sync and hook installation had error masking / weak exit signaling.
- Proxy layers did not validate proxy path segments, allowing path normalization tricks.

Motivation:
- Make the IDE-oriented workflow usable (sync -> act -> deploy/preview) with explicit errors.
- Reduce security footguns in proxying and script automation.

Resolution:
- Standardize IA_DEV_ROOT usage and documentation to services/ia_dev.
- Add SSH remote data mirroring + optional AnythingLLM ingestion.
- Extend AnythingLLM pull sync to support upload-all/prefix and fail on upload errors.
- Harden smart-ide-sso-gateway and smart-ide-global-api proxying with safe-path checks and non-leaking error responses.
- Improve ia-dev-gateway runner validation and reduce sensitive path leakage.
- Add site scaffold tool (Vite/React) with OIDC + chat via sso-gateway -> orchestrator.

Root cause:
- Historical layout changes (submodule -> vendored tree) and missing central contracts for path resolution.
- Missing validation for proxy path traversal patterns.
- Overuse of silent fallbacks (|| true, exit 0 on partial failures) in automation scripts.

Impacted features:
- Project sync: git pull + AnythingLLM sync + remote data mirror ingestion.
- Site frontends: SSO gateway proxy and orchestrator intents (rag.query, chat.local).
- Agent execution: ia-dev-gateway script runner and SSE output.

Code modified:
- scripts/remote-data-ssh-sync.sh
- scripts/anythingllm-pull-sync/sync.mjs
- scripts/install-anythingllm-post-merge-hook.sh
- cron/git-pull-project-clones.sh
- services/smart-ide-sso-gateway/src/server.ts
- services/smart-ide-global-api/src/server.ts
- services/smart-ide-orchestrator/src/server.ts
- services/ia-dev-gateway/src/server.ts
- services/ia_dev/tools/site-generate.sh

Documentation modified:
- docs/** (architecture, API docs, ia_dev module + integration, scripts)

Configurations modified:
- config/services.local.env.example
- services/*/.env.example

Files in deploy modified:
- services/ia_dev/deploy/*

Files in logs impacted:
- logs/ia_dev.log (runtime only)
- .logs/* (runtime only)

Databases and other sources modified:
- None

Off-project modifications:
- None

Files in .smartIde modified:
- .smartIde/agents/*.md
- services/ia_dev/.smartIde/**

Files in .secrets modified:
- None

New patch version in VERSION:
- 0.0.5

CHANGELOG.md updated:
- yes
2026-04-04 18:36:43 +02:00

78 lines
2.3 KiB
Markdown

# anythingllm-pull-sync (`scripts/anythingllm-pull-sync/`)
Triggered after `git pull` via the Git `post-merge` hook: uploads files changed between `ORIG_HEAD` and `HEAD` to an AnythingLLM workspace (`POST /api/v1/document/upload`).
## Requirements
- AnythingLLM document processor reachable from the host.
- Same `.4nkaiignore` rules as the legacy IDE extension (in the target repo root).
## Environment
The generated hook sources `~/.config/4nk/anythingllm-sync.env` if present.
Required variables:
- `ANYTHINGLLM_BASE_URL` (no trailing `/`)
- `ANYTHINGLLM_API_KEY`
Optional variables:
- `ANYTHINGLLM_SYNC_MAX_FILES` (default `200`)
- `ANYTHINGLLM_SYNC_MAX_FILE_BYTES` (default `5242880`)
- `SMART_IDE_ENV` (`test|pprod|prod`, default `test`) for smart_ide project config resolution (see below)
## Workspace slug resolution
Order (first match wins):
1. `ANYTHINGLLM_WORKSPACE_SLUG` (env)
2. `repo/.anythingllm.json` with `{ "workspaceSlug": "…" }`
3. smart_ide `projects/<id>/conf.json`:
- finds the project by matching `project_path` (resolved relative to the smart_ide root) to `--repo-root`
- reads `smart_ide.anythingllm_workspace_slug[SMART_IDE_ENV]` (default env `test`)
If no slug can be resolved, the script prints an explicit message and exits with code `0` (does not block the pull).
## Installing the hook
Single repo:
```bash
./scripts/install-anythingllm-post-merge-hook.sh /path/to/repo
```
All configured project clones (from `projects/*/conf.json`):
```bash
./scripts/install-anythingllm-post-merge-hook.sh --all
```
Single configured project:
```bash
./scripts/install-anythingllm-post-merge-hook.sh --project enso
```
One-time setup on the host (from this repo):
```bash
cd scripts/anythingllm-pull-sync && npm install
```
## Behavior
- Only uploads paths from `git diff --name-only --diff-filter=ACMRT ORIG_HEAD HEAD`.
- If `ORIG_HEAD` is missing, or if AnythingLLM config is missing: explicit message, exit `0`.
- Deletions/renames are not mirrored as deletions in AnythingLLM in this version (upload only).
## Uninstall
```bash
rm -f /path/to/repo/.git/hooks/post-merge
```
## Links
[features/anythingllm-pull-sync-after-pull.md](../features/anythingllm-pull-sync-after-pull.md), [anythingllm-workspaces.md](../anythingllm-workspaces.md), [service-anythingllm-devtools.md](./service-anythingllm-devtools.md).