Platform docs, services, ia_dev submodule, smart_ide project config

- Add ia_dev submodule (projects/smart_ide on forge 4nk)
- Document APIs, orchestrator, gateway, local-office, rollout
- Add systemd/scripts layout; relocate setup scripts
- Remove obsolete nginx/enso-only docs from this repo scope
This commit is contained in: parent 69ab265560, commit 088eab84b7
.gitignore (vendored, new file, +4)

@@ -0,0 +1,4 @@
```
# Vendored / cloned upstream trees (large; not part of smart_ide source history)
core_ide/
projects/
node_modules/
```
.gitmodules (vendored, new file, +3)

@@ -0,0 +1,3 @@
```ini
[submodule "ia_dev"]
    path = ia_dev
    url = https://git.4nkweb.com/4nk/ia_dev.git
```
README.md (21 changed lines)

@@ -1,6 +1,8 @@
# smart_ide — intent-oriented IDE with local AI

Development-environment project where **inference** runs on **Ollama**, **document memory and RAG** on **AnythingLLM**, and business office tooling on **ONLYOFFICE**. The existing **business agents** (`ia_dev` and sub-agents) remain the operating core; the editor and orchestrator expose them through a **command grammar** rather than classic file navigation.

Development-environment project where **inference** runs on **Ollama**, **document memory and RAG** on **AnythingLLM**, **rich business office tooling** on **ONLYOFFICE**, and **API-driven editing / storage of Office files** (programmatic docx) via **Local Office** (`services/local-office/`). The existing **business agents** (`ia_dev` and sub-agents) remain the operating core; the editor and orchestrator expose them through a **command grammar** rather than classic file navigation.

**Single monorepo**: this repository is the **main reference** for the docs, the **local services** (`services/`, including Local Office), the scripts, the extensions and the **editor application base** (**Lapce** under `core_ide/`, a local clone kept out of the Git index — see [docs/core-ide.md](./docs/core-ide.md)). **Canonical hosting** is the **internal forge**; the public repositories cited in the documentation are **upstreams** or references, not mandatory publication targets for 4NK deliverables. Architectural detail: [docs/system-architecture.md](./docs/system-architecture.md).

## First deployment target
@@ -15,7 +17,7 @@ The UX (e.g. Lapce) and the user flows can run on the client; the e…

- **No explorer as the primary surface**: primary navigation goes through intents, search, context, timeline, logical objects and artifacts; raw access (files / tree) remains available as an **expert / fallback mode**, not as the nominal flow.
- **Operations-oriented work machine** rather than a file editor: the user states *what they want to do*, *on which logical object*, *with which rights*, *in which project context*, *with which procedure*, *with which agent*, *with which expected result*.
- **Editor base under consideration: [Lapce](https://lapce.dev/)** — open source, Rust, native / GPU rendering, positioned as a fast, lightweight editor: a coherent base for an editing core + agents, without stacking up the full history of a classic IDE. An architectural choice, not a fixed requirement.
- **Editor application base: [Lapce](https://lapce.dev/)** under **`core_ide/`** — open source, Rust, native / GPU rendering; base for the editing core + agents. Update and build: [docs/core-ide.md](./docs/core-ide.md). An architectural choice, not a fixed requirement.
## AnythingLLM and projects

@@ -23,17 +25,30 @@ For each **project**, a dedicated **AnythingLLM workspace** is created (or att…

See [docs/anythingllm-workspaces.md](./docs/anythingllm-workspaces.md).
## `ia_dev` repository (Git submodule)

The [**ia_dev**](https://git.4nkweb.com/4nk/ia_dev.git) repository is integrated as a **submodule** in the [`./ia_dev`](./ia_dev) directory: the agent team, `projects/<id>/` configs, `deploy/` scripts, Gitea ticketing, etc. Clone with `git clone --recurse-submodules`, or initialize with `git submodule update --init --recursive`. Details: [docs/ia_dev-submodule.md](./docs/ia_dev-submodule.md).
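The clone and init commands above can be exercised end to end against a throwaway local repository; the sketch below stands in for the real forge URL, and every path in it is illustrative, not part of the monorepo:

```shell
# Demonstrate the submodule workflow locally with a throwaway
# "upstream" repository standing in for ia_dev.
rm -rf /tmp/submodule_demo && mkdir -p /tmp/submodule_demo
cd /tmp/submodule_demo
git init -q upstream
git -C upstream -c user.email=demo@example.com -c user.name=demo \
  commit -q --allow-empty -m "init"
git init -q superproject
cd superproject
git -c user.email=demo@example.com -c user.name=demo \
  commit -q --allow-empty -m "init"
# newer Git disables file:// submodule URLs by default; allow it for the demo
git -c protocol.file.allow=always submodule add /tmp/submodule_demo/upstream ia_dev
git submodule update --init --recursive
git submodule status ia_dev
```

Against the real monorepo, the same two commands from the paragraph above apply unchanged; only the URL differs.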
## Documentation

| Document | Content |
|----------|---------|
| [docs/README.md](./docs/README.md) | Technical documentation index |
| [docs/README.md](./docs/README.md) | Technical documentation index (`docs/`, `docs/features/`, `docs/API/`) |
| [docs/platform-target.md](./docs/platform-target.md) | Online platform: test/pprod/prod envs, same-host AI, docv SSO |
| [docs/API/README.md](./docs/API/README.md) | HTTP reference for the services under `services/` (endpoints, auth, ports) |
| [docs/infrastructure.md](./docs/infrastructure.md) | LAN, SSH, host-access scripts |
| [docs/services.md](./docs/services.md) | Ollama, AnythingLLM Docker, integration |
| [docs/anythingllm-workspaces.md](./docs/anythingllm-workspaces.md) | Per-project workspaces, synchronization |
| [docs/ux-navigation-model.md](./docs/ux-navigation-model.md) | Replacing the explorer: intents, risks, views, graph, expert mode |
| [docs/system-architecture.md](./docs/system-architecture.md) | Layers, modules, agents, gateway, OpenShell, events |
| [docs/deployment-target.md](./docs/deployment-target.md) | Linux client + SSH: server = AI base + repos |
| [docs/ia_dev-submodule.md](./docs/ia_dev-submodule.md) | `ia_dev` Git submodule, clone / update |
| [docs/ia_dev-project-smart_ide.md](./docs/ia_dev-project-smart_ide.md) | `ia_dev` project `smart_ide`: `conf.json`, forge 4nk wiki/issues |
| [docs/features/langextract-api.md](./docs/features/langextract-api.md) | Local LangExtract API (structured extraction) |
| [docs/features/claw-harness-api.md](./docs/features/claw-harness-api.md) | claw-code integration (multi-model, no Anthropic in the templates) |
| [docs/features/agent-regex-search-api.md](./docs/features/agent-regex-search-api.md) | Code regex search API (ripgrep), Cursor article context |
| [docs/features/local-office.md](./docs/features/local-office.md) | Local Office: docx REST API (upload, commands), `services/local-office/` folder |
| [docs/core-ide.md](./docs/core-ide.md) | Lapce application base: `core_ide/` directory, clone, build |
## Current repository (tooling)

Deleted file (−168 lines):

@@ -1,168 +0,0 @@
# ia.enso.4nkweb.com — Nginx on the proxy (192.168.1.100)

TLS reverse proxy to the LAN host **`192.168.1.164`** (Ollama + AnythingLLM; IP substituted at deploy time via `__IA_ENSO_BACKEND_IP__` / `IA_ENSO_BACKEND_IP`).
## Full public URLs (HTTPS)

| Service | URL |
|---------|-----|
| **AnythingLLM** (UI) | `https://ia.enso.4nkweb.com/anythingllm/` |
| **Ollama** native API (e.g. model list) | `https://ia.enso.4nkweb.com/ollama/api/tags` |
| **Ollama** OpenAI-compatible API (Cursor, etc.) | base URL `https://ia.enso.4nkweb.com/ollama/v1` — e.g. `https://ia.enso.4nkweb.com/ollama/v1/models` |

**nginx Bearer:** everything under `/ollama/` requires `Authorization: Bearer <secret>` (`map` file on the proxy). The value is **not** forwarded to Ollama (`Authorization` cleared upstream). AnythingLLM under `/anythingllm/`: **application-level** auth only.
| Path (relative) | Backend | LAN port | Protection |
|-----------------|---------|----------|------------|
| `/ollama/` | Ollama | `11434` | nginx **Bearer** |
| `/anythingllm/` | AnythingLLM | `3001` | AnythingLLM login |

**Cursor context:** a private-IP URL (e.g. `http://192.168.1.164:11434`) may be rejected by Cursor (`ssrf_blocked`). A public HTTPS **hostname** pointing at the proxy avoids this block, as long as the DNS resolved from the Internet is not an RFC1918 IP.

**Files in the repository:** `sites/ia.enso.4nkweb.com.conf`, `http-maps/*.example`, `deploy-ia-enso-to-proxy.sh`. Architecture details: [docs/features/ia-enso-nginx-proxy-ollama-anythingllm.md](../../docs/features/ia-enso-nginx-proxy-ollama-anythingllm.md).
---

## Recommended deployment: SSH script

From the root of the **`smart_ide`** repository, on a machine with SSH access to the bastion and then to the proxy:

```bash
export IA_ENSO_OLLAMA_BEARER_TOKEN='long-ascii-secret-no-quotes-no-backslashes'
# direct LAN access to the proxy (.100), without the bastion:
# export DEPLOY_SSH_PROXY_HOST=
./deploy/nginx/deploy-ia-enso-to-proxy.sh
```

If `IA_ENSO_OLLAMA_BEARER_TOKEN` is unset, the script generates a hex token (printed once) to keep for Cursor.
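The generated token has the same shape as a manual `openssl rand -hex 32` call, which you can reproduce to pre-set the variable yourself:

```shell
# Generate a 32-byte secret rendered as 64 hex characters, the same
# format the deploy script produces when the variable is unset.
IA_ENSO_OLLAMA_BEARER_TOKEN="$(openssl rand -hex 32)"
echo "${#IA_ENSO_OLLAMA_BEARER_TOKEN}"   # 64
export IA_ENSO_OLLAMA_BEARER_TOKEN
```

A hex token also trivially satisfies the script's "no double quotes or backslashes" check.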
### Prerequisites on the proxy

- `http { include /etc/nginx/conf.d/*.conf; ... }` in `/etc/nginx/nginx.conf` (otherwise the script fails with an explicit message).
- Let's Encrypt **certificates** for `ia.enso.4nkweb.com` at the paths used in the site file (`/etc/letsencrypt/live/ia.enso.4nkweb.com/fullchain.pem` and `privkey.pem`). Without them, the `listen 443` block makes `nginx -t` fail: see **TLS bootstrap** below.
- Non-interactive **`sudo`** for `nginx` and `systemctl reload nginx`.
### TLS bootstrap (first run, while `nginx -t` cannot pass)

1. DNS: `ia.enso.4nkweb.com` must resolve to the public entry point that reaches this proxy (HTTP port 80).
2. On the proxy:

```bash
sudo install -d -m 0755 /var/www/certbot
# Temporarily replace the vhost with an HTTP-only one (file in the repo: sites/ia.enso.4nkweb.com.http-only.conf)
sudo cp /path/to/smart_ide/deploy/nginx/sites/ia.enso.4nkweb.com.http-only.conf /etc/nginx/sites-available/ia.enso.4nkweb.com.conf
sudo nginx -t && sudo systemctl reload nginx
sudo certbot certonly --webroot -w /var/www/certbot -d ia.enso.4nkweb.com --non-interactive --agree-tos --register-unsafely-without-email
```

3. Deploy the full config: `./deploy/nginx/deploy-ia-enso-to-proxy.sh` (restores HTTPS + upstreams).
### Files installed by the script

| Path on the proxy | Role |
|-------------------|------|
| `/etc/nginx/conf.d/ia-enso-http-maps.conf` | `map_hash_bucket_size`, Bearer `map` `$ia_enso_ollama_authorized`, and usually the WebSocket `map` |
| `/etc/nginx/sites-available/ia.enso.4nkweb.com.conf` | HTTP→HTTPS + HTTPS `server` blocks |
| Symlink `sites-enabled/ia.enso.4nkweb.com.conf` | Enables the vhost |

If `nginx -t` fails because of a **duplicate** `map $http_upgrade $connection_upgrade`, the script retries with **Bearer only** (without duplicating the WebSocket `map`).
### Script environment variables

| Variable | Default | Role |
|----------|---------|------|
| `IA_ENSO_OLLAMA_BEARER_TOKEN` | generated (`openssl rand -hex 32`) | Secret for `Authorization: Bearer …` |
| `IA_ENSO_SSH_KEY` | `~/.ssh/id_ed25519` | SSH private key |
| `IA_ENSO_PROXY_USER` | `ncantu` | SSH user on the proxy |
| `IA_ENSO_PROXY_HOST` | `192.168.1.100` | SSH target (LAN IP or hostname) |
| `DEPLOY_SSH_PROXY_HOST` | `4nk.myftp.biz` | ProxyJump bastion; empty = direct SSH |
| `DEPLOY_SSH_PROXY_USER` | same as the proxy user | User on the bastion |
| `IA_ENSO_BACKEND_IP` | `192.168.1.164` | Ollama + AnythingLLM host (IPv4) |

Library used: `ia_dev/deploy/_lib/ssh.sh` (`BatchMode=yes`).
---

## Manual deployment (without the script)

### 1. DNS and TLS

DNS must resolve `ia.enso.4nkweb.com` to the public entry point that reaches this proxy.

```bash
sudo certbot certonly --webroot -w /var/www/certbot -d ia.enso.4nkweb.com
```

Adjust the `ssl_certificate` / `ssl_certificate_key` directives in `sites/ia.enso.4nkweb.com.conf` if the `live/` directory differs.
### 2. HTTP maps (Bearer + WebSocket)

The script deploys `ia-enso-http-maps.conf` with `map_hash_bucket_size 256`, the Bearer `map` and the WebSocket `map` (or Bearer only if the WebSocket `map` is already defined elsewhere). Manual install: combine `http-maps/ia-enso-ollama-bearer.map.conf.example` and `websocket-connection.map.conf.example` inside `http { }` as needed.
### 3. `server` file

The file in the repository contains the `__IA_ENSO_BACKEND_IP__` marker. Replace it with the backend IPv4 (e.g. `192.168.1.164`) before copying, or use:

```bash
sed "s/__IA_ENSO_BACKEND_IP__/192.168.1.164/g" deploy/nginx/sites/ia.enso.4nkweb.com.conf | sudo tee /etc/nginx/sites-available/ia.enso.4nkweb.com.conf >/dev/null
```

Without `sed`: edit the file on the proxy to replace `__IA_ENSO_BACKEND_IP__` with the real IPv4, then:

```bash
sudo ln -sf /etc/nginx/sites-available/ia.enso.4nkweb.com.conf /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```
---

## Checks

### Ollama API through the proxy

```bash
curl -sS -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer <secret>" \
  https://ia.enso.4nkweb.com/ollama/v1/models
```

Expected: **200** with the right secret; **401** without `Authorization` or with a wrong secret.
### AnythingLLM

Browser: `https://ia.enso.4nkweb.com/anythingllm/` (redirects to `/anythingllm/`). Log in with the **AnythingLLM** credentials.
If static assets fail to load, check the upstream docs (sub-path serving, `X-Forwarded-*` headers).

### Cursor

- OpenAI base URL: `https://ia.enso.4nkweb.com/ollama/v1`
- API key: **identical** to the secret in the nginx `map` (without the `Bearer ` prefix in the field).

**`streamFromAgentBackend` (observed behavior)**: in the Cursor application (Electron bundle), the **Agent / chat** flow calls an internal layer that opens a stream to **Cursor's servers** (`getAgentStreamResponse`, etc.), not a direct `fetch` from your machine to your OpenAI override URL. Cursor may therefore validate a **"User API key"** or permissions **before** or **in parallel with** the use of the override. If **`curl`** with the Bearer against `/ollama/v1/models` returns **200** but Cursor shows **`ERROR_BAD_USER_API_KEY`**, the failure is on the **Cursor client / infrastructure side**: [forum](https://forum.cursor.com/t/unauthorized-user-api-key-with-custom-openai-api-key-url/132572). The product's minified code is not in this repository; only the function names in the stack trace describe this execution path.
---

## Backend firewall

On **`192.168.1.164`**, if a host firewall is active, allow TCP **11434** and **3001** only from **192.168.1.100** (the proxy).
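One way to express that rule is an nftables fragment like the sketch below; this file is not part of the repository, the path is illustrative, and ufw or iptables equivalents work just as well:

```
# /etc/nftables.d/ia-backend.nft — sketch, assuming nftables on the backend
table inet ia_backend {
    chain input {
        type filter hook input priority 0; policy accept;
        # the proxy may reach Ollama and AnythingLLM
        ip saddr 192.168.1.100 tcp dport { 11434, 3001 } accept
        # everyone else is refused on those ports
        tcp dport { 11434, 3001 } drop
    }
}
```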
---

## Rotating the Bearer secret

1. Update `"Bearer …"` in `/etc/nginx/conf.d/ia-enso-http-maps.conf` (or redeploy with `IA_ENSO_OLLAMA_BEARER_TOKEN`).
2. `sudo nginx -t && sudo systemctl reload nginx`.
3. Update the API key in Cursor (and any other client).
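Step 1 can be scripted; the sketch below performs the substitution on a local copy of the maps file (the file content here is a stand-in, not the deployed one; adapt the path):

```shell
# Rotate the Bearer secret in a local copy of the maps file before
# pushing it back to the proxy.
NEW_TOKEN="$(openssl rand -hex 32)"
MAPS=/tmp/ia-enso-http-maps.conf
cat > "$MAPS" <<'EOF'
map $http_authorization $ia_enso_ollama_authorized {
    default 0;
    "Bearer OLD_SECRET" 1;
}
EOF
sed -i "s/Bearer [^\"]*/Bearer ${NEW_TOKEN}/" "$MAPS"
grep -q "Bearer ${NEW_TOKEN}" "$MAPS" && echo "rotated"
```

After copying the updated file into place, steps 2 and 3 apply unchanged.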
---

## Troubleshooting

| Symptom | Lead |
|---------|------|
| `nginx -t` error on `connection_upgrade` | Duplicate `map $http_upgrade $connection_upgrade`: remove one of the blocks, or keep the script's "Bearer only" deployment. |
| `could not build map_hash` / `map_hash_bucket_size` | Long Bearer secret: the file generated by the script includes `map_hash_bucket_size 256;`. |
| `401` on `/ollama/` | Secret differs between the client and the `map`; `Authorization` header missing or malformed. |
| `502` / timeout | Ollama or AnythingLLM stopped on the backend; firewall; wrong IP in the `upstream` (check with `grep server /etc/nginx/sites-available/ia.enso.4nkweb.com.conf` on the proxy; redeploy with `IA_ENSO_BACKEND_IP=192.168.1.164`). |
| SSL error / `cannot load certificate` | Missing certificate: run certbot on the proxy for `ia.enso.4nkweb.com`, or adjust the `ssl_certificate` paths in the site file. |
| Cursor `ssrf_blocked` | The host still resolves to a private IP on Cursor's infrastructure side; check public DNS / NAT. |
Deleted file (−125 lines):

@@ -1,125 +0,0 @@

```bash
#!/usr/bin/env bash
#
# Push ia.enso.4nkweb.com nginx config to the LAN proxy (192.168.1.100) over SSH.
# Requires passwordless sudo for nginx on the proxy host.
#
# Environment:
#   IA_ENSO_OLLAMA_BEARER_TOKEN  Bearer secret for /ollama (if unset, openssl rand -hex 32).
#   IA_ENSO_SSH_KEY              SSH private key (default: ~/.ssh/id_ed25519).
#   IA_ENSO_PROXY_USER           SSH user on proxy (default: ncantu).
#   IA_ENSO_PROXY_HOST           Proxy IP or hostname (default: 192.168.1.100).
#   IA_ENSO_BACKEND_IP           Ollama + AnythingLLM host IPv4 (default: 192.168.1.164).
#   DEPLOY_SSH_PROXY_HOST        Jump host (default: 4nk.myftp.biz); empty = direct SSH to proxy.
#   DEPLOY_SSH_PROXY_USER        Jump user (default: same as IA_ENSO_PROXY_USER).
#
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SMART_IDE_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"
SSH_LIB="${SMART_IDE_ROOT}/ia_dev/deploy/_lib/ssh.sh"

if [[ ! -f "$SSH_LIB" ]]; then
  echo "Missing ${SSH_LIB} (ia_dev submodule checkout?)" >&2
  exit 1
fi

# shellcheck source=/dev/null
source "$SSH_LIB"

IA_ENSO_SSH_KEY="${IA_ENSO_SSH_KEY:-${HOME}/.ssh/id_ed25519}"
IA_ENSO_PROXY_USER="${IA_ENSO_PROXY_USER:-ncantu}"
IA_ENSO_PROXY_HOST="${IA_ENSO_PROXY_HOST:-192.168.1.100}"
IA_ENSO_BACKEND_IP="${IA_ENSO_BACKEND_IP:-192.168.1.164}"
DEPLOY_SSH_PROXY_USER="${DEPLOY_SSH_PROXY_USER:-$IA_ENSO_PROXY_USER}"
if [[ ! -v DEPLOY_SSH_PROXY_HOST ]]; then
  export DEPLOY_SSH_PROXY_HOST='4nk.myftp.biz'
elif [[ -z "$DEPLOY_SSH_PROXY_HOST" ]]; then
  unset DEPLOY_SSH_PROXY_HOST
fi
export DEPLOY_SSH_PROXY_USER

TOKEN="${IA_ENSO_OLLAMA_BEARER_TOKEN:-}"
if [[ -z "$TOKEN" ]]; then
  TOKEN="$(openssl rand -hex 32)"
  echo "IA_ENSO_OLLAMA_BEARER_TOKEN was unset; generated token (store for Cursor API key):"
  echo "$TOKEN"
  echo "---"
fi

if [[ "$TOKEN" == *'"'* ]] || [[ "$TOKEN" == *'\'* ]]; then
  echo "Token must not contain double quotes or backslashes." >&2
  exit 1
fi

if [[ ! "$IA_ENSO_BACKEND_IP" =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
  echo "IA_ENSO_BACKEND_IP must be an IPv4 address (got: ${IA_ENSO_BACKEND_IP})" >&2
  exit 1
fi

# mode: full = websocket + bearer; bearer_only = bearer + map_hash (duplicate websocket elsewhere)
write_maps_file() {
  local path="$1"
  local mode="$2"
  {
    cat <<'HASHOF'
map_hash_bucket_size 256;
HASHOF
    if [[ "$mode" == "full" ]]; then
      cat <<'MAPEOF'
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
MAPEOF
    fi
    cat <<MAPEOF
map \$http_authorization \$ia_enso_ollama_authorized {
    default 0;
    "Bearer ${TOKEN}" 1;
}
MAPEOF
  } >"$path"
}

TMP_DIR="$(mktemp -d)"
cleanup() {
  rm -rf "$TMP_DIR"
}
trap cleanup EXIT

try_install() {
  local mode="$1"
  write_maps_file "${TMP_DIR}/ia-enso-http-maps.conf" "$mode"
  sed "s/__IA_ENSO_BACKEND_IP__/${IA_ENSO_BACKEND_IP}/g" "${SCRIPT_DIR}/sites/ia.enso.4nkweb.com.conf" >"${TMP_DIR}/ia.enso.4nkweb.com.conf"
  scp_copy "$IA_ENSO_SSH_KEY" "${TMP_DIR}/ia-enso-http-maps.conf" "$IA_ENSO_PROXY_USER" "$IA_ENSO_PROXY_HOST" "/tmp/ia-enso-http-maps.conf"
  scp_copy "$IA_ENSO_SSH_KEY" "${TMP_DIR}/ia.enso.4nkweb.com.conf" "$IA_ENSO_PROXY_USER" "$IA_ENSO_PROXY_HOST" "/tmp/ia.enso.4nkweb.com.conf"
  ssh_run "$IA_ENSO_SSH_KEY" "$IA_ENSO_PROXY_USER" "$IA_ENSO_PROXY_HOST" bash <<'REMOTE'
set -euo pipefail
sudo install -d -m 0755 /etc/nginx/conf.d
sudo install -m 0644 /tmp/ia-enso-http-maps.conf /etc/nginx/conf.d/ia-enso-http-maps.conf
sudo install -m 0644 /tmp/ia.enso.4nkweb.com.conf /etc/nginx/sites-available/ia.enso.4nkweb.com.conf
sudo ln -sf /etc/nginx/sites-available/ia.enso.4nkweb.com.conf /etc/nginx/sites-enabled/ia.enso.4nkweb.com.conf
rm -f /tmp/ia-enso-http-maps.conf /tmp/ia.enso.4nkweb.com.conf
if ! grep -q 'include /etc/nginx/conf.d/\*\.conf;' /etc/nginx/nginx.conf; then
  echo "ERROR: /etc/nginx/nginx.conf must include conf.d inside http { }." >&2
  echo "Add: include /etc/nginx/conf.d/*.conf;" >&2
  exit 1
fi
sudo nginx -t
sudo systemctl reload nginx
echo "nginx reload OK"
REMOTE
}

echo "Deploying ia.enso upstreams to ${IA_ENSO_BACKEND_IP} (Ollama :11434, AnythingLLM :3001)."

if ! try_install full; then
  echo "Retrying with Bearer map only (websocket map likely already defined on proxy)..."
  if ! try_install bearer_only; then
    echo "Deploy failed (SSH, sudo, nginx -t, or missing include /etc/nginx/conf.d/*.conf)." >&2
    echo "Re-run from a host with SSH access (LAN: DEPLOY_SSH_PROXY_HOST=); set IA_ENSO_OLLAMA_BEARER_TOKEN to reuse secret." >&2
    exit 1
  fi
fi

echo "Done. Bearer required on /ollama/. Cursor base: https://ia.enso.4nkweb.com/ollama/v1 — API key = token above (if generated) or IA_ENSO_OLLAMA_BEARER_TOKEN."
```
Deleted file (−13 lines):

@@ -1,13 +0,0 @@

```nginx
# Bearer gate for /ollama/ (matches default site: if ($ia_enso_ollama_authorized = 0) { return 401; }).
# Install inside `http { ... }` before server blocks that use $ia_enso_ollama_authorized:
#   include /etc/nginx/http-maps/ia-enso-ollama-bearer.map.conf;
#
# Copy without the .example suffix; set secret (ASCII, no double quotes in value).
# Cursor: OpenAI base .../ollama/v1 and API key = same secret (no "Bearer " in field).

map_hash_bucket_size 256;

map $http_authorization $ia_enso_ollama_authorized {
    default 0;
    "Bearer CHANGE_ME_TO_LONG_RANDOM_SECRET" 1;
}
```
Deleted file (−7 lines):

@@ -1,7 +0,0 @@

```nginx
# Place inside `http { ... }` on the proxy (once per nginx instance), e.g.:
#   include /etc/nginx/http-maps/websocket-connection.map.conf;

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
```
Deleted file (−96 lines):

@@ -1,96 +0,0 @@

```nginx
# ia.enso.4nkweb.com — reverse proxy to LAN host (Ollama + AnythingLLM).
#
# Public HTTPS URLs (after TLS + nginx reload):
#   AnythingLLM UI:     https://ia.enso.4nkweb.com/anythingllm/
#   Ollama OpenAI API:  https://ia.enso.4nkweb.com/ollama/v1/ (e.g. .../v1/models, .../v1/chat/completions)
#   Ollama native API:  https://ia.enso.4nkweb.com/ollama/api/tags (and other /api/* paths)
#   /ollama/* requires Authorization: Bearer <secret> at nginx (map in conf.d); secret not forwarded to Ollama.
#   Cursor base URL: https://ia.enso.4nkweb.com/ollama/v1 — API key field = same secret (no "Bearer " prefix).
#
# Prerequisites on the proxy host:
#   - TLS certificate for ia.enso.4nkweb.com (e.g. certbot).
#   - HTTP map $ia_enso_ollama_authorized (see deploy script / http-maps/ia-enso-ollama-bearer.map.conf.example).
#
# Upstream backend: replaced at deploy time (default 192.168.1.164). Manual install: replace __IA_ENSO_BACKEND_IP__.

upstream ia_enso_ollama {
    server __IA_ENSO_BACKEND_IP__:11434;
    keepalive 8;
}

upstream ia_enso_anythingllm {
    server __IA_ENSO_BACKEND_IP__:3001;
    keepalive 8;
}

server {
    listen 80;
    server_name ia.enso.4nkweb.com;

    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl http2;
    server_name ia.enso.4nkweb.com;

    ssl_certificate     /etc/letsencrypt/live/ia.enso.4nkweb.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ia.enso.4nkweb.com/privkey.pem;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;

    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;

    client_max_body_size 100M;

    # Ollama: nginx Bearer gate (map $ia_enso_ollama_authorized); Authorization cleared upstream.
    location /ollama/ {
        if ($ia_enso_ollama_authorized = 0) {
            return 401;
        }

        proxy_pass http://ia_enso_ollama/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Connection "";

        proxy_set_header Authorization "";

        proxy_buffering off;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }

    # AnythingLLM UI + API (application login). Subpath stripped when forwarding.
    location /anythingllm/ {
        proxy_pass http://ia_enso_anythingllm/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Prefix /anythingllm;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }

    location = /anythingllm {
        return 301 https://$host/anythingllm/;
    }
}
```
Deleted file (−15 lines):

@@ -1,15 +0,0 @@

```nginx
# Temporary: HTTP only for initial Let's Encrypt webroot challenge.
# Replace with ia.enso.4nkweb.com.conf after cert exists under live/ia.enso.4nkweb.com/.

server {
    listen 80;
    server_name ia.enso.4nkweb.com;

    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    location / {
        return 301 https://$host$request_uri;
    }
}
```
docs/API/README.md (new file, +21)

@@ -0,0 +1,21 @@
# API reference — `smart_ide` services

Documentation for the **HTTP APIs** exposed by the services under [`services/`](../../services/). Each service normally listens on **`127.0.0.1`**; ports and environment variables are recapped in each page.

| Service | Auth | Default port | Page |
|---------|------|--------------|------|
| **repos-devtools-server** | `Authorization: Bearer` | `37140` | [repos-devtools-server.md](./repos-devtools-server.md) |
| **langextract-api** | Optional Bearer | `37141` | [langextract-api.md](./langextract-api.md) |
| **claw-harness-api** (proxy) | Bearer | `37142` | [claw-harness-proxy.md](./claw-harness-proxy.md) |
| **agent-regex-search-api** | Bearer (except `/health`) | `37143` | [agent-regex-search-api.md](./agent-regex-search-api.md) |
| **local-office** | `X-API-Key` | `8000` (example run) | [local-office.md](./local-office.md) |
| **ia-dev-gateway** | Bearer | `37144` (specification) | [ia-dev-gateway.md](./ia-dev-gateway.md) |
| **smart_ide-orchestrator** | Bearer (specification) | `37145` (specification) | [orchestrator.md](./orchestrator.md) |

**OpenAPI**: FastAPI serves an interactive spec for **langextract-api** (`/docs`) and **local-office** (`/docs`) once the service is running.

**claw-code upstream**: the actual binary / HTTP server lives outside this repository; only the **proxy** documented here is part of the monorepo.

**Minimal implementation**: **ia-dev-gateway** and **smart_ide-orchestrator** have a Node/TS server in the monorepo (`npm run build` in each folder). The `ia_dev` runner hookup and the orchestrator's full HTTP proxy remain to be extended.

See also: [services.md](../services.md), [system-architecture.md](../system-architecture.md), and the README in each folder under `services/`.
docs/API/agent-regex-search-api.md (new file, +68)

@@ -0,0 +1,68 @@
# API — agent-regex-search-api

Node service: **regex** search over files via **ripgrep** (`rg`), results as JSON. Scope confined to `REGEX_SEARCH_ROOT`.

- **Code**: [`services/agent-regex-search-api/`](../../services/agent-regex-search-api/)
- **Bind**: `REGEX_SEARCH_HOST` (default `127.0.0.1`)
- **Port**: `REGEX_SEARCH_PORT` (default `37143`)
- **Prerequisite**: `rg` on the `PATH` (otherwise `/search` answers `503`)

## Authentication

```http
Authorization: Bearer <REGEX_SEARCH_TOKEN>
```

`REGEX_SEARCH_TOKEN` is required at startup. **Exception**: `GET /health` does not require the Bearer.
## Endpoints

### `GET /health` or `GET /health/`

**Response `200`**

```json
{
  "status": "ok",
  "root": "<resolved REGEX_SEARCH_ROOT>"
}
```
### `POST /search`

**JSON body**

| Field | Required | Description |
|-------|----------|-------------|
| `pattern` | yes | Rust-style regex, passed to ripgrep |
| `subpath` | no | Relative path under the root (no `..`, not absolute) |
| `maxMatches` | no | Result cap (default `500`, max `50000`) |
| `timeoutMs` | no | `rg` execution timeout in ms (default `60000`, max `300000`) |
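A request body using those fields might look like this (the values are illustrative, not from the repository):

```json
{
  "pattern": "TODO\\s*:",
  "subpath": "services",
  "maxMatches": 100,
  "timeoutMs": 10000
}
```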
**Response `200`** (ripgrep success, including "no match", exit code `1`)

```json
{
  "root": "string",
  "target": "string",
  "matches": [{ "path": "string", "lineNumber": number, "line": "string" }],
  "truncated": boolean,
  "exitCode": number
}
```

**Other responses**

- `400`: invalid body, missing `pattern`, or ripgrep error code `2` (regex / IO) — may include `error`, `matches`, `truncated`, `exitCode`
- `401` / no useful response: Bearer missing or wrong on `/search`
- `404`: unhandled path
- `503`: `rg` not found (`exitCode` 127 in the implementation) — `{ "error", "matches": [], "truncated": false }`
## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `REGEX_SEARCH_TOKEN` | yes | Bearer secret |
| `REGEX_SEARCH_ROOT` | no | Base directory for searches (default `/home/ncantu/code`) |
| `REGEX_SEARCH_HOST` | no | Bind address |
| `REGEX_SEARCH_PORT` | no | Port |
44
docs/API/claw-harness-proxy.md
Normal file
@@ -0,0 +1,44 @@

# API — claw-harness-api (HTTP proxy)

The [`services/claw-harness-api/`](../../services/claw-harness-api/) directory documents the **claw-code** integration (upstream, outside the monorepo). This file describes only the **Node proxy** under `services/claw-harness-api/proxy/`, which aligns security and bind settings with the other local services.

- **Bind**: `CLAW_PROXY_HOST` (default `127.0.0.1`)
- **Port**: `CLAW_PROXY_PORT` (default `37142`)
- **Upstream**: `CLAW_UPSTREAM_URL` — base URL of the claw-code HTTP server (e.g. `http://127.0.0.1:37143`)

## Authentication

On the proxy, requests (other than `/health`) must include:

```http
Authorization: Bearer <CLAW_PROXY_TOKEN>
```

`CLAW_PROXY_TOKEN` is required at startup. Client request headers (except hop-by-hop headers and `Host`) are copied to the upstream; the upstream may enforce its own auth policy.

## Endpoints (proxy side)

### `GET /health` and `GET /health/`

**Response `200`**

```json
{ "status": "ok" }
```

No Bearer token required.

### Any other method and path (authenticated)

After validating the Bearer token, the proxy builds the target URL: `CLAW_UPSTREAM_URL` plus the incoming request's path and query string, then **forwards** the method, body, and (filtered) headers to the upstream. The response body and status code come from the upstream (or `502` on connection error).

There is **no** fixed route catalog in the monorepo: the effective paths depend on the deployed claw-code HTTP server.

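The header filtering described above can be sketched as a small pre-filter (illustrative only; the hop-by-hop list follows RFC 7230 §6.1, and the function name is ours, not the proxy's actual code):

```python
# Hop-by-hop headers must not be forwarded by a proxy (RFC 7230 §6.1);
# `Host` is also dropped so the upstream sees its own host value.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade", "host",
}

def forwardable_headers(incoming: dict) -> dict:
    """Return the client headers that may be copied to the upstream request."""
    return {k: v for k, v in incoming.items() if k.lower() not in HOP_BY_HOP}
```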
## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `CLAW_PROXY_TOKEN` | yes | Bearer secret for the proxy's clients |
| `CLAW_UPSTREAM_URL` | yes | Base URL of the claw HTTP server |
| `CLAW_PROXY_HOST` | no | Bind address |
| `CLAW_PROXY_PORT` | no | Proxy listening port |

58
docs/API/ia-dev-gateway.md
Normal file
@@ -0,0 +1,58 @@

# API — ia-dev-gateway (specification)

Planned service under [`services/ia-dev-gateway/`](../../services/ia-dev-gateway/). Auth: **`Authorization: Bearer`** with `IA_DEV_GATEWAY_TOKEN`. Default bind **`127.0.0.1`**, default port **`37144`**.

## `GET /health`

**200**: `{ "status": "ok" }` (no Bearer token).

## `GET /v1/agents`

Lists the agents exposed by the `ia_dev` registry (definition files under `.cursor/agents/` or the equivalent documented in the fork).

**200**: `{ "agents": [ { "id", "name", "summary", "triggerCommands": string[] } ] }`

**401**: missing or invalid Bearer token.

## `GET /v1/agents/{id}`

**200**: extended descriptor: `id`, `name`, `role`, `inputs`, `outputs`, `rights`, `dependencies`, `scripts`, `risk`, `compatibleEnvs`.

**404**: unknown agent.

## `POST /v1/runs`

Starts a run (an agent, a deploy script, or a resolved intent).

**JSON body**

| Field | Required | Description |
|-------|----------|-------------|
| `agentId` | yes* | Agent identifier (`*` required if no `scriptPath`; see schema evolution) |
| `projectId` | yes | E.g. `lecoffreio`; the `projects/<projectId>/` directory under `ia_dev` |
| `intent` | yes | Intent label (`ask`, `fix`, `deploy`, …) |
| `payload` | no | Opaque JSON object for the runner |
| `env` | no | `test` \| `pprod` \| `prod` — controls which scripts are allowed |

**200**: `{ "runId": "string", "status": "queued" | "running" }`

**403**: permission denied for `env` or `projectId`.

**422**: invalid body.

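A minimal client-side check of the body above might look like this (illustrative only; the required-field and `env` rules come from the table, everything else is an assumption about a future implementation):

```python
ALLOWED_ENVS = {"test", "pprod", "prod"}

def validate_run_request(body: dict) -> list:
    """Return the list of problems that would make the gateway answer 422."""
    problems = []
    # agentId is required unless a scriptPath is given (see the schema evolution note).
    if not body.get("agentId") and not body.get("scriptPath"):
        problems.append("agentId (or scriptPath) is required")
    for field in ("projectId", "intent"):
        if not body.get(field):
            problems.append(f"{field} is required")
    env = body.get("env")
    if env is not None and env not in ALLOWED_ENVS:
        problems.append(f"env must be one of {sorted(ALLOWED_ENVS)}")
    return problems
```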
## `GET /v1/runs/{runId}`

**200**: `{ "runId", "status", "startedAt", "finishedAt"?, "exitCode"?, "summary"?, "error"? }`

**404**: unknown run.

## `GET /v1/runs/{runId}/events`

**Server-Sent Events** (recommended for v1): `data: {JSON}\n\n` lines, with event types aligned with [system-architecture.md](../system-architecture.md) (`started`, `tool_selected`, `script_started`, `model_called`, `waiting_validation`, `completed`, `failed`, `rolled_back`, `artifact_created`).

**401** / **404** as appropriate.

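The `data: {JSON}` framing above can be consumed with a small parser (a sketch of the generic SSE data-line rule, not the gateway's actual client):

```python
import json

def parse_sse_events(stream: str) -> list:
    """Split an SSE body into events and decode each `data:` payload as JSON."""
    events = []
    for block in stream.split("\n\n"):  # events are separated by a blank line
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(json.loads(line[len("data:"):].strip()))
    return events
```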
## Notes

- The actual execution details (process spawn, Docker, SSH) stay in the **runner** wired to `IA_DEV_ROOT`; this API is the **contract** for the orchestrator and the UIs.
- Versioning: `/v1/` prefix for backward-compatible evolutions.

87
docs/API/langextract-api.md
Normal file
@@ -0,0 +1,87 @@

# API — langextract-api

FastAPI service: wraps [LangExtract](https://github.com/google/langextract) for structured extractions from text.

- **Code**: [`services/langextract-api/`](../../services/langextract-api/)
- **Bind**: `LANGEXTRACT_API_HOST` (default `127.0.0.1`)
- **Port**: `LANGEXTRACT_API_PORT` (default `37141`)
- **OpenAPI**: `GET http://<host>:<port>/docs` once the service is running

## Authentication

If `LANGEXTRACT_SERVICE_TOKEN` is set (non-empty), all routes **except** those without the explicit auth dependency must send:

```http
Authorization: Bearer <LANGEXTRACT_SERVICE_TOKEN>
```

Currently **`/health`** does not require the Bearer token; **`/extract`** requires it when the service token is configured.

## Endpoints

### `GET /health`

**Response `200`**

```json
{ "status": "ok" }
```

### `POST /extract`

Runs a LangExtract extraction.

**JSON body** (Pydantic model `ExtractRequest`)

| Field | Required | Description |
|-------|----------|-------------|
| `text` | yes | Source text |
| `prompt_description` | yes | Extraction instruction |
| `examples` | yes | List of examples (see below) |
| `model_id` | yes | Model identifier (e.g. an Ollama tag) |
| `model_url` | no | Model server URL (e.g. Ollama `http://127.0.0.1:11434`) |
| `extraction_passes` | no | Extraction passes |
| `max_workers` | no | Parallelism |
| `max_char_buffer` | no | Character buffer size |
| `api_key` | no | Cloud key (otherwise `LANGEXTRACT_API_KEY` from the environment) |
| `fence_output` | no | LangExtract option |
| `use_schema_constraints` | no | LangExtract option |

**`examples` element**

Each entry: `{ "text": "...", "extractions": [ { "extraction_class", "extraction_text", "attributes": {} } ] }`.

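Putting the field table and the `examples` shape together, a request body might look like this (a hypothetical payload for illustration; only the field names come from `ExtractRequest`, the model tag and sample values are made up):

```python
# Hypothetical /extract payload; only the field names are taken from the API doc.
payload = {
    "text": "Alice moved to Berlin in 2021.",
    "prompt_description": "Extract people and the city they moved to.",
    "examples": [
        {
            "text": "Bob moved to Paris.",
            "extractions": [
                {
                    "extraction_class": "person",
                    "extraction_text": "Bob",
                    "attributes": {"city": "Paris"},
                }
            ],
        }
    ],
    "model_id": "llama3",                   # e.g. an Ollama tag (assumed)
    "model_url": "http://127.0.0.1:11434",  # local Ollama server
}
```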
**Response `200`**

```json
{
  "documents": [
    {
      "extractions": [
        {
          "extraction_class": "...",
          "extraction_text": "...",
          "attributes": {},
          "char_interval": { "start": 0, "end": 0 }
        }
      ]
    }
  ]
}
```

`char_interval` is present when the engine provides it.

**Errors**

- `400`: invalid body or LangExtract exception (`detail` is a plain-text message)
- `401`: Bearer token expected but missing or incorrect (when a service token is configured)

## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `LANGEXTRACT_SERVICE_TOKEN` | no | If set, protects `/extract` |
| `LANGEXTRACT_API_HOST` | no | Bind address |
| `LANGEXTRACT_API_PORT` | no | Port |
| `LANGEXTRACT_API_KEY` | no | Default key for cloud models when the client does not send `api_key` |

115
docs/API/local-office.md
Normal file
@@ -0,0 +1,115 @@

# API — local-office

FastAPI service: Office file management (upload, list, metadata, download, commands on **docx**, deletion). Auth by **API key**, not Bearer.

- **Code**: [`services/local-office/`](../../services/local-office/)
- **Functional doc**: [features/local-office.md](../features/local-office.md)
- **OpenAPI**: `GET http://<host>:<port>/docs` (e.g. port `8000` in a local run)

## Authentication

All routes documented here require:

```http
X-API-Key: <one of the keys listed in API_KEYS>
```

Keys are defined server-side (`API_KEYS` variable, comma-separated list). Each document is tied to the key that created it; accessing another owner's resources returns **404**.

**Rate limiting**: requests are capped per key (slowapi, see `RATE_LIMIT_PER_MINUTE`).

## Route prefix

The router is mounted under **`/documents`** (no `/api` prefix).

## Endpoints

### `POST /documents`

**Multipart** upload of an Office file.

- **Headers**: `X-API-Key`; for the file part, `Content-Type` must be one of the allowed types:
  - `application/vnd.openxmlformats-officedocument.wordprocessingml.document` (docx)
  - `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet` (xlsx)
  - `application/vnd.openxmlformats-officedocument.presentationml.presentation` (pptx)

**Response `201`**

```json
{
  "document_id": "string",
  "name": "string",
  "mime_type": "string",
  "size": number
}
```

**Errors**: `400` unsupported type; `413` file too large (`MAX_UPLOAD_BYTES`).

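The upload gate above amounts to an allowlist plus a size check (a sketch; the constant and function names are ours, and the order in which the real service checks type vs. size is not specified here):

```python
# MIME types the upload endpoint accepts, per the list above.
ALLOWED_OFFICE_MIME = {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",   # docx
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",         # xlsx
    "application/vnd.openxmlformats-officedocument.presentationml.presentation", # pptx
}

def upload_status(content_type: str, size: int, max_upload_bytes: int) -> int:
    """Mirror the documented status codes for an upload attempt."""
    if content_type not in ALLOWED_OFFICE_MIME:
        return 400  # unsupported type
    if size > max_upload_bytes:
        return 413  # file too large
    return 201
```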
### `GET /documents`

Lists the key's documents.

**Response `200`**: array of metadata objects (structure defined by the SQLite storage; typical fields: id, name, MIME type, size, dates).

### `GET /documents/{document_id}`

Metadata for one document (owner = current key).

**Errors**: `404` if absent or not owned by this key.

### `GET /documents/{document_id}/file`

Downloads the binary file; `Content-Type` and filename follow the metadata.

### `POST /documents/{document_id}/commands`

Applies a list of commands to the content. **Implemented for docx only**; xlsx/pptx return **400** ("not implemented yet").

**JSON body**

```json
{
  "commands": [
    {
      "type": "replaceText",
      "search": "text to find",
      "replace": "replacement"
    },
    {
      "type": "insertParagraph",
      "text": "new paragraph",
      "position": "end"
    }
  ]
}
```

| `type` | Fields | Description |
|--------|--------|-------------|
| `replaceText` | `search` (non-empty), `replace` | Replaces the first occurrence in paragraphs / tables |
| `insertParagraph` | `text`, optional `position` `end` (default) or `start` | Inserts a paragraph |

**Response `200`**: `{ "document_id", "size" }` (size after the write).

**Errors**: `400` invalid command or MIME type; `404` document absent or owned by another key.

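The two command types behave roughly like this on a flat list of paragraph strings (a simplified sketch; the real service operates on the docx XML, including tables):

```python
def apply_commands(paragraphs: list, commands: list) -> list:
    """Apply replaceText / insertParagraph commands to a list of paragraph strings."""
    out = list(paragraphs)
    for cmd in commands:
        if cmd["type"] == "replaceText":
            if not cmd["search"]:
                raise ValueError("search must be non-empty")  # service: 400
            # Only the first occurrence across the document is replaced.
            for i, para in enumerate(out):
                if cmd["search"] in para:
                    out[i] = para.replace(cmd["search"], cmd["replace"], 1)
                    break
        elif cmd["type"] == "insertParagraph":
            if cmd.get("position", "end") == "start":
                out.insert(0, cmd["text"])
            else:
                out.append(cmd["text"])
        else:
            raise ValueError(f"unknown command type: {cmd['type']}")  # service: 400
    return out
```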
### `DELETE /documents/{document_id}`

Deletes the metadata and the file.

**Response `204`**, no body.

**Errors**: `404` if absent or not owned by this key.

## Environment variables (recap)

| Variable | Role |
|----------|------|
| `API_KEYS` | List of allowed keys (required in production) |
| `STORAGE_PATH` | Files on disk |
| `DATABASE_PATH` | SQLite metadata |
| `MAX_UPLOAD_BYTES` | Max upload size |
| `RATE_LIMIT_PER_MINUTE` | Request cap per minute per key |

See [`services/local-office/.env.example`](../../services/local-office/.env.example).

53
docs/API/orchestrator.md
Normal file
@@ -0,0 +1,53 @@

# API — smart_ide-orchestrator (specification)

Planned service: routes intents to Ollama, AnythingLLM, the micro-services, and [ia-dev-gateway](./ia-dev-gateway.md). **Bearer**: `ORCHESTRATOR_TOKEN`. Default **`127.0.0.1:37145`**.

## `GET /health`

**200**: `{ "status": "ok" }` — no authentication.

## `POST /v1/route`

Resolves an intent without necessarily executing it (when `dryRun: true`).

**JSON body**

| Field | Required | Description |
|-------|----------|-------------|
| `intent` | yes | Stable identifier (`code.complete`, `rag.query`, `agent.run`, …) |
| `context` | no | Free-form object (open files, selection, etc.) |
| `projectId` | no | `ia_dev` project / workspace |
| `env` | no | `test` \| `pprod` \| `prod` |
| `dryRun` | no | If `true`, returns the resolution only |

**200**:

```json
{
  "resolved": true,
  "target": "ollama | anythingllm | service | ia_dev",
  "action": "string",
  "upstream": { "method": "POST", "url": "relative or absolute", "headersHint": [] }
}
```

**200** with `resolved: false` and a `reason` when the intent is unknown or rejected by policy.

**401**: invalid Bearer token.

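The resolution step can be pictured as a lookup in a static intent table (purely illustrative; the intent names come from the table above, but the target mapping is an assumption, not the orchestrator's actual policy):

```python
# Hypothetical intent → (target, action) mapping, for illustration only.
ROUTES = {
    "code.complete": ("ollama", "generate"),
    "rag.query": ("anythingllm", "workspace-chat"),
    "agent.run": ("ia_dev", "run"),
}

def route(intent: str) -> dict:
    """Resolve an intent the way POST /v1/route describes, without executing it."""
    if intent not in ROUTES:
        return {"resolved": False, "reason": f"unknown intent: {intent}"}
    target, action = ROUTES[intent]
    return {"resolved": True, "target": target, "action": action}
```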
## `POST /v1/execute`

Executes the resolution (or accepts a body identical to `/v1/route` with `dryRun: false`). May chain HTTP calls to the internal services. Pass-through responses depend on the target.

**422**: inconsistent parameters.

## `GET /v1/timeline`

**200**: `{ "items": [ { "at", "type", "summary", "runId"?, "projectId"? } ] }` — a lightweight aggregate for the UI (later implementation wired to logs / DB).

**401** if protected like the other business routes.

## Notes

- CORS: configure at the reverse proxy for the **web front end** only; no public exposure without TLS.
- Versioning: `/v1/` prefix.

78
docs/API/repos-devtools-server.md
Normal file
@@ -0,0 +1,78 @@

# API — repos-devtools-server

Node service (raw HTTP): restricted Git operations under a configurable root.

- **Code**: [`services/repos-devtools-server/`](../../services/repos-devtools-server/)
- **Bind**: `REPOS_DEVTOOLS_HOST` (default `127.0.0.1`)
- **Port**: `REPOS_DEVTOOLS_PORT` (default `37140`)

## Authentication

All routes require:

```http
Authorization: Bearer <REPOS_DEVTOOLS_TOKEN>
```

`REPOS_DEVTOOLS_TOKEN` is required (non-empty) at startup.

## Endpoints

### `POST /repos-clone`

Clones a repository under `REPOS_DEVTOOLS_ROOT`.

**JSON body**

| Field | Required | Description |
|-------|----------|-------------|
| `url` | yes | Git URL of the repository to clone |
| `branch` | no | Branch to clone (default `test`), cloned with `--single-branch` |

**Responses**

- `200`: `{ "ok": true, "name", "path", "branch", "url", "fourNkAiIgnoreTemplateWrote": boolean }` — when the clone succeeds; if the repository has no `.4nkaiignore`, a template may be copied in (`fourNkAiIgnoreTemplateWrote`).
- `409`: target directory already exists — `{ "error", "name", "path" }`
- `500`: `git clone` failure, or post-clone failure while writing the `.4nkaiignore` template

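Deriving the target directory (and hence the `409` condition) from the URL might look like this (a sketch; the service's exact naming rule is not specified here, so the last path segment minus `.git` is an assumption):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def clone_target(root: str, url: str) -> tuple:
    """Derive (name, path) for a clone under the devtools root from a Git URL."""
    # "https://git.example.com/org/repo.git" → "repo"
    name = PurePosixPath(urlparse(url).path).name
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name, f"{root.rstrip('/')}/{name}"
```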
### `GET /repos-list`

Lists the subdirectories of `REPOS_DEVTOOLS_ROOT` that are Git repositories.

**Response `200`**

```json
{
  "repos": [{ "name": "string", "path": "string" }],
  "codeRoot": "string"
}
```

### `POST /repos-load`

Checks that a named directory exists under the root and is a Git repository.

**JSON body**

| Field | Required | Description |
|-------|----------|-------------|
| `name` | yes | Directory name (under `REPOS_DEVTOOLS_ROOT`) |

**Responses**

- `200`: `{ "ok": true, "name", "path" }`
- `404`: directory absent
- `400`: directory present but not a Git repository

### Other paths

`404` JSON `{ "error": "Not found" }`.

## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `REPOS_DEVTOOLS_TOKEN` | yes | Bearer secret |
| `REPOS_DEVTOOLS_ROOT` | no | Root for clones (default `/home/ncantu/code`) |
| `REPOS_DEVTOOLS_HOST` | no | Listen address |
| `REPOS_DEVTOOLS_PORT` | no | Port |

@@ -1,31 +1,62 @@
# Technical documentation — smart_ide

Index of the documents at the root of `docs/`. Detailed **features** live in [`features/`](./features/).

## Architecture and deployment

| Document | Content |
|----------|---------|
| [platform-target.md](./platform-target.md) | Online-platform vision, 3 envs, single AI machine vs SSH, SSO, optional browser |
| [implementation-rollout.md](./implementation-rollout.md) | Platform plan rollout: docs + minimal code, next steps |
| [system-architecture.md](./system-architecture.md) | Layers, monorepo, directory map, gateway, OpenShell, micro-services |
| [core-ide.md](./core-ide.md) | Lapce application base: `core_ide/`, upstream clone, build |
| [deployment-target.md](./deployment-target.md) | Linux client + SSH, single-AI-machine variant, AI-stack and repos server |
| [infrastructure.md](./infrastructure.md) | SSH, host access, pointers to the scripts |
| [services.md](./services.md) | Ollama, AnythingLLM, **Local Office**, HTTP micro-services under `services/` |

## Service API reference (`API/`)

| Document | Content |
|----------|---------|
| [API/README.md](./API/README.md) | Index: auth, ports, links to each service |
| [API/repos-devtools-server.md](./API/repos-devtools-server.md) | Clone / list / load Git repositories |
| [API/langextract-api.md](./API/langextract-api.md) | Structured extraction (LangExtract) |
| [API/claw-harness-proxy.md](./API/claw-harness-proxy.md) | HTTP proxy to the claw-code server |
| [API/agent-regex-search-api.md](./API/agent-regex-search-api.md) | Regex file search (ripgrep) |
| [API/local-office.md](./API/local-office.md) | Office documents (upload, docx commands) |
| [API/ia-dev-gateway.md](./API/ia-dev-gateway.md) | `ia_dev` gateway — agents, runs, SSE (specification) |
| [API/orchestrator.md](./API/orchestrator.md) | Intent orchestrator — routing (specification) |

## Workspaces and IDE

| Document | Content |
|----------|---------|
| [anythingllm-workspaces.md](./anythingllm-workspaces.md) | One AnythingLLM workspace per project, synchronization |
| [ux-navigation-model.md](./ux-navigation-model.md) | Intents, search, expert mode |

## Repository integration

| Document | Content |
|----------|---------|
| [ia_dev-submodule.md](./ia_dev-submodule.md) | `ia_dev` submodule (4NK forge), agents and `projects/<id>/` |

## Features (`features/`)

| Document | Content |
|----------|---------|
| [features/local-office.md](./features/local-office.md) | **Local Office** — Office REST API in `services/local-office/` |
| [features/langextract-api.md](./features/langextract-api.md) | Local LangExtract API |
| [features/claw-harness-api.md](./features/claw-harness-api.md) | claw-code harness, proxy |
| [features/agent-regex-search-api.md](./features/agent-regex-search-api.md) | Code regex search (ripgrep) |
| [features/anythingllm-pull-sync-after-pull.md](./features/anythingllm-pull-sync-after-pull.md) | AnythingLLM sync after pull |
| [features/initial-rag-sync-4nkaiignore.md](./features/initial-rag-sync-4nkaiignore.md) | Initial RAG sync and `.4nkaiignore` |
| [features/ia-dev-service.md](./features/ia-dev-service.md) | `ia-dev-gateway` service, `ia_dev` fork, migration |
| [features/orchestrator-api.md](./features/orchestrator-api.md) | Orchestrator HTTP contract (Ollama, ALLM, services) |
| [features/lapce-porting-roadmap.md](./features/lapce-porting-roadmap.md) | Phases for porting the AnythingLLM extension to Lapce |
| [features/sso-docv-enso.md](./features/sso-docv-enso.md) | OIDC front ↔ docv (Enso) |
| [features/browser-automation-criteria.md](./features/browser-automation-criteria.md) | Criteria for the optional browser service |

## Tree outside `docs/`

- **Local Office code**: [`../services/local-office/README.md`](../services/local-office/README.md) (operational reference, variables, OpenAPI).
- **Node/Python micro-services**: [`../services/`](../services/) (one README per service).

**Author:** 4NK

**Related external docs**

- AnythingLLM Docker: <https://docs.anythingllm.com/installation-docker/local-docker>
- Ollama: <https://github.com/ollama/ollama/blob/main/docs/linux.md>
- Lapce: <https://lapce.dev/>

49
docs/core-ide.md
Normal file
@@ -0,0 +1,49 @@

# Application base — `core_ide/` (Lapce)

The **`core_ide/`** directory at the root of the `smart_ide` clone contains the **Git clone** of the [Lapce](https://lapce.dev/) editor (public upstream [lapce/lapce](https://github.com/lapce/lapce), Apache-2.0). It is the target **application base** for the IDE: builds, extensions, and 4NK customizations build on this tree.

- The contents of **`core_ide/`** are **excluded from the parent repository's Git index** (root `.gitignore`) to keep the monorepo small; the tree still exists locally or on the build machine.
- This document is the **versioned reference** for the clone's location and update procedure (the upstream Lapce repository ships its own `README.md` at the clone root).

## Updating the upstream sources

Without creating a 4NK product repository on GitHub: keep `origin` pointing at the public Lapce URL (or add an `upstream` remote if needed), then pull the required branches:

```bash
cd core_ide
git fetch origin
git merge origin/master
```

(Replace `master` with the upstream default branch if it changes.)

### Full history (shallow clone)

If the clone was made with `--depth 1`:

```bash
cd core_ide
git fetch --unshallow
```

### Build

Follow the upstream Lapce documentation (Rust workspace at the root of `core_ide/`). The resulting binary feeds the **editor-shell** layer described in [system-architecture.md](./system-architecture.md).

### First checkout

```bash
cd /path/to/smart_ide
git clone https://github.com/lapce/lapce.git core_ide
```

(Or the internal URL / remote chosen by the team; SSH if configured.)

### Migration from the old location

If a Lapce clone existed under `forks/lapce/`, rename it once:

```bash
mv forks/lapce core_ide
rmdir forks 2>/dev/null || true
```

@@ -1,13 +1,19 @@
# First deployment target — Linux client + remote server (SSH)

## Variant: single AI machine

In this variant, **Ollama**, **AnythingLLM**, and the `smart_ide` **services** run on the **same host**. The `127.0.0.1` URLs for inference and RAG are **local to that machine**; Lapce and/or the **web front end**, on the same machine or behind the same reverse proxy, consume them **without an SSH tunnel**. The **three environments** (test, pprod, prod) stay separated by configuration and DNS — see [platform-target.md](./platform-target.md). The **orchestrator** and **`ia-dev-gateway`** can coexist on this host.

This variant does **not replace** the SSH client/server model: it complements it for "all-in-one" workstations or farms.

## Model

The **first deployment target** described below is not an all-in-one workstation on the same machine as the AI stack.

| Role | Where it runs | Typical content |
|------|---------------|-----------------|
| **Client** | The user's **Linux** machine (local workstation) | Editing / UX shell (e.g. Lapce), client-side orchestrator if applicable, persistent or on-demand **SSH** connection |
| **Remote server** | Host reachable over **SSH** (LAN, bastion, or jump host depending on the infrastructure) | **AI technical stack** (Ollama, AnythingLLM in Docker, related services), **repository clones**, execution of **agents** / scripts / OpenShell within the authorized scope; **Local Office** ([`services/local-office/`](../services/local-office/), programmatic Office file API) if deployed |

The user works from a **Linux client**; the **compute**, the **models**, the **RAG memory**, and the **Git sources of truth** live on the **server** (or a server farm behind the same SSH session).

@@ -16,10 +22,13 @@
- The server's "local" URLs (`localhost:11434`, `localhost:3001`, …) are **local to the server**. From the client, access goes through an **SSH tunnel** (`-L`), **ProxyJump**, or explicit configuration (internal hostname, VPN) depending on the network policy.
- The **agent gateway** and the **policy-runtime** (OpenShell) ideally run **where the agents and the repos run** (that is, on the server), unless a documented decision says otherwise.
- The **per-project AnythingLLM workspace** lives **server-side** (container storage or a path mounted on the remote host). The sync pipeline reads the **repositories on the server**.
- **Local Office**: data under `services/local-office/data/` (or paths overridden via `STORAGE_PATH` / `DATABASE_PATH`) on the **host that runs the API**; back it up and protect it like any store of business files.
- The client must have an **SSH identity** authorized on the server (see `add-ssh-key.sh` and [infrastructure.md](./infrastructure.md)).

## Related documentation

- Product vision and envs: [platform-target.md](./platform-target.md)
- LAN / bastion topology: [infrastructure.md](./infrastructure.md)
- Ollama / AnythingLLM / Local Office services on the host that **hosts** the stack: [services.md](./services.md)
- Logical module distribution: [system-architecture.md](./system-architecture.md) (to be read together with this physical split)
- Front / docv SSO: [features/sso-docv-enso.md](./features/sso-docv-enso.md)

32
docs/features/agent-regex-search-api.md
Normal file
@@ -0,0 +1,32 @@
|
||||
# Regex search on code — local API (`services/agent-regex-search-api`)

## Goal

Give local clients (the future Lapce shell, the gateway, agents) an **HTTP API** for running **regular-expression searches** over a controlled directory tree, without depending on the proprietary engine described in the Cursor post [Fast regex search: indexing text for agent tools](https://cursor.com/fr/blog/fast-regex-search).

## What this is not

The Cursor article describes **sparse n-gram** indexes, on-disk files, `mmap`, and so on. **That code is not reproduced here**: Cursor does not publish the engine as open source. The `agent-regex-search-api` service relies on **[ripgrep](https://github.com/BurntSushi/ripgrep)** (`rg`), a standard, fast tool well suited to "agent" workflows that chain many searches.

## Functional scope

| Item | Detail |
|------|--------|
| Code | [services/agent-regex-search-api/README.md](../../services/agent-regex-search-api/README.md) |
| Engine | `rg --json`; prerequisite: the `rg` binary on `PATH` |
| Confinement | `REGEX_SEARCH_ROOT` (default `/home/ncantu/code`); `subpath` must be **relative** only, with no `..` |
| Auth | `REGEX_SEARCH_TOKEN` → `Authorization: Bearer …` on `POST /search` |
| Default port | `37143` |
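
The `subpath` confinement rule above can be sketched as follows — a minimal Python illustration; the function name and error handling are assumptions for explanation, not the service's actual code:

```python
from pathlib import PurePosixPath

def resolve_subpath(root: str, subpath: str) -> str:
    """Reject absolute paths and any `..` component, then join under root."""
    p = PurePosixPath(subpath)
    if p.is_absolute() or ".." in p.parts:
        raise ValueError(f"subpath not allowed: {subpath!r}")
    return str(PurePosixPath(root) / p)

resolve_subpath("/home/ncantu/code", "myrepo/src")   # allowed
# resolve_subpath("/home/ncantu/code", "../etc")     # raises ValueError
# resolve_subpath("/home/ncantu/code", "/etc")       # raises ValueError
```

Whatever the actual implementation, the key property is that every search stays under `REGEX_SEARCH_ROOT`.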

## Threats to keep in mind

- **ReDoS**: a regex can stay expensive up to `timeoutMs`; keep the caps reasonable.
- **Disk reads**: any file `rg` traverses under the target can be read, subject to OS permissions; align `REGEX_SEARCH_ROOT` with the workstation policy.

## Possible evolutions (out of initial scope)

For extremely large monorepos, open-source **indexed** backends (e.g. **Zoekt**, **trigram** / n-gram index families) can complement or replace plain `rg`, reusing the ideas from the Cursor post as **algorithmic references**, not as a supplied implementation.

## Architecture integration

See [system-architecture.md](../system-architecture.md): this service is a **local HTTP micro-service** in the same family as `repos-devtools-server`, meant to be called by the orchestrator or the editor rather than by unauthenticated remote clients.
33 docs/features/anythingllm-pull-sync-after-pull.md Normal file
@@ -0,0 +1,33 @@
# AnythingLLM — synchronisation after `git pull`

## Goal

Trigger an upload to AnythingLLM of the files **modified or added** by a `git pull` (fast-forward or regular merge), with no manual action in the editor.

## Impacts

- Each repository concerned can install a Git **`post-merge`** hook that calls `scripts/anythingllm-pull-sync/sync.mjs`.
- The same exclusions as **`.4nkaiignore`** (plus a few system patterns) apply.
- Deletions and renames are not reflected as deletions on the AnythingLLM side in this version (upload only).
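
As a rough illustration of the hook's principle (the real implementation is the Node script `sync.mjs` with the `ignore` package; the function names and the simplified pattern matching below are assumptions for explanation only), the post-merge flow amounts to: diff `ORIG_HEAD..HEAD`, keep added and modified paths, drop ignored ones:

```python
from fnmatch import fnmatch

def changed_paths(name_status: str) -> list[str]:
    """Keep Added/Modified paths from `git diff --name-status ORIG_HEAD HEAD` output."""
    paths = []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        status, _, path = line.partition("\t")
        if status and status[0] in ("A", "M"):  # skip D (delete), R (rename), …
            paths.append(path)
    return paths

def is_ignored(path: str, patterns: list[str]) -> bool:
    """Very rough .4nkaiignore-style matching (the real script uses the `ignore` npm package)."""
    return any(fnmatch(path, pat) or path.startswith(pat.rstrip("/") + "/")
               for pat in patterns)

diff = "M\tREADME.md\nA\tdocs/new.md\nD\told.txt\n"
to_upload = [p for p in changed_paths(diff) if not is_ignored(p, ["node_modules/", "*.log"])]
# to_upload == ["README.md", "docs/new.md"]
```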

## Changes (smart_ide repository)

- `scripts/anythingllm-pull-sync/`: Node script (ESM), `ignore` dependency, `package.json`, `README.md`.
- `scripts/install-anythingllm-post-merge-hook.sh`: installs the hook in `.git/hooks/post-merge` with the absolute path to `sync.mjs`.

## Per-repository configuration

- Optional **`.anythingllm.json`** file at the root: `{ "workspaceSlug": "<slug>" }`.
- Or the **`ANYTHINGLLM_WORKSPACE_SLUG`** environment variable (precedence documented in the script's README).

## Deployment modalities

1. On the development machine: `npm install` in `scripts/anythingllm-pull-sync`.
2. Create `~/.config/4nk/anythingllm-sync.env` with `ANYTHINGLLM_BASE_URL` and `ANYTHINGLLM_API_KEY` (do not commit the key).
3. Run `install-anythingllm-post-merge-hook.sh <repo-path>` for each repository to synchronise.
4. Make sure AnythingLLM (collector) is reachable from this machine.

## Analysis modalities

- Messages on **stderr**: `uploaded=`, `skipped=`, `errors=`, plus upload error details (truncated beyond 20 lines).
- If `ORIG_HEAD` is absent, or if the URL / key / slug is missing: an explicit message and **exit code 0** so as not to block the pull.
@@ -1,29 +0,0 @@

# AnythingLLM workspaces — VS Code / Cursor extension

**Author:** 4NK

## Objective

Provide a minimal entry point in the editor to list **AnythingLLM workspaces** via the developer API (`GET /api/v1/workspaces`) and open the web UI of the selected workspace, relying on the public URL documented for **ia.enso** (`/anythingllm/`).

## Impacts

- New directory: `extensions/anythingllm-workspaces/` (standalone extension, not published to the marketplace by default).
- No impact on the nginx deployment or the Docker services as long as only the user settings (`baseUrl`, `apiKey`) are set on the developer workstation.

## Changes

- `package.json`, `tsconfig.json`, TypeScript sources (`src/extension.ts`, `src/anythingllmClient.ts`, `src/types.ts`).
- The extension's `README.md`: prerequisites, configuration, commands, link to `deploy/nginx/README-ia-enso.md`.
- Later evolutions (v0.2.0): dev tools panel, `repos-devtools-server` client, `POST /api/v1/workspace/new` — see [repos-devtools-server-and-dev-panel.md](./repos-devtools-server-and-dev-panel.md).

## Deployment modalities

- Development: open the `extensions/anythingllm-workspaces` folder in VS Code / Cursor, `npm install`, `npm run compile`, launch **Run Extension**.
- Internal distribution: `vsce package` (after installing `@vscode/vsce` if needed), then install the `.vsix` on the target workstations.

## Analysis modalities

- On failure: read the error message shown by the command (HTTP status and body excerpt).
- On the proxy side, check that `anythingllm.baseUrl` matches the public path (no trailing slash) and that the API key is valid in the AnythingLLM UI.
- Upstream API reference: Mintplex-Labs anything-llm, `server/endpoints/api/workspace/index.js` (`GET /v1/workspaces` under the `/api` prefix).
28 docs/features/browser-automation-criteria.md Normal file
@@ -0,0 +1,28 @@
# Criteria for introducing a `browser-automation-api` service

## Default position

The platform does **not** bundle Chromium / Playwright / an equivalent in the services as long as the needs below are **not** already met by the system browser or a web tab in the shell (Lapce, front).

## Open the service if **at least one** condition holds

1. **Render capture**: generating images or PDFs of **internal** pages without user interaction (reports, state evidence).
2. **Agent-driven E2E**: reproducible web scenarios with **timeouts** and a domain **allowlist**.
3. **Controlled scraping**: content extraction from **pre-approved** URLs only (list configured by env).
4. **Visual tests** on integration infrastructure where the developer workstation has **no** GUI.

## Design constraints

- **Separate process**: a dedicated `services/browser-automation-api/`; no heavy dependency added to the existing APIs (Local Office, repos-devtools, etc.).
- **Queue** and **cap** on concurrent jobs; strict **timeouts**.
- Network: **allowlist**; no arbitrary browsing to the Internet.
- Auth: service-to-service Bearer; log the requested URLs.

## Out of scope

- Replacing the user's browser for day-to-day AnythingLLM or ONLYOFFICE UI use.
- Unaudited automation without a policy.

## Related document

- [platform-target.md](../platform-target.md) — reminder that the browser is optional
33 docs/features/claw-harness-api.md Normal file
@@ -0,0 +1,33 @@
# Claw-code — multi-model harness (`services/claw-harness-api`)

## Goal

Document and tool the use of the **claw-code** repository (a "harness"-style runtime for agents, tools and MCP, depending on upstream versions) within the **smart_ide** scope, with a **no-Anthropic policy** in the templates provided here.

## Upstream sources

- Mirror page: [gitlawb — claw-code](https://gitlawb.com/node/repos/z6Mks1jg/claw-code)
- GitHub repository commonly used for cloning: [instructkr/claw-code](https://github.com/instructkr/claw-code)

The upstream repository evolves (Rust / Python, binaries, HTTP server). This repository does **not** vendor claw-code: only a README, a provider-policy example, and an optional **HTTP proxy**.

## Local files

| File / directory | Role |
|------------------|------|
| [services/claw-harness-api/README.md](../../services/claw-harness-api/README.md) | Clone, build summary, proxy variables |
| [services/claw-harness-api/providers.example.yaml](../../services/claw-harness-api/providers.example.yaml) | Example: Ollama enabled; **Anthropic `enabled: false`** |
| [services/claw-harness-api/proxy/](../../services/claw-harness-api/proxy/) | `127.0.0.1` + Bearer proxy → upstream URL (`CLAW_UPSTREAM_URL`) |

## Anthropic

The templates in `smart_ide` do **not** enable Anthropic. The block appears explicitly with `enabled: false`. Network access control (no resolution / no route to `api.anthropic.com`) and the absence of secrets on the host complete the policy if you need it.

## Architecture integration

Positioning relative to [system-architecture.md](../system-architecture.md): claw-code plays the **harness execution** role (tools, session, possibly MCP); the **proxy** standardises access (token, local bind) for a future Lapce-style client or an in-house gateway. The `ia_dev` business agents remain the operating core described elsewhere; claw is an **optional runtime** to wire in explicitly.

## Limits

- **Third-party** project; licence and stability follow upstream.
- The proxy **relays** traffic to the claw HTTP server: it does not replace reading the privacy policies of the providers you enable (local Ollama vs cloud APIs).
57 docs/features/ia-dev-service.md Normal file
@@ -0,0 +1,57 @@
# `ia-dev-gateway` service — agent execution and deployments

## Goal

Eventually replace the **direct** call into the [`ia_dev`](../ia_dev-submodule.md) submodule with an **HTTP service** under [`services/ia-dev-gateway/`](../../services/ia-dev-gateway/) that:

- Points at a **fork** of [4nk/ia_dev](https://git.4nkweb.com/4nk/ia_dev.git) (same Git history, governed from the `smart_ide` monorepo).
- Does **not** implement the projects' business logic: it **routes** jobs to `projects/<id>/`, `deploy/` and existing scripts, with policy and logging.
- Exposes an **agent registry** and **runs** for Lapce, the web front and the orchestrator.

## Scope

| Included | Excluded |
|----------|----------|
| Service-to-service auth (Bearer) | Duplicating business recipes in `smart_ide` |
| Job submission (deploy, agent, script) | Execution outside sandbox / OpenShell when policy requires a runtime |
| Event stream (SSE or WebSocket) | Full UI (stays in Lapce / front) |
| Reading the agent registry from the `ia_dev` checkout | Modifying the target projects' secrets |

## Coexistence with the submodule

Today `./ia_dev` remains the **canonical checkout** on the host. The `ia-dev-gateway` binary receives `IA_DEV_ROOT` (default: the service's parent directory, or an absolute path to `./ia_dev`).

**Trajectory**: the submodule is kept until the fork is **vendored** or **cloned by the service** at deployment time; the migration will then be documented in [ia_dev-submodule.md](../ia_dev-submodule.md).

## API (specification)

Detailed reference: [API/ia-dev-gateway.md](../API/ia-dev-gateway.md).

Summary:

- `GET /health` — liveness.
- `GET /v1/agents` — list of registered agents (metadata derived from the `ia_dev` registry).
- `GET /v1/agents/{id}` — stable descriptor (role, rights, triggering commands).
- `POST /v1/runs` — JSON body: `{ "agentId", "projectId", "intent", "payload"?, "env"? }`; response: `{ "runId", "status" }`.
- `GET /v1/runs/{runId}` — status and partial output.
- `GET /v1/runs/{runId}/events` — **SSE** (or WebSocket upgrade depending on the implementation): `started`, `tool_selected`, `completed`, `failed`, etc. (aligned with [system-architecture.md](../system-architecture.md)).

The **401/403/404/409/422** error codes are explicit; no silent fallback.
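
A run submission following the summary above could be prepared like this (a Python sketch; the base URL, token and field values are illustrative — only the route and the JSON field names come from the specification):

```python
import json
import urllib.request

# Hypothetical values — the real token and port come from the environment.
BASE = "http://127.0.0.1:37144"
TOKEN = "example-token"

body = {
    "agentId": "doc-writer",        # illustrative agent id
    "projectId": "smart_ide",
    "intent": "agent.run",
    "payload": {"task": "update README"},
}

req = urllib.request.Request(
    f"{BASE}/v1/runs",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would return {"runId": "...", "status": "..."}
# (not executed here: the gateway must be running and the token valid).
```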

## Environment variables (target)

| Variable | Required | Description |
|----------|----------|-------------|
| `IA_DEV_GATEWAY_TOKEN` | yes | Bearer expected from authorised clients |
| `IA_DEV_GATEWAY_HOST` | no | Bind (default `127.0.0.1`) |
| `IA_DEV_GATEWAY_PORT` | no | Port (default `37144`) |
| `IA_DEV_ROOT` | no | Root path of the `ia_dev` checkout (fork) |

## Implementation

The [`services/ia-dev-gateway/`](../../services/ia-dev-gateway/) directory contains a **Node/TypeScript server** (`npm run build && npm start`): it scans the `.md` agents, keeps runs in memory with a stub `completed` status, and serves a minimal SSE stream. Wiring the real **runner** (the `ia_dev` scripts) into `POST /v1/runs` remains to be done. The orchestrator ([orchestrator-api.md](./orchestrator-api.md)) can target this service for `agent.run`.

## See also

- [platform-target.md](../platform-target.md) — the three environments
- [system-architecture.md](../system-architecture.md) — agent gateway, policy
@@ -1,44 +0,0 @@

# Feature: Reverse proxy ia.enso.4nkweb.com for Ollama and AnythingLLM

**Author:** 4NK team

## Objective

Expose Ollama and AnythingLLM on the public proxy hostname with HTTPS, path prefixes `/ollama` and `/anythingllm`, and **gate `/ollama/`** with a **Bearer token** at nginx (compatible with OpenAI clients that send `Authorization: Bearer <key>`). The secret is **not** forwarded to Ollama.

## Public URLs (HTTPS)

- AnythingLLM UI: `https://ia.enso.4nkweb.com/anythingllm/`
- Ollama native API (example): `https://ia.enso.4nkweb.com/ollama/api/tags` — Bearer required at nginx
- OpenAI-compatible base (Cursor): `https://ia.enso.4nkweb.com/ollama/v1`

## Impacts

- **Proxy (nginx):** `server_name`, TLS, locations; `conf.d/ia-enso-http-maps.conf` with `map_hash_bucket_size`, Bearer `map`, and WebSocket `map` (or Bearer-only if a WebSocket map exists elsewhere).
- **Backend (192.168.1.164):** must accept connections from the proxy on `11434` and `3001`.
- **Clients:** send `Authorization: Bearer <secret>` for `/ollama/*`; the Cursor API key field = the same secret as in the nginx `map`.

## Repository layout

| Path | Purpose |
|------|---------|
| `deploy/nginx/sites/ia.enso.4nkweb.com.conf` | `server` blocks; upstreams use `__IA_ENSO_BACKEND_IP__` |
| `deploy/nginx/http-maps/ia-enso-ollama-bearer.map.conf.example` | Bearer `map` reference for manual installs |
| `deploy/nginx/http-maps/websocket-connection.map.conf.example` | WebSocket `map` reference |
| `deploy/nginx/deploy-ia-enso-to-proxy.sh` | SSH deploy; retries Bearer-only if the WebSocket `map` is duplicated |
| `deploy/nginx/sites/ia.enso.4nkweb.com.http-only.conf` | HTTP-only vhost for TLS bootstrap |
| `deploy/nginx/README-ia-enso.md` | Operator reference (includes a note on Cursor `streamFromAgentBackend`) |

## Deployment modalities

Run `./deploy/nginx/deploy-ia-enso-to-proxy.sh` with optional `IA_ENSO_OLLAMA_BEARER_TOKEN`. See `README-ia-enso.md`.

## Analysis modalities

- `curl` to `/ollama/v1/models` with and without Bearer (200 / 401).
- Browser: `/anythingllm/`.

## Security notes

- The Bearer secret is equivalent to an API key; rotate it in `ia-enso-http-maps.conf` and in client configs together.
- AnythingLLM uses its own application login on `/anythingllm/`.
35 docs/features/langextract-api.md Normal file
@@ -0,0 +1,35 @@
# LangExtract — local API (`services/langextract-api`)

## Goal

Expose [LangExtract](https://github.com/google/langextract) (Google, Apache-2.0) as a **local HTTP service**: given a text, an instruction and few-shot examples, produce **structured extractions** (classes, attributes, extracted text) with **anchoring** in the text when the model and the library provide it (`char_interval`).

## Scope

- No extra business logic: the API only validates the JSON, calls `langextract.extract`, and serialises the result.
- **Cloud** models (Gemini, etc.) follow the upstream configuration (API keys, quotas).
- **Local** models typically go through **Ollama** (`model_url`, plus the `fence_output` / `use_schema_constraints` options per the upstream docs).

## Operation

| Item | Detail |
|------|--------|
| Code | [services/langextract-api/README.md](../../services/langextract-api/README.md) |
| Host / port | `127.0.0.1`, default port `37141` (see the README) |
| Auth | If `LANGEXTRACT_SERVICE_TOKEN` is set: the `Authorization: Bearer …` header is required on `POST /extract` |
| Health | `GET /health` |

## Request schema (`POST /extract`)

Main fields: `text`, `prompt_description`, `examples[]` (`text`, `extractions[]` with `extraction_class`, `extraction_text`, `attributes`), `model_id`, and optional options aligned with the Python API (`model_url`, `extraction_passes`, `max_workers`, `max_char_buffer`, `api_key`, `fence_output`, `use_schema_constraints`).

Response: `{ "documents": [ { "extractions": [ … ] } ] }` with, per extraction, `extraction_class`, `extraction_text`, `attributes`, and `char_interval` `{ "start", "end" }` when present.
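
A minimal request body following this schema might look as follows (the values are illustrative; only the field names come from the schema above):

```python
import json

request_body = {
    "text": "Facture n°2024-17 émise le 3 mai 2024.",
    "prompt_description": "Extract invoice numbers and dates.",
    "examples": [
        {
            "text": "Facture n°2023-02 émise le 1 février 2023.",
            "extractions": [
                {
                    "extraction_class": "invoice_number",
                    "extraction_text": "2023-02",
                    "attributes": {"year": "2023"},
                }
            ],
        }
    ],
    "model_id": "gemma2:latest",            # illustrative local model name
    "model_url": "http://127.0.0.1:11434",  # local Ollama endpoint
    "fence_output": False,
    "use_schema_constraints": False,
}

payload = json.dumps(request_body, ensure_ascii=False)
# POST this payload to http://127.0.0.1:37141/extract with
# `Authorization: Bearer <LANGEXTRACT_SERVICE_TOKEN>` when the token is set.
```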

## Architecture integration

This service complements the stack described in [system-architecture.md](../system-architecture.md): a client (a Lapce-style editor, the gateway, a script) can call structured extraction **without** embedding Python in the UI, as long as the local network and the token allow it.

## References

- Upstream repository: [https://github.com/google/langextract](https://github.com/google/langextract)
- PyPI: [https://pypi.org/project/langextract/](https://pypi.org/project/langextract/)
32 docs/features/lapce-porting-roadmap.md Normal file
@@ -0,0 +1,32 @@
# Porting AnythingLLM Workspaces → Lapce (`core_ide/`)

The [extensions/anythingllm-workspaces/](../../extensions/anythingllm-workspaces/) extension targets **VS Code / Cursor** (the `vscode` API). Lapce uses a distinct **plugin model** (Volt / WASI, RPC). This document splits the work into **phases** toward an interface consistent with [platform-target.md](../platform-target.md).

## Phase 1 — Connectivity without a webview

- Lapce preferences (equivalents of `anythingllm.baseUrl`, `apiKey`, `reposApiBaseUrl`, `reposApiToken`) — secret storage outside the repository.
- Palette commands:
  - List the AnythingLLM workspaces → open the URL in the **system browser**.
  - Open the AnythingLLM web UI.
- HTTP client to `repos-devtools-server` and the AnythingLLM API (reuse the logic of the TypeScript files as a **specification**; implement in Rust inside Lapce, or via a small invoked Node binary — team's choice).
- No Dev tools panel; no initial RAG sync from the IDE.

## Phase 2 — "Dev tools" parity and RAG sync

- Dedicated panel or view: enter command lines (`/repos-clone-sync`, `/workspace-sync`, …) as in [extensions/anythingllm-workspaces/README.md](../../extensions/anythingllm-workspaces/README.md).
- Reimplement **initialRagSync** + `.4nkaiignore` (the `ignore` crate or a Rust equivalent).
- Open the repository folder in Lapce after a clone (Lapce workspace API).

## Phase 3 — Orchestrator

- Wire the Lapce commands to [orchestrator-api.md](./orchestrator-api.md) rather than to hard-coded services, to centralise tokens and per-`env` policies.

## Dependencies

- [core-ide.md](../core-ide.md) — Lapce build.
- [orchestrator-api.md](./orchestrator-api.md) — target routing.
- [anythingllm-workspaces.md](../anythingllm-workspaces.md) — one-workspace-per-project principle.

## Main risk

Capability gap between the **VS Code webview** and the **Lapce UI**: plan a **minimal view** (terminal + output buffer) in case a full webview is delayed.
84 docs/features/local-office.md Normal file
@@ -0,0 +1,84 @@
# Local Office — Office documents API (programmatic)

## Location in the monorepo

The code and the detailed operations doc live under **[`services/local-office/`](../../services/local-office/README.md)** (a local HTTP service, at the same level as the other `services/` folders). The former forge repository `git.4nkweb.com/4nk/local_office` has been **merged by file copy**; the remote repository can be deleted.

## Product role

| Need | Answer |
|------|--------|
| **Rich** browser editing / business office suite | **ONLYOFFICE** (existing doc-services layer) |
| **Automation**: upload, text replacements, paragraph insertion in a **docx** via HTTP + JSON | **Local Office** |
| RAG / document conversations | **AnythingLLM** |

Local Office does **not replace** ONLYOFFICE: it covers **programmatic** flows and **lightweight third-party integrations** (API key, no built-in WYSIWYG UI here).

## Technical stack

- **FastAPI** + **Uvicorn**
- **SQLite** for document metadata
- Files on disk (`STORAGE_PATH`)
- **python-docx** for docx commands
- Auth: **`X-API-Key`** header (list of keys in `API_KEYS`)
- **slowapi**: per-key rate limiting (`RATE_LIMIT_PER_MINUTE`)

## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `API_KEYS` | yes | Comma-separated keys; each document is tied to the key that created it |
| `STORAGE_PATH` | no | Files directory (default `./data/files`) |
| `DATABASE_PATH` | no | SQLite path (default `./data/local_office.db`) |
| `MAX_UPLOAD_BYTES` | no | Max upload size (default 20 MB) |
| `RATE_LIMIT_PER_MINUTE` | no | Requests-per-minute cap per key (default 60) |

Copy [`services/local-office/.env.example`](../../services/local-office/.env.example) to `.env` **outside commits**; do not commit secrets. This can be aligned with the project's `.secrets/<env>/` convention for injection on the host.

## Running

```bash
cd services/local-office
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
export API_KEYS='your-key'
uvicorn app.main:app --host 127.0.0.1 --port 8000
```

- OpenAPI / Swagger: `http://127.0.0.1:8000/docs` (depending on host/port).
- The upstream README sometimes suggests `--host 0.0.0.0` for tests; in **smart_ide**, prefer **`127.0.0.1`** on the server, plus a **TLS reverse proxy** toward the outside.

## API (summary)

All routes require **`X-API-Key`**.

| Method | Path | Action |
|--------|------|--------|
| POST | `/documents` | Multipart upload (correct `Content-Type` for docx / xlsx / pptx) |
| GET | `/documents` | List the key's documents |
| GET | `/documents/{id}` | Metadata |
| GET | `/documents/{id}/file` | Download |
| POST | `/documents/{id}/commands` | JSON commands (docx: `replaceText`, `insertParagraph`) |
| DELETE | `/documents/{id}` | Deletion |
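
As an illustration of the commands route, a payload could look like the sketch below. Caution: the exact JSON shape (`commands`, `type`, `search`, `replace`, `text`) is an assumption for explanation only; the service README remains authoritative.

```python
import json

# Hypothetical payload for POST /documents/{id}/commands —
# field names are assumed; see services/local-office/README.md for the real schema.
commands = {
    "commands": [
        {"type": "replaceText", "search": "{{CLIENT}}", "replace": "ACME SA"},
        {"type": "insertParagraph", "text": "Annexe générée automatiquement."},
    ]
}

body = json.dumps(commands, ensure_ascii=False)
# Send with headers: X-API-Key: <one of API_KEYS>, Content-Type: application/json
```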

The `data/` directories are listed in [`services/local-office/.gitignore`](../../services/local-office/.gitignore): data and database are **local to the instance**.

## smart_ide integration

- **Orchestrator / gateway**: route intents such as "modify a docx template by script" or "document pipeline without UI" to this API rather than to ONLYOFFICE when it is sufficient.
- **Policy / OpenShell**: name a right such as **Local Office API access** (dedicated key, allowed network) in the agent profiles.
- **Deployment**: consistent with [deployment-target.md](../deployment-target.md) — in practice the instance runs on the **server** hosting the other services.

## Detailed documentation (sources in `services/local-office/docs/`)

| File | Content |
|------|---------|
| [services/local-office/README.md](../../services/local-office/README.md) | Installation, routes, API summary |
| [services/local-office/docs/features/local-office-api.md](../../services/local-office/docs/features/local-office-api.md) | Feature sheet (impacts, security, deployment) |
| [services/local-office/docs/architecture-proposal.md](../../services/local-office/docs/architecture-proposal.md) | ONLYOFFICE / hybrid / WOPI options |

## See also

- [system-architecture.md](../system-architecture.md) — **doc-services** layer, routing, `services/local-office/` map
- [services.md](../services.md) — overview of the services on the host
57 docs/features/orchestrator-api.md Normal file
@@ -0,0 +1,57 @@
# `smart_ide` orchestrator — HTTP contract

## Role

An **LLM-free** service that routes user **intents** (or normalised commands) to:

- **Ollama** (generation / chat);
- **AnythingLLM** (RAG, workspaces — via the existing document API);
- **Micro-services** under `services/` ([API/README.md](../API/README.md));
- **ia-dev-gateway** ([ia-dev-service.md](./ia-dev-service.md)) for agents and deployments.

The orchestrator applies a declarative **routing table** (a config file, versioned or loaded at startup) and explicitly refuses intents it does not cover.

## Authentication

`Authorization: Bearer <ORCHESTRATOR_TOKEN>` on every route except `GET /health` (specification). Outgoing calls reuse each target's own tokens (repos-devtools, regex search, ia-dev-gateway, etc.), injected through environment variables **outside the repository**.

## Endpoints (specification)

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Liveness |
| `POST` | `/v1/route` | Body: `{ "intent", "context"?, "projectId"?, "env"? }` → response: `{ "target": "ollama" \| "anythingllm" \| "service" \| "ia_dev", "action", "request": { ... } }`, or proxied execution depending on policy |
| `POST` | `/v1/execute` | Executes an already-resolved chain (optional in v1 — may be merged into `/v1/route` with `dryRun: false`) |
| `GET` | `/v1/timeline` | Aggregates recent events (gateway runs, indexing — source to be wired to the journal store) |

Detailed OpenAPI reference: [API/orchestrator.md](../API/orchestrator.md).

## Typical routing

| Intent (example) | Target |
|------------------|--------|
| `code.complete`, `chat.local` | HTTP proxy to Ollama (`OLLAMA_URL`) |
| `rag.query`, `workspace.list` | AnythingLLM client (`ANYTHINGLLM_BASE_URL` + API key) |
| `git.clone`, `git.list` | repos-devtools-server |
| `search.regex` | agent-regex-search-api |
| `extract.entities` | langextract-api |
| `doc.office.upload` | local-office (`X-API-Key`) |
| `agent.run`, `deploy.trigger` | ia-dev-gateway |
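
The declarative routing table and its explicit-refusal behaviour can be sketched as follows (illustrative Python only; the real orchestrator is the Node/TypeScript stub under `services/smart-ide-orchestrator/`):

```python
# Minimal sketch of a declarative intent → target table.
ROUTES = {
    "code.complete": ("ollama", "generate"),
    "chat.local": ("ollama", "chat"),
    "rag.query": ("anythingllm", "query"),
    "workspace.list": ("anythingllm", "workspaces"),
    "git.clone": ("service", "repos-devtools-server"),
    "search.regex": ("service", "agent-regex-search-api"),
    "doc.office.upload": ("service", "local-office"),
    "agent.run": ("ia_dev", "run"),
}

def route(intent: str) -> dict:
    """Resolve an intent; refuse explicitly when it is not covered."""
    if intent not in ROUTES:
        raise LookupError(f"intent not covered: {intent!r}")  # no silent fallback
    target, action = ROUTES[intent]
    return {"target": target, "action": action}

route("rag.query")  # {"target": "anythingllm", "action": "query"}
```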

## Environment variables (target)

| Variable | Description |
|----------|-------------|
| `ORCHESTRATOR_TOKEN` | Client Bearer secret |
| `ORCHESTRATOR_HOST` / `ORCHESTRATOR_PORT` | Bind (default `127.0.0.1:37145`) |
| `OLLAMA_URL` | Ollama base URL on the AI machine |
| `ANYTHINGLLM_BASE_URL` | AnythingLLM API URL |
| `ANYTHINGLLM_API_KEY` | API key |
| `REPOS_DEVTOOLS_URL`, `REPOS_DEVTOOLS_TOKEN` | … |
| `IA_DEV_GATEWAY_URL`, `IA_DEV_GATEWAY_TOKEN` | … |

Values differ per **environment** (test / pprod / prod) — see [platform-target.md](../platform-target.md).

## Implementation

A **routing stub** server exists under [`services/smart-ide-orchestrator/`](../../services/smart-ide-orchestrator/) (`ORCHESTRATOR_TOKEN`, `npm run build && npm start`). It resolves the intents documented above and records the timeline; the **HTTP forwarding** to the targets remains to be completed (`fetch` + per-service tokens).
@@ -1,34 +0,0 @@

# repos-devtools-server + "Dev tools" panel (AnythingLLM extension)

**Author:** 4NK

## Objective

On the host that carries the clones (e.g. `192.168.1.164`, root `/home/ncantu/code`):

- expose a **local HTTP API** (git clone of the `test` branch, repository listing, path resolution);
- from the **AnythingLLM Workspaces** extension, provide a **Webview panel** to type text commands and display the response;
- chain into the **AnythingLLM developer API** to **check / create** a workspace whose name (or slug) matches the repository.

## Impacts

- New service: `services/repos-devtools-server/` (Node 20+, listens on `127.0.0.1`, Bearer required via `REPOS_DEVTOOLS_TOKEN`).
- Extension version **0.2.0**: settings `anythingllm.reposApiBaseUrl`, `anythingllm.reposApiToken`, command **AnythingLLM: Dev tools panel**, `media/devTools.js` files, `workspaceEnsure` logic, `POST /api/v1/workspace/new`.

## Changes

- Server: `POST /repos-clone`, `GET /repos-list`, `POST /repos-load`.
- Extension: line parser, repos HTTP client, `ensureWorkspaceForRepoName`, Webview panel.

## Deployment modalities

1. On the clones machine: set `REPOS_DEVTOOLS_TOKEN`, optionally `REPOS_DEVTOOLS_ROOT`, then `npm run build && npm start` (see `services/repos-devtools-server/README.md`).
2. In Cursor / VS Code (same host, or a tunnel to `:37140`): set `anythingllm.reposApiBaseUrl` and `anythingllm.reposApiToken`.
3. Rebuild / reinstall the extension (`.vsix` or dev workspace).

## Analysis modalities

- **401** error on the repos API: the extension token ≠ `REPOS_DEVTOOLS_TOKEN`.
- **403** error from AnythingLLM: use the application API key (not the nginx Ollama secret).
- **409** on clone: the target directory already exists under `REPOS_DEVTOOLS_ROOT`.
- **git clone** fails if the `test` branch does not exist on the remote (nominal git behaviour).
docs/features/sso-docv-enso.md (new file, 53 lines)
@@ -0,0 +1,53 @@
# SSO — platform front end and docv (Enso line)

## Goal

Allow the **web front end** of the `smart_ide` platform (deployed per environment: test, pprod, prod) to delegate authentication to **docv** via **OpenID Connect (OIDC)**, without coupling the monorepo to the Enso repository code while that code is not available on the documentation machine.

## Roles

| Component | Role |
|-----------|------|
| **User browser** | Redirect to docv (authorization endpoint) |
| **docv / Enso IdP** | Issues `id_token` / `access_token`, exposes JWKS |
| **SPA front end** | Exchanges the OAuth code (PKCE recommended), stores the session |
| **API backend** (orchestrator or BFF) | Validates the JWT (JWKS signature, `iss`, `aud`, `exp`), maps roles → policy rights |

## Flow (authorization code + PKCE)

```mermaid
sequenceDiagram
    participant Browser
    participant Front as Front_SPA
    participant Docv as docv_IdP
    participant API as smart_ide_API
    Browser->>Front: open app
    Front->>Docv: redirect authorize
    Docv->>Browser: login consent
    Browser->>Front: callback code
    Front->>Docv: token endpoint
    Docv->>Front: access_token id_token
    Front->>API: API calls Authorization Bearer
    API->>API: validate JWT JWKS
```
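The claim-level part of the `validate JWT JWKS` step can be sketched as a pure check. Signature verification itself needs a JOSE library and the JWKS endpoint, so it is omitted here; the issuer and audience values are placeholders to be fixed with the Enso side.

```typescript
// Claim checks performed after JWKS signature verification (not shown);
// expectedIss / expectedAud are per-environment configuration values.
type Claims = { iss: string; aud: string | string[]; exp: number };

function claimsValid(
  c: Claims,
  expectedIss: string,
  expectedAud: string,
  nowSec: number
): boolean {
  const audOk = Array.isArray(c.aud) ? c.aud.includes(expectedAud) : c.aud === expectedAud;
  return c.iss === expectedIss && audOk && c.exp > nowSec; // exp is in seconds
}
```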
## Parameters to fix with the Enso repository

- `issuer` (stable URL per env)
- `client_id` / `client_secret`, or a public client + PKCE
- Scopes: at least `openid`, `profile`, `email`; docv business scopes if needed
- **Audience** (`aud`) expected by the `smart_ide` API
- **Role / group** mapping → OpenShell profiles (read-only, deploy pprod, etc.)

## Environments

One **OAuth client per env** (test / pprod / prod), or a single client with environment **claims** — to be decided with Enso security. The front end's callback URLs differ per deployment.

## Internal references

- [platform-target.md](../platform-target.md) — test / pprod / prod matrix
- [deployment-target.md](../deployment-target.md) — TLS, no HTTP bypass

## Next

Once the Enso repository (e.g. `/home/desk/code/enso/`) is accessible, complete this document with the real **endpoint paths** and screenshots of the docv screens involved.
@@ -1,43 +0,0 @@

# AnythingLLM extension — 403 "No valid api key found"

**Author:** 4NK

## Symptom

Command **AnythingLLM: List workspaces** → an error such as:

`AnythingLLM API 403: {"error":"No valid api key found."}`

## Cause

The upstream `validApiKey` middleware reads `Authorization`, extracts the token after `Bearer `, then calls `ApiKey.get({ secret })` against the AnythingLLM database. Any value absent from that database yields the same 403 response.

## Frequent root cause

Confusion between:

- the **nginx Bearer secret** used for `https://ia.enso.4nkweb.com/ollama/…` (documented in `deploy/nginx/README-ia-enso.md`);
- an **AnythingLLM API key** created in the UI: **Settings → API Keys**.

These are two independent mechanisms. The nginx secret is **not** registered as an API key in AnythingLLM.

## User-side fixes

1. Open the AnythingLLM UI (`anythingllm.baseUrl`).
2. **Settings → API Keys**: create a key if needed and copy the displayed secret.
3. Paste that secret into `anythingllm.apiKey` (the editor's **User** settings).

## Code / doc fixes

- Extension README: explicit reminder of nginx secret vs AnythingLLM key.
- Client: `normalizeApiSecret` normalization — if the user pasted `Bearer <secret>`, the prefix is stripped before sending (avoids a token mistakenly parsed as `Bearer`).
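The normalization just described can be sketched as a one-line helper. The name matches the doc; the case-insensitive prefix match and trimming are assumptions about the shipped client.

```typescript
// Strip a pasted "Bearer " prefix (case-insensitive) and surrounding
// whitespace so only the raw secret is sent in the Authorization header.
function normalizeApiSecret(raw: string): string {
  return raw.trim().replace(/^bearer\s+/i, "");
}
```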
## Diagnostics

- Check the raw HTTP response (403 + JSON body).
- Compare the configured value with its origin (nginx map file vs the API Keys screen).
- Test with `curl`: `curl -sS -H "Authorization: Bearer <AnythingLLM secret>" "<baseUrl>/api/v1/workspaces"`.

## Deployment

Redeploy / reinstall the extension after changing the client (`npm run compile` or a new `.vsix`). No nginx change is required for this diagnosis.
docs/ia_dev-project-smart_ide.md (new file, 15 lines)
@@ -0,0 +1,15 @@
# `ia_dev` project: `smart_ide`

The **smart_ide** repository is registered in the **`ia_dev`** submodule under the project identifier **`smart_ide`**, for the agents, Gitea ticketing, and wiki docs aligned with the **4nk/smart_ide** forge.

## Configuration file

- **`ia_dev/projects/smart_ide/conf.json`** — machine paths (`project_path`), wiki and issue URLs (`https://git.4nkweb.com/4nk/smart_ide/...`), mailboxes authorized for ticketing (test / pprod / prod envs).

Adapt **`project_path`** (and derived fields if you add `build_dirs` / `deploy`) on each workstation or server where `ia_dev` runs commands against this repository.

## Links

- Repository: `https://git.4nkweb.com/4nk/smart_ide`
- `ia_dev` submodule: [docs/ia_dev-submodule.md](./ia_dev-submodule.md)
- Dev gateway: [docs/features/ia-dev-service.md](./features/ia-dev-service.md)
docs/ia_dev-submodule.md (new file, 58 lines)
@@ -0,0 +1,58 @@
# Submodule `ia_dev`

The repository [4nk/ia_dev](https://git.4nkweb.com/4nk/ia_dev.git) is integrated as a **Git submodule** at `./ia_dev`.

It holds the **centralized AI agent team** (definitions under `.cursor/agents/`, `.cursor/rules/`), `deploy/`, `gitea-issues/`, `projects/<id>/conf.json`, etc. Execution remains **from the `ia_dev` root** per the upstream README; `smart_ide` provides the surrounding IDE vision, host scripts, and systemd units.

## Clone with submodule

```bash
git clone --recurse-submodules https://git.4nkweb.com/4nk/smart_ide.git
cd smart_ide
```

## Submodule already present but empty

```bash
git submodule update --init --recursive
```

## Update to latest `ia_dev` commit

```bash
cd ia_dev
git fetch origin
git checkout <branch-or-tag>   # e.g. main
cd ..
git add ia_dev
git commit -m "chore: bump ia_dev submodule"
```

## SSH remote for `ia_dev` (optional)

If you use SSH instead of HTTPS for the submodule:

```bash
git config submodule.ia_dev.url git@git.4nkweb.com:4nk/ia_dev.git
```

(Requires host key and deploy key configured for Gitea.)

## Relation to `smart_ide`

| Repository | Role |
|------------|------|
| **smart_ide** | IDE target UX, local AI stack scripts, systemd, deployment docs |
| **ia_dev** (submodule) | Agent registry, project configs, deploy/ticketing/notary pipelines |

The future **agent gateway** should treat `./ia_dev` as the canonical checkout path on the server unless overridden by configuration. See [system-architecture.md](./system-architecture.md).

## Trajectory: the `ia-dev-gateway` service

A dedicated HTTP service ([features/ia-dev-service.md](./features/ia-dev-service.md), [API/ia-dev-gateway.md](./API/ia-dev-gateway.md)) will take over for **clients** (Lapce, front end, orchestrator): the submodule remains the **source of truth for the `ia_dev` files** until migration to a **fork** cloned or embedded at the same path (`IA_DEV_ROOT`).

1. **Current phase**: submodule + manual execution / scripts from the `ia_dev` root.
2. **Gateway phase**: an `ia-dev-gateway` binary on the host, with `IA_DEV_ROOT` pointing to `./ia_dev`.
3. **Fork phase**: the `ia_dev` fork is referenced by `smart_ide` (submodule updated to the fork, or a documented replacement); the gateway keeps the same HTTP contract.

Do not remove the submodule until CI and dev workstations are aligned on the fork and the service.
docs/implementation-rollout.md (new file, 30 lines)
@@ -0,0 +1,30 @@
# Platform plan rollout — status

This document summarizes execution of the "multi-env IDE platform" plan and what comes next.

## Done (documentation)

- [platform-target.md](./platform-target.md)
- [features/ia-dev-service.md](./features/ia-dev-service.md), [API/ia-dev-gateway.md](./API/ia-dev-gateway.md)
- [features/orchestrator-api.md](./features/orchestrator-api.md), [API/orchestrator.md](./API/orchestrator.md)
- [features/lapce-porting-roadmap.md](./features/lapce-porting-roadmap.md)
- [features/sso-docv-enso.md](./features/sso-docv-enso.md)
- [features/browser-automation-criteria.md](./features/browser-automation-criteria.md)
- Updates: [system-architecture.md](./system-architecture.md), [deployment-target.md](./deployment-target.md), [ia_dev-submodule.md](./ia_dev-submodule.md), [API/README.md](./API/README.md), [services.md](./services.md)

## Done (minimal code)

- **`services/ia-dev-gateway`**: Node/TS, `GET /v1/agents`, `POST /v1/runs` (stub), SSE events, `IA_DEV_GATEWAY_TOKEN`.
- **`services/smart-ide-orchestrator`**: Node/TS, `POST /v1/route`, `POST /v1/execute` (timeline, no automatic forward), `ORCHESTRATOR_TOKEN`.
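The gateway's SSE events can be sketched with plain `event:`/`data:` framing. The event and field names here (`run-log`, `runId`) are illustrative assumptions; the actual names live in `services/ia-dev-gateway`.

```typescript
// Serialize one Server-Sent Events frame: an event name line, a JSON data
// line, and the blank line that terminates the frame.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
// Typical use in a handler: res.write(sseFrame("run-log", { runId, line }))
```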
## To do (next)

- Wire the real **runner** into `POST /v1/runs` (`ia_dev` scripts).
- **`fetch`** from the orchestrator to Ollama / AnythingLLM / services with the right secrets.
- Lapce porting per [features/lapce-porting-roadmap.md](./features/lapce-porting-roadmap.md).

## Recommended deployment order

1. Validate **test** with the orchestrator + gateway stubs + existing services.
2. Wire the real `ia_dev` **runner** into `POST /v1/runs`.
3. Extend to **pprod** / **prod** (secrets, TLS, docv SSO).
@@ -1,64 +1,21 @@

# Infrastructure
# Infrastructure — host access and network

## Scope
## First target

This repository ships shell scripts used on Ubuntu workstations and related LAN hosts. It does **not** define cloud Terraform or CI; it documents how those scripts map to the **private LAN** layout used with the 4NK bastion model.
A **Linux client** workstation connects over **SSH** to a **server** carrying the AI stack, the Git clones, and the associated services. See [deployment-target.md](./deployment-target.md).

## First deployment shape (client / server)
## SSH identity

The **primary deployment target** is a **Linux client** that connects over **SSH** to a **remote server** where the **AI stack** (Ollama, AnythingLLM, etc.) and **Git repositories** live. Install scripts in this repo apply mainly to that **server** (or to a LAN workstation that plays the same role). The client uses SSH (and optionally port forwarding) to reach services that bind to the server's loopback or internal interfaces. See [deployment-target.md](./deployment-target.md).
- Helper script: [`../setup/add-ssh-key.sh`](../setup/add-ssh-key.sh); other host scripts in [`../setup/`](../setup/).
- The user account on the server must be allowed to reach the paths where the agents, AnythingLLM, and project data live.

## LAN host roles (reference)
## Network

Private segment **192.168.1.0/24** (DHCP with MAC reservations). The table matches the host lists in `add-ssh-key.sh`.
- Services listening on the **server's** `127.0.0.1` are not reachable from the client without an **SSH tunnel** (`ssh -L …`), **ProxyJump**, VPN, or equivalent, depending on the LAN / bastion policy.
- Do not expose internal APIs (Local Office, `services/*` micro-services) in clear text on the Internet without a reverse proxy, TLS, and access control.

| IP | Role |
|----|------|
| 192.168.1.100 | Proxy / bastion (public entry via DynDNS `4nk.myftp.biz`) |
| 192.168.1.101 | test |
| 192.168.1.102 | pre-production |
| 192.168.1.103 | production |
| 192.168.1.104 | services (Git, Mempool, Rocket.Chat, …) |
| 192.168.1.105 | bitcoin |
| 192.168.1.173 | ia |
| 192.168.1.164 | Example workstation on LAN (included in `LAN_DIRECT` list) |

## Related documentation

Internet access to backends uses **SSH ProxyJump** via `ncantu@4nk.myftp.biz` (see `JUMP` in `add-ssh-key.sh`). On the same LAN, direct `ssh ncantu@192.168.1.x` is valid.

## Reverse proxy `ia.enso.4nkweb.com` (Ollama / AnythingLLM)

TLS hostname on the **proxy** `192.168.1.100`: prefixes `/ollama` and `/anythingllm` to the LAN host `192.168.1.164` (ports `11434` and `3001`, see `deploy/nginx/sites/ia.enso.4nkweb.com.conf`). **`/ollama/`** is protected by an nginx **Bearer** gate (`map` in `conf.d`); AnythingLLM stays behind its application auth.

Operational documentation: [deploy/nginx/README-ia-enso.md](../deploy/nginx/README-ia-enso.md). Feature note: [features/ia-enso-nginx-proxy-ollama-anythingllm.md](./features/ia-enso-nginx-proxy-ollama-anythingllm.md).

## Scripts (infrastructure / access)

### `add-ssh-key.sh`

Appends a fixed **Ed25519 public key** (comment `desk@desk`) to `~/.ssh/authorized_keys` on target hosts.

| Mode | When to use |
|------|-------------|
| Default | From a machine that can reach `JUMP` (`ncantu@4nk.myftp.biz`), then ProxyJump to each backend IP. |
| `LAN_DIRECT=1` | Same LAN: direct SSH to each IP in `LAN_IPS` (proxy, backends, `.164`). No bastion hostname. |
| `ADD_KEY_LOCAL=1` | Already logged in on the target host: update **current user** only (e.g. workstation `.164`). |

**Do not run with `sudo`:** the SSH client would use `/root/.ssh` and fail with `Permission denied (publickey)`.

**Environment (optional):** `JUMP`, `BACKEND_USER`, `SSH_IDENTITY_FILE`, `SSH_VERBOSE=1`, `EXTRA_LAN_IPS` (with `LAN_DIRECT=1`).

### `add-sudo-nopasswd-ncantu.sh`

One-time **root** execution: creates `/etc/sudoers.d/99-ncantu-nopasswd` with `ncantu ALL=(ALL) NOPASSWD: ALL`, `chmod 440`, `visudo -c`. Use only where this policy is explicitly required.

## Data paths (host)

| Path | Purpose |
|------|---------|
| `$HOME/anythingllm` | AnythingLLM Docker bind mount (storage + `.env`), default from `install-anythingllm-docker.sh` |
| `$HOME/.ssh/authorized_keys` | SSH access; updated by `add-ssh-key.sh` modes |

## Security notes

- SSH is key-based; the embedded key in `add-ssh-key.sh` is for a designated client (`desk@desk`). Rotate or replace it in the script if the key is compromised.
- Passwordless sudo reduces interactive friction and **increases** local privilege impact; scope it to trusted machines only.
- [deployment-target.md](./deployment-target.md)
- [services.md](./services.md)
- [system-architecture.md](./system-architecture.md)
docs/platform-target.md (new file, 84 lines)
@@ -0,0 +1,84 @@
# Online development platform — product target

This document sets the **overall vision** for the `smart_ide` monorepo: unified interface, local services, three operating environments, AI on a single host (common variant), SSO with docv, and the arbitration rule for an optional **browser service**.

## Goals

- **Software creation**: editing, Git, agents, scripts, and documentary memory in one coherent flow (intents, not the file explorer, as the nominal flow — see [ux-navigation-model.md](./ux-navigation-model.md)).
- **Documents + AI**: ONLYOFFICE (rich office suite), Local Office (programmatic docx API), AnythingLLM (per-project RAG), Ollama (local inference).
- **Automated learning / indexing**: **deterministic** pipelines (sync after pull, `.4nkaiignore`, an inspectable indexing log) — no opaque, unaudited "learning".
- **Product evolution**: versioned recipes, timeline (gateway, Git, and indexing events) — see [system-architecture.md](./system-architecture.md).
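The deterministic-indexing goal above can be sketched as a pure filter over `.4nkaiignore` patterns before syncing files to a workspace. Only simple suffix (`*.ext`) and directory-prefix patterns are handled here, which is an assumption; the real ignore syntax may be richer.

```typescript
// Filter candidate paths against simplified .4nkaiignore-style patterns.
function isIgnored(path: string, patterns: string[]): boolean {
  return patterns.some((p) => {
    if (p.startsWith("*")) return path.endsWith(p.slice(1)); // suffix, e.g. *.lock
    const dir = p.endsWith("/") ? p : p + "/";
    return path === p || path.startsWith(dir); // exact file or directory prefix
  });
}

function filesToIndex(paths: string[], patterns: string[]): string[] {
  // Sorted output keeps the indexing log reproducible run over run.
  return paths.filter((p) => !isIgnored(p, patterns)).sort();
}
```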
## Deployment variants

| Variant | Description | Doc |
|---------|-------------|-----|
| **Single AI machine** | Ollama and AnythingLLM on the **same host**; `smart_ide` services on that host or behind the same reverse proxy. Lapce and/or the web front end consume the APIs locally or over internal TLS. | This file; [deployment-target.md](./deployment-target.md) § variant |
| **Linux client + SSH** | Client workstation; AI stack and repos on a **remote server**; tunnels or VPN for the server-side "localhost" URLs. | [deployment-target.md](./deployment-target.md) |

Both variants can coexist across a team; the **environment matrix** (test / pprod / prod) applies in both cases.

## Three environments: test, pprod, prod

Each environment has its own **configuration** (unversioned: `.secrets/<env>/`, hosting variables):

| Parameter | Example distinction |
|-----------|---------------------|
| AnythingLLM public URL | Dedicated subdomain or path per env |
| AnythingLLM API keys | One key or key set per env |
| `REPOS_DEVTOOLS_ROOT`, micro-service tokens | Distinct Git root and secrets |
| Orchestrator / ia-dev-gateway URL | Host + port, or a route behind the gateway |
| SSO (OIDC) | Distinct OAuth client per env, or same IdP with a different `audience` / realm |
| CORS and reverse proxy | TLS everywhere; no HTTP bypass alternative |

The prod **guardrails** (policy, deployment rights, explicit refusals) are stricter than in test; the projects' business docs (`ia_dev` / `projects/<id>/`) remain the source of the real scripts.
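Per-environment configuration resolution can be sketched as a strict lookup with explicit refusal of unknown environments (matching the no-implicit-fallback rule). The URLs below are placeholders, not real hostnames.

```typescript
// Per-env config lookup, assuming the values are loaded from .secrets/<env>/;
// the URLs here are illustrative placeholders.
type Env = "test" | "pprod" | "prod";

interface EnvConfig {
  anythingllmUrl: string;
  orchestratorUrl: string;
}

const configs: Record<Env, EnvConfig> = {
  test:  { anythingllmUrl: "https://allm.test.example",  orchestratorUrl: "https://orch.test.example" },
  pprod: { anythingllmUrl: "https://allm.pprod.example", orchestratorUrl: "https://orch.pprod.example" },
  prod:  { anythingllmUrl: "https://allm.prod.example",  orchestratorUrl: "https://orch.prod.example" },
};

function configFor(env: string): EnvConfig {
  if (env !== "test" && env !== "pprod" && env !== "prod") {
    throw new Error(`unknown environment: ${env}`); // explicit refusal, no fallback
  }
  return configs[env];
}
```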
## Browser integration (services)

**Default**: do **not** embed Chromium / Playwright in the core of the other services. Previews and the AnythingLLM UI go through the **system browser** or a web tab of the shell (Lapce / front end).

**Open a dedicated** `browser-automation-api` **service** (future) only when one of these is needed: render capture, agent E2E, PDF snapshots, **allowlist** scraping, or visual tests independent of the user's workstation. Detailed criteria: [features/browser-automation-criteria.md](./features/browser-automation-criteria.md).

## SSO with docv (Enso)

The platform's **web front end** can authenticate against **docv** (Enso line) via **OpenID Connect**. Flow and contracts: [features/sso-docv-enso.md](./features/sso-docv-enso.md). The exact endpoints of the Enso repository will be pinned down once the docv code is available on the build machine.

## Reference technical chain

```mermaid
flowchart TB
    subgraph envs [Environnements]
      test[test]
      pprod[pprod]
      prod[prod]
    end
    subgraph ui [Interfaces]
      Lapce[Lapce_core_ide]
      Web[Web_front_SSO]
    end
    subgraph orch [Orchestration]
      Orch[smart_ide_orchestrator]
      IaGw[ia_dev_gateway]
    end
    subgraph ia_host [Hôte_IA_typique]
      Ollama[Ollama]
      ALLM[AnythingLLM]
      Micro[services_micro_HTTP]
    end
    envs --> ui
    Lapce --> Orch
    Web --> Orch
    Orch --> Micro
    Orch --> ALLM
    Orch --> IaGw
    IaGw --> Ollama
    ALLM --> Ollama
```

## Related documents

- [system-architecture.md](./system-architecture.md) — layers, gateway, agent registry
- [features/ia-dev-service.md](./features/ia-dev-service.md) — the `ia-dev-gateway` service
- [features/orchestrator-api.md](./features/orchestrator-api.md) — orchestrator HTTP contract
- [features/lapce-porting-roadmap.md](./features/lapce-porting-roadmap.md) — extension → Lapce porting
- [API/README.md](./API/README.md) — services API index
docs/services.md (122 lines changed)
@@ -1,111 +1,39 @@
# Services
# Services on the host (technical base)

## Systemd (local host)

- **Ollama:** `ollama.service` (official installer). Optional drop-in `OLLAMA_HOST=0.0.0.0:11434` for Docker — see `configure-ollama-for-docker.sh` and [systemd/README.md](../systemd/README.md).
- **AnythingLLM:** `anythingllm.service` — Docker container managed by systemd. Install: `sudo ./scripts/install-systemd-services.sh`. Config: `/etc/default/anythingllm` (template `systemd/anythingllm.default`).

```bash
sudo systemctl restart ollama anythingllm
sudo systemctl status ollama anythingllm
```

## Where these services run (first deployment)

For the **first deployment target**, Ollama and AnythingLLM run on the **remote SSH server** that hosts the AI stack and repositories, not necessarily on the user's Linux laptop. Access from the client may use **SSH local forwarding** or internal hostnames. See [deployment-target.md](./deployment-target.md).

## Overview

| Service | Delivery | Default URL / port | Config / persistence |
|---------|----------|--------------------|------------------------|
| Ollama | systemd (`ollama.service`) | `http://127.0.0.1:11434` (API) | Models under the Ollama data dir; listen address via systemd override |
| AnythingLLM | Docker (`mintplexlabs/anythingllm`) | `http://localhost:3001` | `$HOME/anythingllm` + `.env` bind-mounted; **one workspace per project** (see [anythingllm-workspaces.md](./anythingllm-workspaces.md)) |
| AnythingLLM Desktop | AppImage (optional) | local Electron app | User profile under `~/.config/anythingllm-desktop` (installer) |

This document describes the typical **software services** on the **host** (remote server **or** single AI machine — see [deployment-target.md](./deployment-target.md) and [platform-target.md](./platform-target.md)), complementing [system-architecture.md](./system-architecture.md). **Ollama** and **AnythingLLM** can share a host with the micro-services; the HTTP **orchestrator** ([features/orchestrator-api.md](./features/orchestrator-api.md)) and **`ia-dev-gateway`** ([features/ia-dev-service.md](./features/ia-dev-service.md)) are specified to unify calls from Lapce or the front end.

## Ollama

- **Install:** official script `https://ollama.com/install.sh` (used on target Ubuntu hosts).
- **Service:** `systemctl enable --now ollama` (handled by the installer).
- **Default bind:** loopback only (`127.0.0.1:11434`), which **blocks** Docker containers on the same host from calling Ollama.
- **Role**: local LLM inference.
- **Access**: URL/port configured on the host (often `127.0.0.1:11434` server-side); from the client, an SSH tunnel if needed.

### Expose Ollama to Docker on the same host

## AnythingLLM

Run **`configure-ollama-for-docker.sh`** as root (or equivalent):

- **Role**: RAG, documentary memory, **one workspace per project**.
- **Deployment**: usually Docker on the same host as the repositories; persistence paths on the host.
- Detail: [anythingllm-workspaces.md](./anythingllm-workspaces.md).

- Drop-in: `/etc/systemd/system/ollama.service.d/override.conf`
- `Environment="OLLAMA_HOST=0.0.0.0:11434"`
- `systemctl daemon-reload && systemctl restart ollama`

## ONLYOFFICE

Verify: `ss -tlnp | grep 11434` shows `*:11434`.

- **Role**: business office suite (documents, spreadsheets, presentations), rich editing.
- Used in the architecture's **doc-services** layer; not to be confused with Local Office.

### Models (reference)

## Local Office (`services/local-office/`)

- Embeddings for AnythingLLM + Ollama: `ollama pull nomic-embed-text`
- Custom name **`qwen3-code-webdev`:** not in the public Ollama library as-is; this repo includes `Modelfile-qwen3-code-webdev` defining an alias (default base: `qwen3-coder:480b-cloud`). Rebuild with `ollama create qwen3-code-webdev -f Modelfile-qwen3-code-webdev` after editing `FROM`.
- **Role**: a **REST API** for third-party applications or agents: Office file upload, metadata, download, **programmatic commands** on **docx** files (`replaceText`, `insertParagraph`). File storage + SQLite; isolation by `X-API-Key`; rate limiting.
- **Location in the monorepo**: [`../services/local-office/`](../services/local-office/).
- **Documentation**: [features/local-office.md](./features/local-office.md) and [services/local-office/README.md](../services/local-office/README.md).
- **Security**: set `API_KEYS` via environment variables or files outside the repo (see `.env.example` in `services/local-office/`). In production, prefer **binding to `127.0.0.1`** behind a TLS reverse proxy over an exposed `0.0.0.0`.
- **Functional scope**: command-driven **docx** editing; xlsx/pptx can be stored, but edit commands may return **400** in the current implementation.
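A Local Office command call can be sketched as a typed payload builder. The command names match the doc (`replaceText`, `insertParagraph`), but the route, field names, and body shape are guesses; the authoritative contract is in `services/local-office/README.md`.

```typescript
// Hypothetical request shape for Local Office docx commands; only the
// command names come from the documentation, the rest is an assumption.
type DocxCommand =
  | { op: "replaceText"; search: string; replace: string }
  | { op: "insertParagraph"; text: string; position: "start" | "end" };

function buildCommandRequest(fileId: string, apiKey: string, commands: DocxCommand[]) {
  return {
    path: `/files/${fileId}/commands`, // hypothetical route
    headers: { "X-API-Key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ commands }),
  };
}
```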
## AnythingLLM (Docker)

## HTTP micro-services under `services/`

### Workspaces and projects

Helper services on **`127.0.0.1`** (usually **Bearer** auth): Git devtools, LangExtract, regex search, claw proxy, **`ia-dev-gateway`** (agents / runs stub), **`smart-ide-orchestrator`** (intent routing) — see the table in [system-architecture.md](./system-architecture.md), the **API reference** in [`API/README.md`](./API/README.md), and each subfolder README under [`../services/`](../services/).

AnythingLLM is used with **dedicated workspaces per project** so RAG memory, documents, and threads stay isolated. A **sync job** ("moulinette") keeps selected repository files aligned with each workspace. Operational rules: [anythingllm-workspaces.md](./anythingllm-workspaces.md).

## Related documentation

**Script:** `install-anythingllm-docker.sh`

- **Image:** `mintplexlabs/anythingllm` (override with `ANYTHINGLLM_IMAGE`).
- **Container name:** `anythingllm` (override with `ANYTHINGLLM_CONTAINER_NAME`).
- **Ports:** `HOST_PORT:3001` (default `3001:3001`).
- **Capabilities:** `--cap-add SYS_ADMIN` (Chromium / document features in container).
- **Networking:** `--add-host=host.docker.internal:host-gateway` so the app can reach Ollama on the host at `http://host.docker.internal:11434` once `OLLAMA_HOST` is set as above.
- **Volumes:**
  - `${STORAGE_LOCATION}:/app/server/storage`
  - `${STORAGE_LOCATION}/.env:/app/server/.env`

Re-running the script **removes** the existing container by name and starts a new one; data remains in `STORAGE_LOCATION` if the bind path is unchanged.

### Configure LLM provider (Ollama)

In `$STORAGE_LOCATION/.env` (mounted into the container), set at minimum:

- `LLM_PROVIDER='ollama'`
- `OLLAMA_BASE_PATH='http://host.docker.internal:11434'`
- `OLLAMA_MODEL_PREF='<model name>'` (e.g. `qwen3-code-webdev`)
- `EMBEDDING_ENGINE='ollama'`
- `EMBEDDING_BASE_PATH='http://host.docker.internal:11434'`
- `EMBEDDING_MODEL_PREF='nomic-embed-text:latest'`
- `VECTOR_DB='lancedb'` (default stack)

See upstream `.env.example`:
<https://raw.githubusercontent.com/Mintplex-Labs/anything-llm/master/docker/.env.example>

After editing `.env`, restart the container: `docker restart anythingllm`.

## AnythingLLM Desktop (AppImage)

**Script:** `installer.sh` — downloads the official AppImage, optional AppArmor profile, `.desktop` entry. Interactive prompts; not a headless service.

- Documentation: <https://docs.anythingllm.com>
- Use **either** Docker **or** Desktop on the same machine if you want to avoid conflicting ports and duplicate workspaces.

## Operational checks

```bash
systemctl is-active ollama
curl -sS http://127.0.0.1:11434/api/tags | head
docker ps --filter name=anythingllm
docker exec anythingllm sh -c 'curl -sS http://host.docker.internal:11434/api/tags | head'
```

The last command must succeed after `OLLAMA_HOST=0.0.0.0:11434` and `host.docker.internal` are configured.

## Public reverse proxy (ia.enso.4nkweb.com)

When Ollama runs on a LAN host (e.g. `192.168.1.164` via `IA_ENSO_BACKEND_IP` / `deploy/nginx/sites/ia.enso.4nkweb.com.conf`) and must be reached via the **proxy** with HTTPS and a **Bearer** gate on `/ollama/`, use `deploy/nginx/` and **[deploy/nginx/README-ia-enso.md](../deploy/nginx/README-ia-enso.md)** (script `deploy-ia-enso-to-proxy.sh`, checks, troubleshooting).

**Full URLs**

- AnythingLLM UI: `https://ia.enso.4nkweb.com/anythingllm/`
- Ollama native API example: `https://ia.enso.4nkweb.com/ollama/api/tags` (header `Authorization: Bearer <secret>`)
- Cursor / OpenAI-compatible base URL: `https://ia.enso.4nkweb.com/ollama/v1`
- Cursor API key: same value as the Bearer secret in the nginx `map`

Feature note: [ia-enso-nginx-proxy-ollama-anythingllm.md](./features/ia-enso-nginx-proxy-ollama-anythingllm.md).

- [platform-target.md](./platform-target.md)
- [features/local-office.md](./features/local-office.md)
- [system-architecture.md](./system-architecture.md)
- [anythingllm-workspaces.md](./anythingllm-workspaces.md)
- [API/README.md](./API/README.md)
@@ -1,5 +1,52 @@
# System architecture — IDE, agents, runtime, memory

Multi-environment product view, SSO and browser option: [platform-target.md](./platform-target.md).

## Project goals (recap)

- **Interaction**: an environment driven by **intentions** and **operations** (command grammar), not file navigation as the nominal flow; expert mode for the file tree. See [ux-navigation-model.md](./ux-navigation-model.md).
- **AI and memory**: **Ollama** for local inference; **AnythingLLM** for RAG and **per-project** document memory; the **`ia_dev` agents** remain the business and operational core.
- **Guardrails**: **OpenShell** / policy-runtime (sandboxes, named rights, explicit refusals, no unspecified implicit fallback).
- **Editing**: **Lapce** as the **application base** under **`core_ide/`** (local clone, outside the Git index) — see [core-ide.md](./core-ide.md).
- **Document workflows**: **ONLYOFFICE** for rich office flows; **`services/local-office/`** for a **programmatic API** (docx via commands, local storage, API keys) — see [features/local-office.md](./features/local-office.md).
- **Deployment target**: either a **Linux client** workstation plus **SSH** to a server (AI base + repos), or a **single AI machine** (Ollama and AnythingLLM on the same host as the services). Both variants are described in [deployment-target.md](./deployment-target.md). The **test / pprod / prod** deployments: [platform-target.md](./platform-target.md).

## Single monorepo

The **source of truth** for the ecosystem described here is **one Git repository** (`smart_ide`): specs, local services, scripts, extensions, documentation, and the vendored editor tree. **4NK products and deliverables are not hosted on GitHub**; the canonical forge is **internal** (e.g. Gitea). Public repositories (Lapce, Python libraries, etc.) are only potential **upstreams** for import or review, not mandatory publication targets.

Consequences:

- The directories under `services/` share the **same lifecycle** as the rest of the monorepo (review, deployment, systemd).
- **`core_ide/`** is a **local clone** of the **Lapce** editor (application base), present **in the monorepo tree** on disk; it is **excluded from the parent's Git index** because of its size (see the root `.gitignore`). Updates: procedure in [core-ide.md](./core-ide.md).
- `ia_dev` is currently a **submodule** pointing at the 4NK forge ([ia_dev-submodule.md](./ia_dev-submodule.md)). An HTTP service, **`ia-dev-gateway`** ([features/ia-dev-service.md](./features/ia-dev-service.md)), will expose the agent registry and agent runs; the trajectory is documented in the submodule. If the "single history" policy becomes strict, one possible path is to **merge** the agent content into this repository (to be planned), keeping the same logical folder separation.
- HTTP **orchestrator**: [features/orchestrator-api.md](./features/orchestrator-api.md) — stub server under `services/smart-ide-orchestrator/`; routes intentions to Ollama, AnythingLLM, the micro-services and `ia-dev-gateway` (HTTP forwarding still to be completed).

## Resource map (tree)

| Path | Role in the architecture |
|------|--------------------------|
| `docs/`, `docs/features/` | Technical and feature documentation; single entry point for specs |
| `services/langextract-api/` | **Structured extraction** from text (LLM), local HTTP API for the gateway / tools |
| `services/agent-regex-search-api/` | **Regex search** over files via ripgrep, scope bounded by `REGEX_SEARCH_ROOT` |
| `services/claw-harness-api/` | Optional multi-provider **harness** (claw-code upstream) + proxy; templates **without Anthropic** |
| `services/repos-devtools-server/` | Local HTTP **Git tooling** (clone, list, load repositories under a controlled root) |
| `core_ide/` | **Lapce sources** — application base (editor build, customisations) — upstream clone, outside the parent index |
| `extensions/anythingllm-workspaces/` | Tools / templates aligned with AnythingLLM and per-project workspaces |
| `scripts/`, `setup/`, `systemd/` | Host installation, operations scripts, user units for the services |
| `services/local-office/` | Office **REST API** (upload, docx commands, SQLite + file storage); programmatic complement to ONLYOFFICE |
| `ia_dev/` | Agents, `projects/<id>/`, deployments — execution under policy; see the submodule |
| `services/ia-dev-gateway/` | HTTP gateway (stub runner): `.md` agent registry, runs, SSE — [features/ia-dev-service.md](./features/ia-dev-service.md) |
| `services/smart-ide-orchestrator/` | Intention routing (stub forward) — [features/orchestrator-api.md](./features/orchestrator-api.md) |

## Test, pprod and prod environments

Each environment has its own **URLs**, **secrets** and **policies** (AnythingLLM, micro-service tokens, front-end OIDC — docv). No sensitive configuration in the repository: `.secrets/<env>/` or hosting variables. Details: [platform-target.md](./platform-target.md).

## `ia_dev` checkout in this repository

The [**ia_dev**](https://git.4nkweb.com/4nk/ia_dev.git) repository is attached to `smart_ide` as a **Git submodule** under `./ia_dev` (4NK forge, not GitHub). On the SSH server, the **agent gateway** and the tools can point at this path as the agents' execution root (every script invoked from the `ia_dev` root, as documented upstream). See [ia_dev-submodule.md](./ia_dev-submodule.md).

## Physical layout (first target)

For the **first deployment**, a **Linux workstation** (client) opens **SSH** sessions to a **server** that concentrates:

@@ -21,7 +68,14 @@ The **editor** and part of the UX can stay on the client; the
| **AnythingLLM** | **Document memory** and RAG; **one workspace per project** ([anythingllm-workspaces.md](./anythingllm-workspaces.md)) |
| **ONLYOFFICE** | **Business document** backend (documents, spreadsheets, presentations) |

Typical flow: user request → orchestrator → preparation (generic scripts / tools) → agents → LLM need → Ollama; doc / RAG need → AnythingLLM; office need → ONLYOFFICE.
Typical flow: user request → orchestrator → preparation (generic scripts / tools) → agents → LLM need → Ollama; doc / RAG need → AnythingLLM; rich office need → ONLYOFFICE; **Office file manipulated through an API** (third parties, scripts, agents) → **`services/local-office/`** (Local Office).

Possible routing enrichments (without changing the principle "the orchestrator is flow logic, not an LLM"):

- **typed entities in text** (evidence, business fields, grounding) → **`langextract-api`**, then business post-processing;
- **symbolic search** (symbols, patterns, operational semantic grep) → **`agent-regex-search-api`**, within the allowed root;
- a unified **multi-model harness runtime** (excluding Anthropic where project policy requires) → **`claw-harness-api`** / upstream binary + proxy;
- **cloning or refreshing** a repository inside the allowed code space → **`repos-devtools-server`**.
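The routing above can be sketched as a plain lookup table (the intent labels and the `route` helper are illustrative assumptions; only the service names come from this document):

```javascript
// Illustrative intent -> service table, mirroring the routing list above.
const ROUTES = {
  llm: "ollama",
  doc_rag: "anythingllm",
  office_rich: "onlyoffice",
  office_api: "local-office",
  typed_entities: "langextract-api",
  symbol_search: "agent-regex-search-api",
  harness: "claw-harness-api",
  repo_sync: "repos-devtools-server",
};

// No unspecified implicit fallback: unknown intents are refused explicitly.
const route = (intent) => {
  const target = ROUTES[intent];
  if (!target) throw new Error(`refused: no route for intent "${intent}"`);
  return target;
};

console.log(route("symbol_search")); // → agent-regex-search-api
```

The explicit refusal branch is the important design choice: the orchestrator stays deterministic flow logic, not an LLM guessing a fallback.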

## Orchestrator — routing decisions

@@ -33,6 +87,7 @@ Decide in particular:
- when `ia_dev` may **escalate** to Ollama;
- when to query **AnythingLLM**;
- when to go through **ONLYOFFICE**;
- when to go through **Local Office** (`services/local-office/`) (programmatic editing / lightweight third-party integration);
- when to **refuse**.

## Stable descriptor per agent

@@ -54,7 +109,7 @@ Uniform **event-based** output to the editor; **logging**;
- Each new resolution should be able to become a stable **recipe**, **tool** or **sub-agent** (the job of the high-level UX commands).
- Agents should **not** execute directly on the host without control: **sandboxes** with rights derived from the **agent type** and the **project**.

Example **policy profiles**: read-only; read + local scripts; bounded write; pprod / prod deployment; document generation; ticket access; database access; ONLYOFFICE access.
Example **policy profiles**: read-only; read + local scripts; bounded write; pprod / prod deployment; document generation; ticket access; database access; ONLYOFFICE access; **Local Office API access** (dedicated API key, network scope).

**No unspecified implicit fallback**: explicit refusal or error according to the project rules.

@@ -68,7 +123,7 @@ Without this layer, the IDE stays dependent on **implicit conventions** of the

## Agent gateway (adapter)

Do not wire the `ia_dev` repository directly into the editor: go through an **agent gateway** which:
Do not wire the `ia_dev` repository directly into the editor: go through an **agent gateway** (target implementation: the **`ia-dev-gateway`** service, [API/ia-dev-gateway.md](./API/ia-dev-gateway.md)) which:

1. Loads the **registry**
2. **Validates** permissions
@@ -90,18 +145,67 @@ Do not wire the `ia_dev` repository directly into the editor: go through a
| **agent-gateway** | Uniform UX ↔ `ia_dev` adapter |
| **policy-runtime** | OpenShell, policy profiles, providers, sandboxes, logs |
| **knowledge-services** | AnythingLLM, project memory, document index, RAG routing |
| **doc-services** | ONLYOFFICE, `present`, `write`, `sheet` flows |
| **doc-services** | ONLYOFFICE, `present`, `write`, `sheet` flows; **Local Office** (`services/local-office/`, upload / docx command API) |

The **agents** remain the operational core; the modules frame and expose them.

## Local HTTP micro-services (extensions)

Services listening on **`127.0.0.1`** (usually behind **`Authorization: Bearer`**) for Lapce (once wired), scripts, the gateway or the agents:

| Service | Role | Doc |
|---------|------|-----|
| **repos-devtools-server** | Git: clone / list / load under `REPOS_DEVTOOLS_ROOT` | [services/repos-devtools-server/README.md](../services/repos-devtools-server/README.md) |
| **langextract-api** | Structured extraction from text (wrapper around [LangExtract](https://github.com/google/langextract)) | [features/langextract-api.md](./features/langextract-api.md) |
| **claw-harness-api** | Upstream harness: build / policy without Anthropic; optional proxy to a claw-code HTTP server | [features/claw-harness-api.md](./features/claw-harness-api.md) |
| **agent-regex-search-api** | Regex search over files via **ripgrep** under `REGEX_SEARCH_ROOT` | [features/agent-regex-search-api.md](./features/agent-regex-search-api.md) |
| **local-office** | Office REST API: upload, docx commands; auth via **`X-API-Key`** (not Bearer) | [features/local-office.md](./features/local-office.md) , [services/local-office/README.md](../services/local-office/README.md) |
| **ia-dev-gateway** | Agent registry, stub runs, SSE stream (Node/TS) | [features/ia-dev-service.md](./features/ia-dev-service.md) , [API/ia-dev-gateway.md](./API/ia-dev-gateway.md) , [services/ia-dev-gateway/README.md](../services/ia-dev-gateway/README.md) |
| **smart_ide-orchestrator** | Intention routing, timeline (Node/TS; forwarding to be completed) | [features/orchestrator-api.md](./features/orchestrator-api.md) , [API/orchestrator.md](./API/orchestrator.md) , [services/smart-ide-orchestrator/README.md](../services/smart-ide-orchestrator/README.md) |

These services are **adapters**: no product business logic beyond input validation and the call to the engine (Git, `rg`, LangExtract, proxy, file / docx storage). The **policy** (who may call what, on which path) remains the responsibility of the **policy-runtime** and the **gateway**. **Local Office** follows the same logic; only its auth scheme differs (`X-API-Key` vs `Authorization: Bearer` for the other rows of the table).
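A minimal sketch of that auth split (the header names come from this section; the helper function itself is illustrative, not a shipped API):

```javascript
// Local Office authenticates with X-API-Key; the other local services use Bearer.
const authHeaders = (service, credential) =>
  service === "local-office"
    ? { "X-API-Key": credential }
    : { Authorization: `Bearer ${credential}` };

console.log(authHeaders("local-office", "key-123"));
console.log(authHeaders("agent-regex-search-api", "token-456"));
```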

## Monorepo logical view (excerpts)

```mermaid
flowchart TB
  subgraph monorepo ["smart_ide monorepo"]
    docs[docs]
    svc[services]
    coreIde[core_ide]
    ia[ia_dev_submodule]
    ext[extensions]
    scr[scripts_setup_systemd]
  end
  subgraph svcDetail [services_detail]
    rds[repos-devtools-server]
    le[langextract-api]
    rg[agent-regex-search-api]
    ch[claw-harness-api]
    lo[local-office]
    iagw[ia-dev-gateway]
  end
  svc --> svcDetail
  gateway[agent_gateway_HTTP]
  orch[orchestrator_HTTP]
  gateway --> svcDetail
  gateway --> iagw
  orch --> gateway
  orch --> lo
  orch --> iagw
  coreIde --> editor[editor-shell_Lapce_build]
```

An embedded **browser** service (Chromium / Playwright) is **not** required by default; introduction criteria: [features/browser-automation-criteria.md](./features/browser-automation-criteria.md).

## Application base: Lapce (`core_ide/`)

Lapce's **sources** live under **`core_ide/`** (clone of the public upstream, Apache-2.0) — the IDE's **application base**. The **binary** or package installed for the user is produced by a **local build** from this tree (or shipped as a system package, per team policy). Clone, update and build: [core-ide.md](./core-ide.md). UX customisation (intentions, calls to the micro-services) happens in the **editor-shell / orchestrator** layer and does not require a second product repository. Porting the AnythingLLM VS Code extension to Lapce: [features/lapce-porting-roadmap.md](./features/lapce-porting-roadmap.md).

## UX — hiding the agents

The user does not "pick an agent" in the nominal flow: they express an **intention** (`ask`, `fix`, …). The **router** selects the agent or the agent chain.

## Editor base: Lapce

**Lapce** (open source, Rust, native / GPU rendering) is the candidate chosen for a **fast, lightweight editor** with agents, instead of a heavily loaded legacy IDE. This positioning matches the "shell + orchestration + contextual transparency" role described in [ux-navigation-model.md](./ux-navigation-model.md).

## Rights taxonomy

Rights must be **named**, **verifiable** and **traceable** (linked to OpenShell and the agent registry). No bypass by default.
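As a hypothetical illustration of "named, verifiable, traceable" (the profile and right names are made up for the example; this is not OpenShell's actual API):

```javascript
// Hypothetical named-rights check: every decision is named and logged.
const PROFILES = {
  "read-only": ["fs.read"],
  "bounded-write": ["fs.read", "fs.write:project"],
};
const auditLog = [];

const can = (profile, right) => {
  const granted = (PROFILES[profile] ?? []).includes(right); // verifiable: explicit list
  auditLog.push({ profile, right, granted });                // traceable: decision recorded
  return granted;                                            // no bypass by default
};

console.log(can("read-only", "fs.write:project")); // refused for this profile
```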

@@ -49,7 +49,7 @@ The user permanently sees **what they are working on**: current project,

### Structured search (beyond file grep)

Ability to search across: code, **symbols**, recipes, tools, **session histories**, tickets, documents, logs, ONLYOFFICE artefacts, agent outputs, **AnythingLLM memory**.
Ability to search across: code, **symbols**, recipes, tools, **session histories**, tickets, documents, logs, ONLYOFFICE artefacts, **metadata / references to documents managed by Local Office** (the `services/local-office/` API), agent outputs, **AnythingLLM memory**.

### Artefacts view

ia_dev (Submodule, 1 line)
@@ -0,0 +1 @@
Subproject commit e8c0db220005ca7e670e496931e293b57bc63d9c
scripts/anythingllm-docker-exec.sh (Executable file, 23 lines)
@@ -0,0 +1,23 @@
#!/bin/bash
# Run AnythingLLM container in foreground (for systemd Type=simple).
# Environment: ANYTHINGLLM_STORAGE, ANYTHINGLLM_PORT, ANYTHINGLLM_IMAGE, ANYTHINGLLM_NAME

set -euo pipefail

STORAGE="${ANYTHINGLLM_STORAGE:-/home/ncantu/anythingllm}"
PORT="${ANYTHINGLLM_PORT:-3001}"
IMAGE="${ANYTHINGLLM_IMAGE:-mintplexlabs/anythingllm}"
NAME="${ANYTHINGLLM_NAME:-anythingllm}"

mkdir -p "${STORAGE}"
touch "${STORAGE}/.env"

exec docker run --rm \
  --name "${NAME}" \
  -p "${PORT}:3001" \
  --cap-add SYS_ADMIN \
  --add-host=host.docker.internal:host-gateway \
  -v "${STORAGE}:/app/server/storage" \
  -v "${STORAGE}/.env:/app/server/.env" \
  -e STORAGE_DIR=/app/server/storage \
  "${IMAGE}"
scripts/anythingllm-pull-sync/README.md (Normal file, 58 lines)
@@ -0,0 +1,58 @@
# anythingllm-pull-sync

Runs after **`git pull`** (Git hook **`post-merge`**) to upload the **files changed** between `ORIG_HEAD` and `HEAD` to an AnythingLLM workspace via `POST /api/v1/document/upload`.

## Requirements

- AnythingLLM **collector / document processor** online.
- Same **`.4nkaiignore`** rules as the VS Code extension (repo root).
- **Environment** (see below): base URL, API key, workspace slug (or `.anythingllm.json`).

## Environment

| Variable | Required | Description |
|----------|----------|-------------|
| `ANYTHINGLLM_BASE_URL` | yes | e.g. `https://ia.enso.4nkweb.com/anythingllm` (no trailing `/`) |
| `ANYTHINGLLM_API_KEY` | yes | Developer API key (Settings → API Keys) |
| `ANYTHINGLLM_WORKSPACE_SLUG` | no* | Workspace slug |
| `ANYTHINGLLM_SYNC_MAX_FILES` | no | Default `200` per run |
| `ANYTHINGLLM_SYNC_MAX_FILE_BYTES` | no | Default `5242880` (5 MiB) |

\* If unset, the script reads **`repo/.anythingllm.json`**: `{ "workspaceSlug": "my-slug" }`.

Optional: create **`~/.config/4nk/anythingllm-sync.env`**:

```sh
export ANYTHINGLLM_BASE_URL='https://ia.enso.4nkweb.com/anythingllm'
export ANYTHINGLLM_API_KEY='…'
# export ANYTHINGLLM_WORKSPACE_SLUG='algo' # optional if .anythingllm.json exists
```

The generated hook sources this file when present.

## Install the hook in a repository

From the machine that has **`smart_ide`**:

```bash
/home/ncantu/code/smart_ide/scripts/install-anythingllm-post-merge-hook.sh /home/ncantu/code/algo
/home/ncantu/code/smart_ide/scripts/install-anythingllm-post-merge-hook.sh /home/ncantu/code/builazoo
```

Then once per machine:

```bash
cd /home/ncantu/code/smart_ide/scripts/anythingllm-pull-sync && npm install
```

## Behaviour

- **Only** paths produced by `git diff --name-only --diff-filter=ACMRT ORIG_HEAD HEAD` (added/changed files, not deletions).
- If `ORIG_HEAD` or the env/slug is missing, the script **exits 0** and prints a message (your pull is not blocked).
- **Deletions / renames** are not mirrored as removals in AnythingLLM in this version (upload-only).
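One practical detail when looking for these uploads in the workspace: `sync.mjs` flattens each repo-relative path into the upload file name by replacing `/` with `__`:

```javascript
// Same mapping as in sync.mjs: POSIX repo path -> flat upload name.
const toUploadName = (posixPath) => posixPath.split("/").join("__");

console.log(toUploadName("docs/features/local-office.md"));
// → docs__features__local-office.md
```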

## Uninstall

```bash
rm -f /path/to/repo/.git/hooks/post-merge
```
scripts/anythingllm-pull-sync/package-lock.json (generated, Normal file, 25 lines)
@@ -0,0 +1,25 @@
{
  "name": "@4nk/anythingllm-pull-sync",
  "version": "0.1.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "@4nk/anythingllm-pull-sync",
      "version": "0.1.0",
      "license": "MIT",
      "dependencies": {
        "ignore": "^5.3.2"
      }
    },
    "node_modules/ignore": {
      "version": "5.3.2",
      "resolved": "https://registry.npmjs.org/ignore/-/ignore-5.3.2.tgz",
      "integrity": "sha512-hsBTNUqQTDwkWtcdYI2i06Y/nUBEsNEDJKjWdigLvegy8kDuJAS8uRlpkkcQpyEXL0Z/pjDy5HBmMjRCJ2gq+g==",
      "license": "MIT",
      "engines": {
        "node": ">= 4"
      }
    }
  }
}
scripts/anythingllm-pull-sync/package.json (Normal file, 11 lines)
@@ -0,0 +1,11 @@
{
  "name": "@4nk/anythingllm-pull-sync",
  "private": true,
  "version": "0.1.0",
  "type": "module",
  "description": "Post-pull sync of changed files to AnythingLLM (multipart upload API).",
  "license": "MIT",
  "dependencies": {
    "ignore": "^5.3.2"
  }
}
scripts/anythingllm-pull-sync/sync.mjs (Executable file, 206 lines)
@@ -0,0 +1,206 @@
#!/usr/bin/env node
/**
 * Upload files changed between ORIG_HEAD and HEAD to AnythingLLM (post-merge / after pull).
 * Requires: ANYTHINGLLM_BASE_URL, ANYTHINGLLM_API_KEY, workspace slug via ANYTHINGLLM_WORKSPACE_SLUG or .anythingllm.json
 */
import { execFileSync } from "node:child_process";
import * as fs from "node:fs";
import * as fsPromises from "node:fs/promises";
import * as path from "node:path";
import { fileURLToPath } from "node:url";
import ignore from "ignore";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const ALWAYS_IGNORE = [".git/", "node_modules/", "**/node_modules/"].join("\n");

const readJson = (p) => {
  const raw = fs.readFileSync(p, "utf8");
  return JSON.parse(raw);
};

const git = (repoRoot, args) => {
  return execFileSync("git", args, {
    cwd: repoRoot,
    encoding: "utf8",
    stdio: ["ignore", "pipe", "pipe"],
  }).trim();
};

const parseArgs = () => {
  const out = { repoRoot: process.cwd() };
  const argv = process.argv.slice(2);
  for (let i = 0; i < argv.length; i += 1) {
    if (argv[i] === "--repo-root" && argv[i + 1]) {
      out.repoRoot = path.resolve(argv[i + 1]);
      i += 1;
    }
  }
  return out;
};

const loadWorkspaceSlug = (repoRoot) => {
  const env = process.env.ANYTHINGLLM_WORKSPACE_SLUG?.trim();
  if (env) {
    return env;
  }
  const cfgPath = path.join(repoRoot, ".anythingllm.json");
  try {
    const j = readJson(cfgPath);
    if (typeof j.workspaceSlug === "string" && j.workspaceSlug.trim().length > 0) {
      return j.workspaceSlug.trim();
    }
  } catch {
    /* missing */
  }
  return "";
};

const normalizeApiKey = (raw) => {
  const t = raw.trim();
  const m = /^Bearer\s+/i.exec(t);
  return m ? t.slice(m[0].length).trim() : t;
};

const uploadOne = async (baseUrl, apiKey, slug, absPath, uploadName) => {
  const root = baseUrl.replace(/\/+$/, "");
  const buf = await fsPromises.readFile(absPath);
  const body = new FormData();
  body.append("file", new Blob([buf]), uploadName);
  body.append("addToWorkspaces", slug);
  const res = await fetch(`${root}/api/v1/document/upload`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body,
  });
  const text = await res.text();
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch {
    throw new Error(`non-JSON ${res.status}: ${text.slice(0, 200)}`);
  }
  if (!res.ok || parsed.success !== true) {
    throw new Error(`${res.status}: ${text.slice(0, 400)}`);
  }
};

const main = async () => {
  const { repoRoot } = parseArgs();
  const baseUrl = process.env.ANYTHINGLLM_BASE_URL?.trim() ?? "";
  const apiKeyRaw = process.env.ANYTHINGLLM_API_KEY?.trim() ?? "";
  const maxBytes = Number(process.env.ANYTHINGLLM_SYNC_MAX_FILE_BYTES ?? 5242880);
  const maxFiles = Number(process.env.ANYTHINGLLM_SYNC_MAX_FILES ?? 200);

  if (!baseUrl || !apiKeyRaw) {
    console.error(
      "anythingllm-pull-sync: missing ANYTHINGLLM_BASE_URL or ANYTHINGLLM_API_KEY — skip.",
    );
    process.exit(0);
  }
  const apiKey = normalizeApiKey(apiKeyRaw);
  const slug = loadWorkspaceSlug(repoRoot);
  if (!slug) {
    console.error(
      "anythingllm-pull-sync: set ANYTHINGLLM_WORKSPACE_SLUG or .anythingllm.json { \"workspaceSlug\": \"…\" } — skip.",
    );
    process.exit(0);
  }

  try {
    git(repoRoot, ["rev-parse", "-q", "--verify", "ORIG_HEAD"]);
  } catch {
    console.error("anythingllm-pull-sync: no ORIG_HEAD (not a merge/pull) — skip.");
    process.exit(0);
  }

  let names;
  try {
    const out = git(repoRoot, [
      "diff",
      "--name-only",
      "--diff-filter=ACMRT",
      "ORIG_HEAD",
      "HEAD",
    ]);
    names = out.length > 0 ? out.split("\n").filter(Boolean) : [];
  } catch (e) {
    console.error("anythingllm-pull-sync: git diff failed — skip.", e.message);
    process.exit(0);
  }

  if (names.length === 0) {
    console.error("anythingllm-pull-sync: no file changes between ORIG_HEAD and HEAD.");
    process.exit(0);
  }

  const ignorePath = path.join(repoRoot, ".4nkaiignore");
  let userRules = "";
  try {
    userRules = await fsPromises.readFile(ignorePath, "utf8");
  } catch {
    userRules = "";
  }
  const ig = ignore();
  ig.add(ALWAYS_IGNORE);
  ig.add(userRules);

  let uploaded = 0;
  let skipped = 0;
  const errors = [];

  for (const rel of names) {
    if (rel.includes("..") || path.isAbsolute(rel)) {
      skipped += 1;
      continue;
    }
    const posix = rel.split(path.sep).join("/");
    if (ig.ignores(posix)) {
      skipped += 1;
      continue;
    }
    const abs = path.join(repoRoot, rel);
    let st;
    try {
      st = await fsPromises.stat(abs);
    } catch {
      skipped += 1;
      continue;
    }
    if (!st.isFile()) {
      skipped += 1;
      continue;
    }
    if (st.size > maxBytes) {
      skipped += 1;
      continue;
    }
    if (uploaded >= maxFiles) {
      console.error("anythingllm-pull-sync: cap reached (ANYTHINGLLM_SYNC_MAX_FILES).");
      break;
    }
    const uploadName = posix.split("/").join("__");
    try {
      await uploadOne(baseUrl, apiKey, slug, abs, uploadName);
      uploaded += 1;
    } catch (e) {
      errors.push(`${posix}: ${e instanceof Error ? e.message : String(e)}`);
    }
  }

  console.error(
    `anythingllm-pull-sync: uploaded=${uploaded} skipped=${skipped} errors=${errors.length}`,
  );
  for (const line of errors.slice(0, 20)) {
    console.error(line);
  }
  if (errors.length > 20) {
    console.error(`… ${errors.length - 20} more`);
  }
  process.exit(0);
};

main().catch((e) => {
  console.error("anythingllm-pull-sync:", e);
  process.exit(1);
});
scripts/install-anythingllm-post-merge-hook.sh (Executable file, 42 lines)
@@ -0,0 +1,42 @@
#!/usr/bin/env bash
set -euo pipefail

if [[ $# -lt 1 ]]; then
  echo "Usage: $0 <path-to-git-repo> [<path-to-smart_ide>]" >&2
  exit 1
fi

REPO=$(cd "$1" && pwd)
SMART_IDE_ROOT=${2:-}
if [[ -z "$SMART_IDE_ROOT" ]]; then
  SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
  SMART_IDE_ROOT=$(cd "$SCRIPT_DIR/.." && pwd)
fi

SYNC_DIR="$SMART_IDE_ROOT/scripts/anythingllm-pull-sync"
HOOK="$REPO/.git/hooks/post-merge"

if [[ ! -d "$REPO/.git" ]]; then
  echo "Not a git repository: $REPO" >&2
  exit 1
fi

if [[ ! -f "$SYNC_DIR/sync.mjs" ]]; then
  echo "Missing $SYNC_DIR/sync.mjs" >&2
  exit 1
fi

mkdir -p "$(dirname "$HOOK")"
cat >"$HOOK" <<EOF
#!/usr/bin/env sh
# Installed by install-anythingllm-post-merge-hook.sh — AnythingLLM upload after pull (post-merge)
REPO_ROOT=\$(git rev-parse --show-toplevel)
if [ -f "\${HOME}/.config/4nk/anythingllm-sync.env" ]; then
  # shellcheck source=/dev/null
  . "\${HOME}/.config/4nk/anythingllm-sync.env"
fi
exec node "$SYNC_DIR/sync.mjs" --repo-root "\$REPO_ROOT"
EOF
chmod +x "$HOOK"
echo "Installed post-merge hook: $HOOK"
echo "Run: (cd $SYNC_DIR && npm install) if node_modules is missing."
scripts/install-systemd-services.sh (Executable file, 27 lines)
@@ -0,0 +1,27 @@
#!/bin/bash
# Install AnythingLLM systemd unit and helper script. Ollama is managed by the official
# ollama.service (this script only ensures it is enabled).
# Run: sudo ./install-systemd-services.sh

set -euo pipefail

if [ "$(id -u)" -ne 0 ]; then
  echo "Run as root: sudo $0" >&2
  exit 1
fi

ROOT="$(cd "$(dirname "$0")/.." && pwd)"
install -m 0755 "${ROOT}/scripts/anythingllm-docker-exec.sh" /usr/local/sbin/anythingllm-docker-exec.sh
install -m 0644 "${ROOT}/systemd/anythingllm.service" /etc/systemd/system/anythingllm.service
if [ ! -f /etc/default/anythingllm ]; then
  install -m 0644 "${ROOT}/systemd/anythingllm.default" /etc/default/anythingllm
fi

systemctl daemon-reload
systemctl enable ollama.service
systemctl enable anythingllm.service
systemctl restart ollama.service
systemctl restart anythingllm.service

echo "Status ollama: $(systemctl is-active ollama)"
echo "Status anythingllm: $(systemctl is-active anythingllm)"
services/agent-regex-search-api/.gitignore (vendored, Normal file, 2 lines)
@@ -0,0 +1,2 @@
dist/
node_modules/
services/agent-regex-search-api/README.md (Normal file, 49 lines)
@@ -0,0 +1,49 @@
# agent-regex-search-api

Local HTTP API on **`127.0.0.1`** for **regex search over files** using [ripgrep](https://github.com/BurntSushi/ripgrep) (`rg`). Results are returned as structured JSON.

This is **not** the closed-source "instant grep" index described in Cursor's article ([Fast regex search](https://cursor.com/fr/blog/fast-regex-search)); it is a **local, open** approach (ripgrep) with the same high-level goal: fast agent-oriented code search. For monorepos at extreme scale, consider adding **Zoekt** or another indexed backend later (see the feature doc).

## Prerequisites

- `rg` available in `PATH` (e.g. `sudo apt install ripgrep` on Debian/Ubuntu).

## Environment

| Variable | Required | Description |
|----------|----------|-------------|
| `REGEX_SEARCH_TOKEN` | yes | `Authorization: Bearer <token>` on every request except `GET /health`. |
| `REGEX_SEARCH_ROOT` | no | Absolute base directory searches are confined to (default `/home/ncantu/code`). |
| `REGEX_SEARCH_HOST` | no | Bind address (default `127.0.0.1`). |
| `REGEX_SEARCH_PORT` | no | Port (default `37143`). |

## Endpoints

- `GET /health` — liveness; includes the configured `root` path.
- `POST /search` — JSON body:
  - `pattern` (string, required): Rust regex passed to ripgrep.
  - `subpath` (string, optional): path **relative** to `REGEX_SEARCH_ROOT` (no `..`, no absolute paths).
  - `maxMatches` (number, optional): cap on matches (default `500`, max `50000`).
  - `timeoutMs` (number, optional): kill `rg` after this many ms (default `60000`, max `300000`).

Response: `{ root, target, matches: [{ path, lineNumber, line }], truncated, exitCode }`.

Ripgrep exit code `1` means "no matches" and is still returned as **200** with an empty `matches` array when no other error occurred.
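A small client-side sketch for building and pre-validating the `POST /search` body (the helper is illustrative; the field names and limits are the ones listed above):

```javascript
// Build a JSON body for POST /search, rejecting values the API would refuse.
const buildSearchBody = ({ pattern, subpath, maxMatches = 500, timeoutMs = 60000 }) => {
  if (typeof pattern !== "string" || pattern.length === 0) {
    throw new Error("pattern is required");
  }
  if (subpath && (subpath.includes("..") || subpath.startsWith("/"))) {
    throw new Error("subpath must be relative to REGEX_SEARCH_ROOT, without ..");
  }
  if (maxMatches > 50000 || timeoutMs > 300000) {
    throw new Error("maxMatches/timeoutMs above API maximums");
  }
  return JSON.stringify({ pattern, subpath, maxMatches, timeoutMs });
};

console.log(buildSearchBody({ pattern: "fn\\s+main", subpath: "smart_ide" }));
```

Validating `subpath` on the client mirrors the server's own confinement checks and gives earlier, clearer errors.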

## Run

```bash
npm install
npm run build
export REGEX_SEARCH_TOKEN='…'
npm start
```

## Risks

- **ReDoS**: pathological regexes can burn CPU until `timeoutMs` expires. Keep timeouts conservative on shared hosts.
- **Scope**: every readable file under `target` that ripgrep traverses may be searched; align `REGEX_SEARCH_ROOT` with policy.
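The confinement boundary in `src/paths.ts` is a prefix check on resolved paths (the real service additionally resolves symlinks with `realpathSync`); a simplified sketch:

```typescript
import * as path from "node:path";

// True when candidate is root itself or strictly below it after resolution.
const isUnderRoot = (candidate: string, root: string): boolean => {
  const c = path.resolve(candidate);
  const r = path.resolve(root);
  return c === r || c.startsWith(r.endsWith(path.sep) ? r : r + path.sep);
};

console.log(isUnderRoot("/srv/code/app", "/srv/code"));    // true
console.log(isUnderRoot("/srv/code-other", "/srv/code"));  // false: "-other" is a sibling
console.log(isUnderRoot("/srv/code/../etc", "/srv/code")); // false: ".." resolved away
```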

## License

MIT.
51
services/agent-regex-search-api/package-lock.json
generated
Normal file
@@ -0,0 +1,51 @@
{
  "name": "@4nk/agent-regex-search-api",
  "version": "0.1.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "@4nk/agent-regex-search-api",
      "version": "0.1.0",
      "license": "MIT",
      "devDependencies": {
        "@types/node": "^20.11.0",
        "typescript": "^5.3.3"
      },
      "engines": {
        "node": ">=20"
      }
    },
    "node_modules/@types/node": {
      "version": "20.19.39",
      "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.39.tgz",
      "integrity": "sha512-orrrD74MBUyK8jOAD/r0+lfa1I2MO6I+vAkmAWzMYbCcgrN4lCrmK52gRFQq/JRxfYPfonkr4b0jcY7Olqdqbw==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
        "undici-types": "~6.21.0"
      }
    },
    "node_modules/typescript": {
      "version": "5.9.3",
      "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz",
      "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
      "dev": true,
      "license": "Apache-2.0",
      "bin": {
        "tsc": "bin/tsc",
        "tsserver": "bin/tsserver"
      },
      "engines": {
        "node": ">=14.17"
      }
    },
    "node_modules/undici-types": {
      "version": "6.21.0",
      "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
      "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==",
      "dev": true,
      "license": "MIT"
    }
  }
}
20
services/agent-regex-search-api/package.json
Normal file
@@ -0,0 +1,20 @@
{
  "name": "@4nk/agent-regex-search-api",
  "version": "0.1.0",
  "private": true,
  "description": "Local HTTP API: ripgrep-backed regex search under REGEX_SEARCH_ROOT.",
  "license": "MIT",
  "type": "module",
  "main": "dist/server.js",
  "scripts": {
    "build": "tsc -p ./",
    "start": "node dist/server.js"
  },
  "engines": {
    "node": ">=20"
  },
  "devDependencies": {
    "@types/node": "^20.11.0",
    "typescript": "^5.3.3"
  }
}
21
services/agent-regex-search-api/src/auth.ts
Normal file
@@ -0,0 +1,21 @@
import type { IncomingMessage, ServerResponse } from "node:http";

export const readExpectedToken = (): string => {
  return process.env.REGEX_SEARCH_TOKEN?.trim() ?? "";
};

export const requireBearer = (
  req: IncomingMessage,
  res: ServerResponse,
  expected: string,
): boolean => {
  const h = req.headers.authorization ?? "";
  const match = /^Bearer\s+(.+)$/i.exec(h);
  const got = match?.[1]?.trim() ?? "";
  if (got !== expected) {
    res.writeHead(401, { "Content-Type": "application/json; charset=utf-8" });
    res.end(JSON.stringify({ error: "Unauthorized" }));
    return false;
  }
  return true;
};
25
services/agent-regex-search-api/src/httpUtil.ts
Normal file
@@ -0,0 +1,25 @@
import type { IncomingMessage } from "node:http";

const MAX_BODY = 1_048_576;

export const readJsonBody = async (req: IncomingMessage): Promise<unknown> => {
  const chunks: Buffer[] = [];
  let total = 0;
  for await (const chunk of req) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    total += buf.length;
    if (total > MAX_BODY) {
      throw new Error("Request body too large");
    }
    chunks.push(buf);
  }
  const raw = Buffer.concat(chunks).toString("utf8").trim();
  if (raw.length === 0) {
    return {};
  }
  try {
    return JSON.parse(raw) as unknown;
  } catch (cause) {
    throw new Error("Invalid JSON body", { cause });
  }
};
49
services/agent-regex-search-api/src/paths.ts
Normal file
@@ -0,0 +1,49 @@
import * as fs from "node:fs";
import * as path from "node:path";

export const getAllowedRoot = (): string => {
  const raw = process.env.REGEX_SEARCH_ROOT?.trim();
  return path.resolve(raw && raw.length > 0 ? raw : "/home/ncantu/code");
};

const ensureUnderRoot = (candidate: string, root: string): void => {
  const sep = path.sep;
  const prefix = root.endsWith(sep) ? root : root + sep;
  if (candidate !== root && !candidate.startsWith(prefix)) {
    throw new Error("Path escapes allowed root");
  }
};

/**
 * Resolves a path under REGEX_SEARCH_ROOT. `subpath` must be relative or empty (search entire root).
 */
export const resolveSearchTarget = (subpath: string | undefined): string => {
  const root = getAllowedRoot();
  let realRoot: string;
  try {
    realRoot = fs.realpathSync(root);
  } catch {
    throw new Error("REGEX_SEARCH_ROOT does not exist or is unreachable");
  }

  if (subpath === undefined || subpath.trim() === "" || subpath === ".") {
    return realRoot;
  }
  const s = subpath.trim();
  if (path.isAbsolute(s)) {
    throw new Error("subpath must be relative to REGEX_SEARCH_ROOT");
  }
  if (s.includes("..")) {
    throw new Error("subpath must not contain '..'");
  }

  const joined = path.resolve(realRoot, s);
  let realJoined: string;
  try {
    realJoined = fs.realpathSync(joined);
  } catch {
    throw new Error("subpath does not exist or is unreachable");
  }
  ensureUnderRoot(realJoined, realRoot);
  return realJoined;
};
185
services/agent-regex-search-api/src/rg.ts
Normal file
@@ -0,0 +1,185 @@
import { spawn } from "node:child_process";

const MAX_PATTERN_LEN = 8192;

export interface RgMatchRow {
  path: string;
  lineNumber: number;
  line: string;
}

export interface RgResult {
  matches: RgMatchRow[];
  truncated: boolean;
  exitCode: number;
  stderr: string;
}

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

const readPathText = (v: unknown): string => {
  if (!isRecord(v)) {
    return "";
  }
  const t = v.text;
  return typeof t === "string" ? t : "";
};

const readLinesText = (v: unknown): string => {
  if (!isRecord(v)) {
    return "";
  }
  const t = v.lines;
  if (!isRecord(t)) {
    return "";
  }
  const text = t.text;
  return typeof text === "string" ? text : "";
};

export const runRipgrepJson = (
  pattern: string,
  searchPath: string,
  maxMatches: number,
  timeoutMs: number,
): Promise<RgResult> => {
  if (pattern.length === 0) {
    return Promise.resolve({
      matches: [],
      truncated: false,
      exitCode: 2,
      stderr: "Empty pattern",
    });
  }
  if (pattern.length > MAX_PATTERN_LEN) {
    return Promise.resolve({
      matches: [],
      truncated: false,
      exitCode: 2,
      stderr: "Pattern too long",
    });
  }

  return new Promise((resolve) => {
    const matches: RgMatchRow[] = [];
    let truncated = false;
    let stderr = "";
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    const finish = (r: RgResult): void => {
      if (settled) {
        return;
      }
      settled = true;
      if (timer !== undefined) {
        clearTimeout(timer);
      }
      resolve(r);
    };

    const child = spawn(
      "rg",
      [
        "--json",
        "--line-number",
        "--regexp",
        pattern,
        "--",
        searchPath,
      ],
      {
        stdio: ["ignore", "pipe", "pipe"],
        windowsHide: true,
      },
    );

    timer = setTimeout(() => {
      child.kill("SIGKILL");
      finish({
        matches,
        truncated: true,
        exitCode: 124,
        stderr: `${stderr}\nTimed out`.trim(),
      });
    }, timeoutMs);

    child.stderr?.on("data", (chunk: Buffer) => {
      stderr += chunk.toString("utf8");
    });

    let buf = "";
    child.stdout?.setEncoding("utf8");
    child.stdout?.on("data", (chunk: string) => {
      buf += chunk;
      let idx: number;
      while ((idx = buf.indexOf("\n")) >= 0) {
        const line = buf.slice(0, idx);
        buf = buf.slice(idx + 1);
        if (line.length === 0) {
          continue;
        }
        let obj: unknown;
        try {
          obj = JSON.parse(line) as unknown;
        } catch {
          continue;
        }
        if (!isRecord(obj) || obj.type !== "match") {
          continue;
        }
        const data = obj.data;
        if (!isRecord(data)) {
          continue;
        }
        const pathText = readPathText(data.path);
        const lineNum = data.line_number;
        const lineText = readLinesText(data).replace(/\n$/, "");
        if (typeof lineNum !== "number" || pathText.length === 0) {
          continue;
        }
        matches.push({
          path: pathText,
          lineNumber: lineNum,
          line: lineText,
        });
        if (matches.length >= maxMatches) {
          truncated = true;
          child.kill("SIGTERM");
          break;
        }
      }
    });

    child.on("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "ENOENT") {
        finish({
          matches: [],
          truncated: false,
          exitCode: 127,
          stderr: "ripgrep (rg) not found in PATH",
        });
        return;
      }
      finish({
        matches,
        truncated,
        exitCode: 1,
        stderr: `${stderr}\n${err.message}`.trim(),
      });
    });

    child.on("close", (code) => {
      finish({
        matches,
        truncated,
        exitCode: code ?? 0,
        stderr: stderr.trim(),
      });
    });
  });
};
111
services/agent-regex-search-api/src/server.ts
Normal file
@@ -0,0 +1,111 @@
import * as http from "node:http";
import { readExpectedToken, requireBearer } from "./auth.js";
import { readJsonBody } from "./httpUtil.js";
import { getAllowedRoot, resolveSearchTarget } from "./paths.js";
import { runRipgrepJson } from "./rg.js";

const HOST = process.env.REGEX_SEARCH_HOST ?? "127.0.0.1";
const PORT = Number(process.env.REGEX_SEARCH_PORT ?? "37143");
const DEFAULT_MAX = 500;
const DEFAULT_TIMEOUT_MS = 60_000;

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

const json = (
  res: http.ServerResponse,
  status: number,
  body: unknown,
): void => {
  res.writeHead(status, { "Content-Type": "application/json; charset=utf-8" });
  res.end(JSON.stringify(body));
};

const main = (): void => {
  const token = readExpectedToken();
  if (token.length === 0) {
    console.error("agent-regex-search-api: set REGEX_SEARCH_TOKEN (non-empty secret).");
    process.exit(1);
  }

  const server = http.createServer((req, res) => {
    void (async () => {
      try {
        if (req.method === "GET" && (req.url === "/health" || req.url === "/health/")) {
          json(res, 200, { status: "ok", root: getAllowedRoot() });
          return;
        }
        if (!requireBearer(req, res, token)) {
          return;
        }
        if (req.method !== "POST" || req.url !== "/search") {
          json(res, 404, { error: "Not found" });
          return;
        }

        const body = await readJsonBody(req);
        if (!isRecord(body)) {
          json(res, 400, { error: "Expected JSON object" });
          return;
        }
        const pattern = body.pattern;
        if (typeof pattern !== "string") {
          json(res, 400, { error: "Missing pattern (string)" });
          return;
        }
        const subpath =
          typeof body.subpath === "string" ? body.subpath : undefined;
        const maxRaw = body.maxMatches;
        const maxMatches =
          typeof maxRaw === "number" && Number.isFinite(maxRaw) && maxRaw > 0
            ? Math.min(Math.floor(maxRaw), 50_000)
            : DEFAULT_MAX;
        const timeoutRaw = body.timeoutMs;
        const timeoutMs =
          typeof timeoutRaw === "number" && Number.isFinite(timeoutRaw) && timeoutRaw > 0
            ? Math.min(Math.floor(timeoutRaw), 300_000)
            : DEFAULT_TIMEOUT_MS;

        const target = resolveSearchTarget(subpath);
        const out = await runRipgrepJson(pattern, target, maxMatches, timeoutMs);

        if (out.exitCode === 127) {
          json(res, 503, {
            error: out.stderr,
            matches: [],
            truncated: false,
          });
          return;
        }
        if (out.exitCode === 2) {
          json(res, 400, {
            error: out.stderr || "ripgrep error (invalid pattern or IO)",
            matches: out.matches,
            truncated: out.truncated,
            exitCode: out.exitCode,
          });
          return;
        }

        json(res, 200, {
          root: getAllowedRoot(),
          target,
          matches: out.matches,
          truncated: out.truncated,
          exitCode: out.exitCode,
        });
      } catch (e) {
        const msg = e instanceof Error ? e.message : String(e);
        json(res, 400, { error: msg });
      }
    })();
  });

  server.listen(PORT, HOST, () => {
    console.error(
      `agent-regex-search-api listening on http://${HOST}:${PORT} (root=${getAllowedRoot()})`,
    );
  });
};

main();
16
services/agent-regex-search-api/tsconfig.json
Normal file
@@ -0,0 +1,16 @@
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "dist",
    "rootDir": "src",
    "strict": true,
    "skipLibCheck": true,
    "noImplicitReturns": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "declaration": false
  },
  "include": ["src/**/*.ts"]
}
48
services/claw-harness-api/README.md
Normal file
@@ -0,0 +1,48 @@
# claw-harness-api

Integration notes and a **thin local proxy** for the **claw-code** harness (multi-model agent runtime). Upstream sources:

- Mirror listing: [gitlawb — claw-code](https://gitlawb.com/node/repos/z6Mks1jg/claw-code)
- GitHub (often used for clones): [instructkr/claw-code](https://github.com/instructkr/claw-code)

This folder does **not** vendor claw-code. Clone upstream next to this repo or under a path you control, then build and run according to the upstream `README.md` (Rust workspace under `rust/` with `cargo build --release`, and/or Python `src/` tooling, depending on the branch).

## Policy: no Anthropic in templates

The file [`providers.example.yaml`](./providers.example.yaml) lists **Ollama** and optional OpenAI-compatible / Gemini-style placeholders. **Anthropic is set to `enabled: false`.** Operational enforcement (firewall, absent `ANTHROPIC_API_KEY`, etc.) remains your responsibility on the host.

## Upstream build (summary)

```bash
git clone https://github.com/instructkr/claw-code.git
cd claw-code/rust
cargo build --release
```

Exact binaries, subcommands, and HTTP server flags depend on the cloned revision; read the upstream `README.md` and any `rust/crates/*/README` files if present.

## Local proxy (`proxy/`)

To align with the other smart_ide services (Bearer token, fixed bind address), a small Node proxy can forward HTTP to the upstream claw HTTP server.

| Variable | Required | Description |
|----------|----------|-------------|
| `CLAW_PROXY_TOKEN` | yes | `Authorization: Bearer <token>` on client calls to the proxy. |
| `CLAW_PROXY_HOST` | no | Bind address (default `127.0.0.1`). |
| `CLAW_PROXY_PORT` | no | Proxy listen port (default `37142`). |
| `CLAW_UPSTREAM_URL` | yes | Base URL of the claw HTTP server (e.g. `http://127.0.0.1:37143`). |

```bash
cd proxy
npm install
npm run build
export CLAW_PROXY_TOKEN='…'
export CLAW_UPSTREAM_URL='http://127.0.0.1:37143'
npm start
```

The proxy forwards method, path, query, and body; it does not modify Anthropic or other provider traffic beyond what the upstream server already does.
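Forwarding strips hop-by-hop headers and `Host` before contacting the upstream (as in `proxy/src/server.ts`); a self-contained sketch of that filter (function name illustrative):

```typescript
// RFC 7230 hop-by-hop headers are meaningful per connection and must not be forwarded.
const HOP_BY_HOP = new Set([
  "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
  "te", "trailers", "transfer-encoding", "upgrade",
]);

// Keeps only end-to-end headers; Host is rebuilt by the upstream request itself.
const forwardable = (headers: Record<string, string>): Record<string, string> => {
  const out: Record<string, string> = {};
  for (const [k, v] of Object.entries(headers)) {
    const key = k.toLowerCase();
    if (HOP_BY_HOP.has(key) || key === "host") continue;
    out[k] = v;
  }
  return out;
};

console.log(forwardable({ Host: "api.local", Connection: "close", Accept: "application/json" }));
// only Accept survives
```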

## License

Files in this directory (README, YAML example, proxy) are MIT unless noted otherwise. claw-code is a third-party project with its own license.
22
services/claw-harness-api/providers.example.yaml
Normal file
@@ -0,0 +1,22 @@
# Example provider policy for smart_ide + claw-code upstream.
# Copy to your claw runtime config location after reading the upstream docs.
# Anthropic is intentionally disabled: no api.anthropic.com, no Claude API keys in this template.

providers:
  ollama:
    enabled: true
    base_url: "http://127.0.0.1:11434"
    # default_model: set per your pulled tags (e.g. qwen2.5, gemma2)

  openai_compatible:
    enabled: false
    base_url: "http://127.0.0.1:8080/v1"
    # api_key: set via a secret manager or an env var referenced by the upstream claw config

  google_gemini:
    enabled: false
    # Use upstream env / config names for API keys; do not commit secrets.

  anthropic:
    enabled: false
    # Explicitly disabled in this repository template.
2
services/claw-harness-api/proxy/.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
dist/
node_modules/
51
services/claw-harness-api/proxy/package-lock.json
generated
Normal file
@@ -0,0 +1,51 @@
{
  "name": "@4nk/claw-harness-proxy",
  "version": "0.1.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "@4nk/claw-harness-proxy",
      "version": "0.1.0",
      "license": "MIT",
      "devDependencies": {
        "@types/node": "^20.11.0",
        "typescript": "^5.3.3"
      },
      "engines": {
        "node": ">=20"
      }
    },
    "node_modules/@types/node": {
      "version": "20.19.39",
      "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.39.tgz",
      "integrity": "sha512-orrrD74MBUyK8jOAD/r0+lfa1I2MO6I+vAkmAWzMYbCcgrN4lCrmK52gRFQq/JRxfYPfonkr4b0jcY7Olqdqbw==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
        "undici-types": "~6.21.0"
      }
    },
    "node_modules/typescript": {
      "version": "5.9.3",
      "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz",
      "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
      "dev": true,
      "license": "Apache-2.0",
      "bin": {
        "tsc": "bin/tsc",
        "tsserver": "bin/tsserver"
      },
      "engines": {
        "node": ">=14.17"
      }
    },
    "node_modules/undici-types": {
      "version": "6.21.0",
      "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
      "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==",
      "dev": true,
      "license": "MIT"
    }
  }
}
20
services/claw-harness-api/proxy/package.json
Normal file
@@ -0,0 +1,20 @@
{
  "name": "@4nk/claw-harness-proxy",
  "version": "0.1.0",
  "private": true,
  "description": "Bearer-gated HTTP forwarder to the upstream claw-code HTTP server.",
  "license": "MIT",
  "type": "module",
  "main": "dist/server.js",
  "scripts": {
    "build": "tsc -p ./",
    "start": "node dist/server.js"
  },
  "engines": {
    "node": ">=20"
  },
  "devDependencies": {
    "@types/node": "^20.11.0",
    "typescript": "^5.3.3"
  }
}
21
services/claw-harness-api/proxy/src/auth.ts
Normal file
@@ -0,0 +1,21 @@
import type { IncomingMessage, ServerResponse } from "node:http";

export const readExpectedToken = (): string => {
  return process.env.CLAW_PROXY_TOKEN?.trim() ?? "";
};

export const requireBearer = (
  req: IncomingMessage,
  res: ServerResponse,
  expected: string,
): boolean => {
  const h = req.headers.authorization ?? "";
  const match = /^Bearer\s+(.+)$/i.exec(h);
  const got = match?.[1]?.trim() ?? "";
  if (got !== expected) {
    res.writeHead(401, { "Content-Type": "application/json; charset=utf-8" });
    res.end(JSON.stringify({ error: "Unauthorized" }));
    return false;
  }
  return true;
};
120
services/claw-harness-api/proxy/src/server.ts
Normal file
@@ -0,0 +1,120 @@
import * as http from "node:http";
import * as https from "node:https";
import { URL } from "node:url";
import { readExpectedToken, requireBearer } from "./auth.js";

const HOP_BY_HOP = new Set([
  "connection",
  "keep-alive",
  "proxy-authenticate",
  "proxy-authorization",
  "te",
  "trailers",
  "transfer-encoding",
  "upgrade",
]);

const readUpstreamBase = (): URL => {
  const raw = process.env.CLAW_UPSTREAM_URL?.trim() ?? "";
  if (raw.length === 0) {
    throw new Error("CLAW_UPSTREAM_URL is required");
  }
  return new URL(raw.endsWith("/") ? raw.slice(0, -1) : raw);
};

const forwardHeaders = (
  req: http.IncomingMessage,
): http.OutgoingHttpHeaders => {
  const out: http.OutgoingHttpHeaders = {};
  for (const [k, v] of Object.entries(req.headers)) {
    if (v === undefined) {
      continue;
    }
    const key = k.toLowerCase();
    if (HOP_BY_HOP.has(key) || key === "host") {
      continue;
    }
    out[k] = v;
  }
  return out;
};

const HOST = process.env.CLAW_PROXY_HOST ?? "127.0.0.1";
const PORT = Number(process.env.CLAW_PROXY_PORT ?? "37142");

const main = (): void => {
  const token = readExpectedToken();
  if (token.length === 0) {
    console.error("claw-harness-proxy: set CLAW_PROXY_TOKEN (non-empty secret).");
    process.exit(1);
  }

  let upstreamBase: URL;
  try {
    upstreamBase = readUpstreamBase();
  } catch (e) {
    const msg = e instanceof Error ? e.message : String(e);
    console.error(`claw-harness-proxy: ${msg}`);
    process.exit(1);
  }

  const server = http.createServer((req, res) => {
    void (async () => {
      try {
        if (req.method === "GET" && (req.url === "/health" || req.url === "/health/")) {
          res.writeHead(200, { "Content-Type": "application/json; charset=utf-8" });
          res.end(JSON.stringify({ status: "ok" }));
          return;
        }
        if (!requireBearer(req, res, token)) {
          return;
        }

        const urlPath = req.url ?? "/";
        const target = new URL(urlPath, `${upstreamBase.origin}/`);

        const isHttps = target.protocol === "https:";
        const lib = isHttps ? https : http;
        const defaultPort = isHttps ? 443 : 80;
        const port =
          target.port !== "" ? Number(target.port) : defaultPort;

        const headers = forwardHeaders(req);

        const preq = lib.request(
          {
            hostname: target.hostname,
            port,
            path: `${target.pathname}${target.search}`,
            method: req.method,
            headers,
          },
          (pres) => {
            const ph = { ...pres.headers };
            res.writeHead(pres.statusCode ?? 502, ph);
            pres.pipe(res);
          },
        );

        preq.on("error", (err) => {
          res.writeHead(502, { "Content-Type": "application/json; charset=utf-8" });
          res.end(JSON.stringify({ error: err.message }));
        });

        req.pipe(preq);
      } catch (e) {
        const msg = e instanceof Error ? e.message : String(e);
        res.writeHead(400, { "Content-Type": "application/json; charset=utf-8" });
        res.end(JSON.stringify({ error: msg }));
      }
    })();
  });

  server.listen(PORT, HOST, () => {
    console.error(
      `claw-harness-proxy listening on http://${HOST}:${PORT} -> ${upstreamBase.origin}`,
    );
  });
};

main();
16
services/claw-harness-api/proxy/tsconfig.json
Normal file
@@ -0,0 +1,16 @@
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "dist",
    "rootDir": "src",
    "strict": true,
    "skipLibCheck": true,
    "noImplicitReturns": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "declaration": false
  },
  "include": ["src/**/*.ts"]
}
2
services/ia-dev-gateway/.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
node_modules/
dist/
23
services/ia-dev-gateway/README.md
Normal file
@@ -0,0 +1,23 @@
# ia-dev-gateway

HTTP API for the **ia_dev** checkout: lists agents from `.cursor/agents/*.md`, accepts `POST /v1/runs` (stub completion), and streams SSE on `/v1/runs/:id/events`. Wire it to the real deploy/agent scripts later.

## Build / run

```bash
npm install
npm run build
export IA_DEV_GATEWAY_TOKEN='your-secret'
# optional: IA_DEV_ROOT=/path/to/ia_dev (default: ../../ia_dev from the monorepo)
npm start
```

Default bind: `127.0.0.1:37144`.
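Agent discovery amounts to scanning `.cursor/agents/*.md` under the ia_dev checkout; a sketch under that layout assumption (helper name illustrative, the gateway's actual scan lives in its source):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Returns agent ids (file basenames without ".md") found under <root>/.cursor/agents.
const listAgents = (iaDevRoot: string): string[] => {
  const dir = path.join(iaDevRoot, ".cursor", "agents");
  let entries: string[];
  try {
    entries = fs.readdirSync(dir);
  } catch {
    return []; // missing checkout: empty registry rather than a crash
  }
  return entries
    .filter((name) => name.endsWith(".md"))
    .map((name) => name.slice(0, -3))
    .sort();
};
```

A missing or unreadable `ia_dev` checkout yields an empty list, which keeps `GET` on the registry cheap and non-fatal.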

## Contract

See [docs/API/ia-dev-gateway.md](../../docs/API/ia-dev-gateway.md) and [docs/features/ia-dev-service.md](../../docs/features/ia-dev-service.md).

## License

MIT
51
services/ia-dev-gateway/package-lock.json
generated
Normal file
@@ -0,0 +1,51 @@
{
  "name": "@4nk/ia-dev-gateway",
  "version": "0.1.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "@4nk/ia-dev-gateway",
      "version": "0.1.0",
      "license": "MIT",
      "devDependencies": {
        "@types/node": "^20.11.0",
        "typescript": "^5.3.3"
      },
      "engines": {
        "node": ">=20"
      }
    },
    "node_modules/@types/node": {
      "version": "20.19.39",
      "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.39.tgz",
      "integrity": "sha512-orrrD74MBUyK8jOAD/r0+lfa1I2MO6I+vAkmAWzMYbCcgrN4lCrmK52gRFQq/JRxfYPfonkr4b0jcY7Olqdqbw==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
        "undici-types": "~6.21.0"
      }
    },
    "node_modules/typescript": {
      "version": "5.9.3",
      "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz",
      "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
      "dev": true,
      "license": "Apache-2.0",
      "bin": {
        "tsc": "bin/tsc",
        "tsserver": "bin/tsserver"
      },
      "engines": {
        "node": ">=14.17"
      }
    },
    "node_modules/undici-types": {
      "version": "6.21.0",
      "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
      "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==",
      "dev": true,
      "license": "MIT"
    }
  }
}
20
services/ia-dev-gateway/package.json
Normal file
@@ -0,0 +1,20 @@
{
  "name": "@4nk/ia-dev-gateway",
  "version": "0.1.0",
  "private": true,
  "description": "HTTP API for ia_dev: agent registry scan, runs, SSE (stub runner).",
  "license": "MIT",
  "type": "module",
  "main": "dist/server.js",
  "scripts": {
    "build": "tsc -p ./",
    "start": "node dist/server.js"
  },
  "engines": {
    "node": ">=20"
  },
  "devDependencies": {
    "@types/node": "^20.11.0",
    "typescript": "^5.3.3"
  }
}
21
services/ia-dev-gateway/src/auth.ts
Normal file
@@ -0,0 +1,21 @@
import type { IncomingMessage, ServerResponse } from "node:http";

export const readExpectedToken = (): string => {
  return process.env.IA_DEV_GATEWAY_TOKEN?.trim() ?? "";
};

export const requireBearer = (
  req: IncomingMessage,
  res: ServerResponse,
  expected: string,
): boolean => {
  const h = req.headers.authorization ?? "";
  const match = /^Bearer\s+(.+)$/i.exec(h);
  const got = match?.[1]?.trim() ?? "";
  if (got !== expected) {
    res.writeHead(401, { "Content-Type": "application/json; charset=utf-8" });
    res.end(JSON.stringify({ error: "Unauthorized" }));
    return false;
  }
  return true;
};
25
services/ia-dev-gateway/src/httpUtil.ts
Normal file
@ -0,0 +1,25 @@
import type { IncomingMessage } from "node:http";

const MAX_BODY = 1_048_576;

export const readJsonBody = async (req: IncomingMessage): Promise<unknown> => {
  const chunks: Buffer[] = [];
  let total = 0;
  for await (const chunk of req) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    total += buf.length;
    if (total > MAX_BODY) {
      throw new Error("Request body too large");
    }
    chunks.push(buf);
  }
  const raw = Buffer.concat(chunks).toString("utf8").trim();
  if (raw.length === 0) {
    return {};
  }
  try {
    return JSON.parse(raw) as unknown;
  } catch (cause) {
    throw new Error("Invalid JSON body", { cause });
  }
};
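`readJsonBody` enforces its 1 MiB cap while streaming, before anything is parsed. A Python sketch of the same bounded-read-then-parse pattern, assuming an in-memory byte stream in place of the Node request:

```python
import io
import json

MAX_BODY = 1_048_576  # same 1 MiB cap as the gateway

def read_json_body(stream: io.BufferedIOBase, max_body: int = MAX_BODY) -> object:
    """Read a body in chunks, enforcing the size cap before buffering more,
    then parse JSON; an empty body yields {} like readJsonBody above."""
    chunks: list[bytes] = []
    total = 0
    while True:
        chunk = stream.read(64 * 1024)
        if not chunk:
            break
        total += len(chunk)
        if total > max_body:
            raise ValueError("Request body too large")
        chunks.append(chunk)
    raw = b"".join(chunks).decode("utf-8").strip()
    if not raw:
        return {}
    try:
        return json.loads(raw)
    except json.JSONDecodeError as cause:
        raise ValueError("Invalid JSON body") from cause

print(read_json_body(io.BytesIO(b'{"a": 1}')))  # {'a': 1}
print(read_json_body(io.BytesIO(b"   ")))       # {}
```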
28
services/ia-dev-gateway/src/paths.ts
Normal file
@ -0,0 +1,28 @@
import * as fs from "node:fs";
import * as path from "node:path";
import { fileURLToPath } from "node:url";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

/** Path to ia_dev checkout: IA_DEV_ROOT env or monorepo ./ia_dev */
export const getIaDevRoot = (): string => {
  const fromEnv = process.env.IA_DEV_ROOT?.trim();
  if (fromEnv && fromEnv.length > 0) {
    return path.resolve(fromEnv);
  }
  return path.resolve(__dirname, "..", "..", "..", "ia_dev");
};

export const agentsDir = (iaDevRoot: string): string =>
  path.join(iaDevRoot, ".cursor", "agents");

export const projectDir = (iaDevRoot: string, projectId: string): string =>
  path.join(iaDevRoot, "projects", projectId);

export const dirExists = (p: string): boolean => {
  try {
    return fs.statSync(p).isDirectory();
  } catch {
    return false;
  }
};
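Resolution order in `getIaDevRoot` is: explicit `IA_DEV_ROOT`, else three levels up from the compiled module directory (i.e. the monorepo's `./ia_dev`). A Python sketch of that lookup, with hypothetical paths:

```python
import os
from pathlib import Path

def get_ia_dev_root(module_dir: str) -> Path:
    """Mirror getIaDevRoot: the IA_DEV_ROOT env var wins; otherwise resolve
    ../../../ia_dev relative to the (compiled) module directory."""
    from_env = os.environ.get("IA_DEV_ROOT", "").strip()
    if from_env:
        return Path(from_env).resolve()
    return (Path(module_dir) / ".." / ".." / ".." / "ia_dev").resolve()

os.environ["IA_DEV_ROOT"] = "/tmp/ia_dev"
print(get_ia_dev_root("/opt/smart_ide/services/ia-dev-gateway/dist"))  # /tmp/ia_dev
os.environ.pop("IA_DEV_ROOT")
print(get_ia_dev_root("/opt/smart_ide/services/ia-dev-gateway/dist"))  # /opt/smart_ide/ia_dev
```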
216
services/ia-dev-gateway/src/server.ts
Normal file
@ -0,0 +1,216 @@
import * as crypto from "node:crypto";
import * as http from "node:http";
import * as fs from "node:fs";
import { readExpectedToken, requireBearer } from "./auth.js";
import { readJsonBody } from "./httpUtil.js";
import { agentsDir, dirExists, getIaDevRoot, projectDir } from "./paths.js";

const HOST = process.env.IA_DEV_GATEWAY_HOST ?? "127.0.0.1";
const PORT = Number(process.env.IA_DEV_GATEWAY_PORT ?? "37144");

type RunRecord = {
  runId: string;
  status: "queued" | "running" | "completed" | "failed";
  agentId: string;
  projectId: string;
  intent: string;
  startedAt: string;
  finishedAt?: string;
  exitCode?: number;
  summary?: string;
  error?: string;
};

const runs = new Map<string, RunRecord>();

const json = (res: http.ServerResponse, status: number, body: unknown): void => {
  res.writeHead(status, { "Content-Type": "application/json; charset=utf-8" });
  res.end(JSON.stringify(body));
};

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

const listAgents = (): { id: string; name: string; summary: string; triggerCommands: string[] }[] => {
  const root = getIaDevRoot();
  const dir = agentsDir(root);
  if (!dirExists(dir)) {
    return [];
  }
  const out: { id: string; name: string; summary: string; triggerCommands: string[] }[] = [];
  for (const ent of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!ent.isFile() || !ent.name.endsWith(".md")) {
      continue;
    }
    const id = ent.name.replace(/\.md$/i, "");
    out.push({
      id,
      name: id,
      summary: `Agent definition ${ent.name}`,
      triggerCommands: [],
    });
  }
  out.sort((a, b) => a.id.localeCompare(b.id));
  return out;
};

const agentDescriptor = (id: string): Record<string, unknown> | null => {
  const agents = listAgents();
  const found = agents.find((a) => a.id === id);
  if (!found) {
    return null;
  }
  return {
    id: found.id,
    name: found.name,
    role: "agent",
    inputs: {},
    outputs: {},
    rights: [],
    dependencies: [],
    scripts: [],
    risk: "unknown",
    compatibleEnvs: ["test", "pprod", "prod"],
  };
};

const main = (): void => {
  const token = readExpectedToken();
  if (token.length === 0) {
    console.error("ia-dev-gateway: set IA_DEV_GATEWAY_TOKEN (non-empty secret).");
    process.exit(1);
  }

  const server = http.createServer((req, res) => {
    void (async () => {
      try {
        const url = new URL(req.url ?? "/", `http://${HOST}`);
        const p = url.pathname;

        if (req.method === "GET" && (p === "/health" || p === "/health/")) {
          json(res, 200, { status: "ok" });
          return;
        }

        if (!requireBearer(req, res, token)) {
          return;
        }

        if (req.method === "GET" && p === "/v1/agents") {
          json(res, 200, { agents: listAgents() });
          return;
        }

        const agentMatch = /^\/v1\/agents\/([^/]+)\/?$/.exec(p);
        if (req.method === "GET" && agentMatch) {
          const desc = agentDescriptor(agentMatch[1]);
          if (!desc) {
            json(res, 404, { error: "Agent not found" });
            return;
          }
          json(res, 200, desc);
          return;
        }

        if (req.method === "POST" && p === "/v1/runs") {
          const body = await readJsonBody(req);
          if (!isRecord(body)) {
            json(res, 422, { error: "Expected JSON object" });
            return;
          }
          const agentId = body.agentId;
          const projectId = body.projectId;
          const intent = body.intent;
          if (typeof agentId !== "string" || agentId.length === 0) {
            json(res, 422, { error: "Missing agentId" });
            return;
          }
          if (typeof projectId !== "string" || projectId.length === 0) {
            json(res, 422, { error: "Missing projectId" });
            return;
          }
          if (typeof intent !== "string" || intent.length === 0) {
            json(res, 422, { error: "Missing intent" });
            return;
          }
          const iaRoot = getIaDevRoot();
          if (!dirExists(projectDir(iaRoot, projectId))) {
            json(res, 403, { error: "Project not found under IA_DEV_ROOT", projectId });
            return;
          }
          const runId = crypto.randomUUID();
          const startedAt = new Date().toISOString();
          const rec: RunRecord = {
            runId,
            status: "queued",
            agentId,
            projectId,
            intent,
            startedAt,
            summary: "Stub: runner not wired to ia_dev scripts",
          };
          runs.set(runId, rec);
          rec.status = "completed";
          rec.finishedAt = new Date().toISOString();
          rec.exitCode = 0;
          json(res, 200, { runId, status: rec.status });
          return;
        }

        const runGet = /^\/v1\/runs\/([^/]+)\/?$/.exec(p);
        if (req.method === "GET" && runGet) {
          const r = runs.get(runGet[1]);
          if (!r) {
            json(res, 404, { error: "Run not found" });
            return;
          }
          json(res, 200, {
            runId: r.runId,
            status: r.status,
            startedAt: r.startedAt,
            finishedAt: r.finishedAt,
            exitCode: r.exitCode,
            summary: r.summary,
            error: r.error,
          });
          return;
        }

        const runEvents = /^\/v1\/runs\/([^/]+)\/events\/?$/.exec(p);
        if (req.method === "GET" && runEvents) {
          const r = runs.get(runEvents[1]);
          if (!r) {
            json(res, 404, { error: "Run not found" });
            return;
          }
          res.writeHead(200, {
            "Content-Type": "text/event-stream; charset=utf-8",
            "Cache-Control": "no-cache",
            Connection: "keep-alive",
          });
          const send = (data: object): void => {
            res.write(`data: ${JSON.stringify(data)}\n\n`);
          };
          send({ type: "started", runId: r.runId });
          send({ type: "completed", runId: r.runId, exitCode: r.exitCode ?? 0 });
          res.end();
          return;
        }

        json(res, 404, { error: "Not found" });
      } catch (e) {
        const msg = e instanceof Error ? e.message : String(e);
        json(res, 400, { error: msg });
      }
    })();
  });

  server.listen(PORT, HOST, () => {
    const ia = getIaDevRoot();
    console.error(
      `ia-dev-gateway listening on http://${HOST}:${PORT} (IA_DEV_ROOT=${ia})`,
    );
  });
};

main();
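The `/v1/runs/{id}/events` endpoint frames each event as an SSE `data:` line carrying a JSON payload, terminated by a blank line. A minimal sketch of that framing:

```python
import json

def sse_frame(data: dict) -> str:
    """One Server-Sent-Events message as emitted by the events endpoint:
    'data: <json>' followed by the blank line that ends the event."""
    return f"data: {json.dumps(data)}\n\n"

# The stub runner emits exactly two events per run: started, then completed.
stream = sse_frame({"type": "started", "runId": "r1"}) + sse_frame(
    {"type": "completed", "runId": "r1", "exitCode": 0}
)
print(stream)
```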
16
services/ia-dev-gateway/tsconfig.json
Normal file
@ -0,0 +1,16 @@
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "dist",
    "rootDir": "src",
    "strict": true,
    "skipLibCheck": true,
    "noImplicitReturns": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "declaration": false
  },
  "include": ["src/**/*.ts"]
}
5
services/langextract-api/.gitignore
vendored
Normal file
@ -0,0 +1,5 @@
.venv/
__pycache__/
*.egg-info/
dist/
build/
48
services/langextract-api/README.md
Normal file
@ -0,0 +1,48 @@
# langextract-api

Local HTTP API on **`127.0.0.1`** wrapping [google/langextract](https://github.com/google/langextract): structured extractions from unstructured text with optional character grounding.

## Environment

| Variable | Required | Description |
|----------|----------|-------------|
| `LANGEXTRACT_SERVICE_TOKEN` | no | If set, every request must send `Authorization: Bearer <token>`. |
| `LANGEXTRACT_API_HOST` | no | Bind address (default `127.0.0.1`). |
| `LANGEXTRACT_API_PORT` | no | Port (default `37141`). |
| `LANGEXTRACT_API_KEY` | no | Used by LangExtract for cloud models (e.g. Gemini) when the client does not pass `api_key` in the JSON body. See upstream docs. |

## Endpoints

- `GET /health` — liveness.
- `POST /extract` — run extraction. JSON body matches [LangExtract](https://github.com/google/langextract) `extract()` parameters where applicable: `text`, `prompt_description`, `examples`, `model_id`, optional `model_url` (Ollama), `extraction_passes`, `max_workers`, `max_char_buffer`, `api_key`, `fence_output`, `use_schema_constraints`.

Example `examples` item:

```json
{
  "text": "ROMEO. But soft!",
  "extractions": [
    {
      "extraction_class": "character",
      "extraction_text": "ROMEO",
      "attributes": {}
    }
  ]
}
```

## Run

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
export LANGEXTRACT_SERVICE_TOKEN='…'
uvicorn app.main:app --host "${LANGEXTRACT_API_HOST:-127.0.0.1}" --port "${LANGEXTRACT_API_PORT:-37141}"
```

For Ollama-backed models, set `model_id` to your tag (e.g. `gemma2:2b`), `model_url` to `http://127.0.0.1:11434`, and typically `fence_output: false`, `use_schema_constraints: false` per upstream README.

## License

This wrapper is MIT. LangExtract is Apache-2.0 (see upstream repository).
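Putting the README's pieces together, a request body for `POST /extract` against a local Ollama model might look like the following sketch (values are illustrative; only the field names come from the endpoint description):

```python
import json

# Hypothetical /extract payload for an Ollama-backed model; per the README,
# fence_output and use_schema_constraints are typically disabled for Ollama.
payload = {
    "text": "ROMEO. But soft!",
    "prompt_description": "Extract character names from the play text.",
    "examples": [
        {
            "text": "ROMEO. But soft!",
            "extractions": [
                {"extraction_class": "character", "extraction_text": "ROMEO", "attributes": {}}
            ],
        }
    ],
    "model_id": "gemma2:2b",
    "model_url": "http://127.0.0.1:11434",
    "fence_output": False,
    "use_schema_constraints": False,
}
body = json.dumps(payload)
print(len(json.loads(body)["examples"]))  # 1
```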
0
services/langextract-api/app/__init__.py
Normal file
149
services/langextract-api/app/main.py
Normal file
@ -0,0 +1,149 @@
"""Local LangExtract HTTP API."""

from __future__ import annotations

import os
from typing import Any

import langextract as lx
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from pydantic import BaseModel, Field

app = FastAPI(title="langextract-api", version="0.1.0")
_bearer = HTTPBearer(auto_error=False)


def _expected_service_token() -> str:
    return os.environ.get("LANGEXTRACT_SERVICE_TOKEN", "").strip()


def verify_service_token(
    creds: HTTPAuthorizationCredentials | None = Depends(_bearer),
) -> None:
    expected = _expected_service_token()
    if not expected:
        return
    if creds is None:
        raise HTTPException(status_code=401, detail="Unauthorized")
    token = creds.credentials.strip()
    if token != expected:
        raise HTTPException(status_code=401, detail="Unauthorized")


class ExtractionIn(BaseModel):
    extraction_class: str
    extraction_text: str
    attributes: dict[str, Any] = Field(default_factory=dict)


class ExampleIn(BaseModel):
    text: str
    extractions: list[ExtractionIn]


class ExtractRequest(BaseModel):
    text: str
    prompt_description: str
    examples: list[ExampleIn]
    model_id: str
    model_url: str | None = None
    extraction_passes: int | None = None
    max_workers: int | None = None
    max_char_buffer: int | None = None
    api_key: str | None = Field(
        default=None,
        description="Optional Gemini / cloud key; else LANGEXTRACT_API_KEY from env.",
    )
    fence_output: bool | None = None
    use_schema_constraints: bool | None = None


def _normalize_attributes(
    attrs: dict[str, Any],
) -> dict[str, str | list[str]] | None:
    if not attrs:
        return None
    out: dict[str, str | list[str]] = {}
    for k, v in attrs.items():
        if isinstance(v, list):
            out[k] = [str(x) for x in v]
        else:
            out[k] = str(v)
    return out


def _examples_to_lx(examples: list[ExampleIn]) -> list[lx.data.ExampleData]:
    out: list[lx.data.ExampleData] = []
    for ex in examples:
        extractions = [
            lx.data.Extraction(
                extraction_class=e.extraction_class,
                extraction_text=e.extraction_text,
                attributes=_normalize_attributes(e.attributes),
            )
            for e in ex.extractions
        ]
        out.append(lx.data.ExampleData(text=ex.text, extractions=extractions))
    return out


def _extraction_to_dict(e: Any) -> dict[str, Any]:
    d: dict[str, Any] = {
        "extraction_class": getattr(e, "extraction_class", None),
        "extraction_text": getattr(e, "extraction_text", None),
        "attributes": dict(getattr(e, "attributes", {}) or {}),
    }
    interval = getattr(e, "char_interval", None)
    if interval is not None:
        start = getattr(interval, "start", None)
        end = getattr(interval, "end", None)
        if start is not None and end is not None:
            d["char_interval"] = {"start": start, "end": end}
    return d


def _document_to_dict(doc: Any) -> dict[str, Any]:
    extractions = getattr(doc, "extractions", None) or []
    return {
        "extractions": [_extraction_to_dict(x) for x in extractions],
    }


@app.get("/health")
def health() -> dict[str, str]:
    return {"status": "ok"}


@app.post("/extract", dependencies=[Depends(verify_service_token)])
def extract(req: ExtractRequest) -> dict[str, Any]:
    examples = _examples_to_lx(req.examples)
    kwargs: dict[str, Any] = {
        "text_or_documents": req.text,
        "prompt_description": req.prompt_description,
        "examples": examples,
        "model_id": req.model_id,
    }
    if req.model_url is not None:
        kwargs["model_url"] = req.model_url
    if req.extraction_passes is not None:
        kwargs["extraction_passes"] = req.extraction_passes
    if req.max_workers is not None:
        kwargs["max_workers"] = req.max_workers
    if req.max_char_buffer is not None:
        kwargs["max_char_buffer"] = req.max_char_buffer
    if req.api_key is not None:
        kwargs["api_key"] = req.api_key
    if req.fence_output is not None:
        kwargs["fence_output"] = req.fence_output
    if req.use_schema_constraints is not None:
        kwargs["use_schema_constraints"] = req.use_schema_constraints

    try:
        result = lx.extract(**kwargs)
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e)) from e

    if isinstance(result, list):
        return {"documents": [_document_to_dict(d) for d in result]}
    return {"documents": [_document_to_dict(result)]}
4
services/langextract-api/requirements.txt
Normal file
@ -0,0 +1,4 @@
fastapi>=0.115.0
uvicorn[standard]>=0.32.0
langextract>=1.0.0
pydantic>=2.0.0
10
services/local-office/.env.example
Normal file
@ -0,0 +1,10 @@
# API key for third-party apps (required; set in .secrets/<env>/ or env)
API_KEYS=key1,key2
# Storage path for document files (default: ./data/files)
STORAGE_PATH=./data/files
# SQLite DB path for metadata (default: ./data/local_office.db)
DATABASE_PATH=./data/local_office.db
# Max upload size in bytes (default: 20MB)
MAX_UPLOAD_BYTES=20971520
# Rate limit: requests per minute per API key (default: 60)
RATE_LIMIT_PER_MINUTE=60
18
services/local-office/.gitignore
vendored
Normal file
@ -0,0 +1,18 @@
# Data and secrets
data/
.env
.secrets/

# Python
__pycache__/
*.py[cod]
.venv/
venv/
*.egg-info/
.eggs/

# IDE
.idea/
.vscode/
*.swp
*~
42
services/local-office/README.md
Normal file
@ -0,0 +1,42 @@
# Local Office

**Monorepo integration**: this code comes from the former `git.4nkweb.com/4nk/local_office` repository, **merged into `smart_ide`** under **`services/local-office/`** (an HTTP service on the same footing as the other folders in `services/`). The remote repository can be deleted; the original Git history is not preserved under this path (plain file copy).

Project documentation: [docs/features/local-office.md](../docs/features/local-office.md) · [docs/services.md](../docs/services.md).

API for third-party applications to upload and edit Office documents (docx, xlsx, pptx) on this machine.

## Architecture

See [docs/architecture-proposal.md](docs/architecture-proposal.md).

## Run on this machine

1. Create a virtualenv and install dependencies:

   ```bash
   python3 -m venv .venv
   source .venv/bin/activate  # or .venv\Scripts\activate on Windows
   pip install -r requirements.txt
   ```

2. Set environment variables (no secrets in repo). Copy `.env.example` to `.env` and set at least `API_KEYS`. For a quick local run you can use `export API_KEYS=dev-key`.

3. Run the API:

   ```bash
   uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
   ```

4. Open http://localhost:8000/docs for Swagger UI.

## API (summary)

- **POST /documents** — Upload file (multipart). Header: `X-API-Key`. Request must send the correct `Content-Type` for the part (e.g. docx: `application/vnd.openxmlformats-officedocument.wordprocessingml.document`). Returns `document_id`.
- **GET /documents** — List documents for the API key.
- **GET /documents/{id}** — Metadata.
- **GET /documents/{id}/file** — Download file.
- **POST /documents/{id}/commands** — Apply commands (docx: `replaceText`, `insertParagraph`). Body: `{"commands": [{"type": "replaceText", "search": "foo", "replace": "bar"}]}`.
- **DELETE /documents/{id}** — Delete document and file.

All routes require header `X-API-Key` and are rate-limited per key.
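As a worked example, a body for `POST /documents/{id}/commands` combining both docx commands documented in the README (values are illustrative):

```python
import json

# Hypothetical commands payload: replace one string, then append a paragraph.
commands_body = {
    "commands": [
        {"type": "replaceText", "search": "foo", "replace": "bar"},
        {"type": "insertParagraph", "text": "Appendix", "position": "end"},
    ]
}
print(json.dumps(commands_body, indent=2))
```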
1
services/local-office/app/__init__.py
Normal file
@ -0,0 +1 @@
# local_office API application
1
services/local-office/app/api/__init__.py
Normal file
@ -0,0 +1 @@
# API routes
1
services/local-office/app/api/routes/__init__.py
Normal file
@ -0,0 +1 @@
# Document routes
156
services/local-office/app/api/routes/documents.py
Normal file
@ -0,0 +1,156 @@
"""Document routes: upload, get file, list, delete, commands. Auth and rate limit on each."""
import logging
from typing import Annotated

from fastapi import APIRouter, Depends, File, HTTPException, Request, UploadFile
from fastapi.responses import FileResponse

from app.api.schemas import CommandsRequest, commands_to_dicts
from app.auth import require_api_key
from app.config import get_max_upload_bytes
from app.engine.commands import apply_commands
from app.limiter import limiter, rate_limit_string
from app.storage import file_storage
from app.storage.file_storage import delete_document_file, read_document, write_document
from app.storage.metadata import (
    delete_document_metadata,
    generate_document_id,
    get_document,
    insert_document,
    list_documents,
    update_document_size,
)

logger = logging.getLogger(__name__)

router = APIRouter()
rate = rate_limit_string()

ALLOWED_MIME = {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def _check_owner(meta: dict | None, api_key_id: str, document_id: str) -> None:
    if meta is None:
        raise HTTPException(status_code=404, detail="Document not found")
    if meta.get("api_key_id") != api_key_id:
        raise HTTPException(status_code=404, detail="Document not found")


@router.post("", status_code=201, response_model=dict)
@limiter.limit(rate)
async def upload_document(
    request: Request,
    api_key_id: Annotated[str, Depends(require_api_key)],
    file: Annotated[UploadFile, File()],
) -> dict:
    """Upload an Office file. Returns document_id."""
    content = await file.read()
    if len(content) > get_max_upload_bytes():
        raise HTTPException(
            status_code=413,
            detail="File too large",
        )
    mime = (file.content_type or "").strip().split(";")[0]
    if mime not in ALLOWED_MIME:
        raise HTTPException(
            status_code=400,
            detail=f"Unsupported file type. Allowed: {sorted(ALLOWED_MIME)}",
        )
    name = file.filename or "document"
    document_id = generate_document_id()
    write_document(document_id, content)
    insert_document(
        document_id=document_id,
        api_key_id=api_key_id,
        name=name,
        mime_type=mime,
        size=len(content),
    )
    logger.info("Uploaded document %s for key %s", document_id, api_key_id)
    return {"document_id": document_id, "name": name, "mime_type": mime, "size": len(content)}


@router.get("")
@limiter.limit(rate)
async def list_docs(
    request: Request,
    api_key_id: Annotated[str, Depends(require_api_key)],
) -> list:
    """List documents for the authenticated API key."""
    return list_documents(api_key_id)


@router.get("/{document_id}")
@limiter.limit(rate)
async def get_metadata(
    request: Request,
    document_id: str,
    api_key_id: Annotated[str, Depends(require_api_key)],
) -> dict:
    """Get document metadata."""
    meta = get_document(document_id)
    _check_owner(meta, api_key_id, document_id)
    return meta


@router.get("/{document_id}/file")
@limiter.limit(rate)
async def download_file(
    request: Request,
    document_id: str,
    api_key_id: Annotated[str, Depends(require_api_key)],
):
    """Download document file."""
    meta = get_document(document_id)
    _check_owner(meta, api_key_id, document_id)
    path = file_storage.get_document_path(document_id)
    if not path.is_file():
        raise HTTPException(status_code=404, detail="Document file not found")
    return FileResponse(
        path,
        media_type=meta.get("mime_type", "application/octet-stream"),
        filename=meta.get("name", "document"),
    )


@router.post("/{document_id}/commands")
@limiter.limit(rate)
async def apply_document_commands(
    request: Request,
    document_id: str,
    body: CommandsRequest,
    api_key_id: Annotated[str, Depends(require_api_key)],
) -> dict:
    """Apply commands to document. Supported for docx: replaceText, insertParagraph."""
    meta = get_document(document_id)
    _check_owner(meta, api_key_id, document_id)
    content = read_document(document_id)
    mime = meta.get("mime_type", "")
    try:
        new_content = apply_commands(content, mime, commands_to_dicts(body.commands))
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e)) from e
    write_document(document_id, new_content)
    update_document_size(document_id, len(new_content))
    logger.info("Applied %d commands to document %s", len(body.commands), document_id)
    return {"document_id": document_id, "size": len(new_content)}


@router.delete("/{document_id}", status_code=204)
@limiter.limit(rate)
async def delete_doc(
    request: Request,
    document_id: str,
    api_key_id: Annotated[str, Depends(require_api_key)],
) -> None:
    """Delete document and its file."""
    meta = get_document(document_id)
    _check_owner(meta, api_key_id, document_id)
    deleted = delete_document_metadata(document_id)
    delete_document_file(document_id)
    if not deleted:
        raise HTTPException(status_code=404, detail="Document not found")
35
services/local-office/app/api/schemas.py
Normal file
@ -0,0 +1,35 @@
"""Request/response schemas."""
from typing import Any

from pydantic import BaseModel, Field


class CommandItem(BaseModel):
    """One command: type + params."""
    type: str = Field(..., description="replaceText | insertParagraph | ...")
    search: str | None = None
    replace: str | None = None
    text: str | None = None
    position: str | None = None


class CommandsRequest(BaseModel):
    """Body for POST /documents/:id/commands."""
    commands: list[CommandItem] = Field(..., min_length=1)


def commands_to_dicts(items: list[CommandItem]) -> list[dict[str, Any]]:
    """Convert to list of dicts for engine (only non-null fields)."""
    out: list[dict[str, Any]] = []
    for c in items:
        d: dict[str, Any] = {"type": c.type}
        if c.search is not None:
            d["search"] = c.search
        if c.replace is not None:
            d["replace"] = c.replace
        if c.text is not None:
            d["text"] = c.text
        if c.position is not None:
            d["position"] = c.position
        out.append(d)
    return out
29
services/local-office/app/auth.py
Normal file
@ -0,0 +1,29 @@
"""API key authentication. No fallback: missing or invalid key returns 401."""
import logging
from typing import Annotated

from fastapi import Header, HTTPException

from app.config import get_api_keys

logger = logging.getLogger(__name__)

HEADER = "X-API-Key"


def _valid_keys() -> list[str]:
    return get_api_keys()


def require_api_key(
    x_api_key: Annotated[str | None, Header(alias=HEADER)] = None,
) -> str:
    """Dependency: validate X-API-Key header and return the key id (same value)."""
    if not x_api_key or not x_api_key.strip():
        logger.warning("Missing %s header", HEADER)
        raise HTTPException(status_code=401, detail="Missing API key")
    key = x_api_key.strip()
    if key not in _valid_keys():
        logger.warning("Invalid API key attempt")
        raise HTTPException(status_code=401, detail="Invalid API key")
    return key
46
services/local-office/app/config.py
Normal file
@ -0,0 +1,46 @@
"""Load configuration from environment. No secrets in repo."""
import os
from pathlib import Path

from dotenv import load_dotenv

load_dotenv()


def _env(key: str, default: str | None = None) -> str:
    val = os.environ.get(key)
    if val is not None:
        return val.strip()
    if default is not None:
        return default
    raise ValueError(f"Missing required env: {key}")


def get_api_keys() -> list[str]:
    """Comma-separated API keys. At least one required."""
    raw = _env("API_KEYS", "")
    if not raw:
        raise ValueError("API_KEYS must be set (comma-separated list)")
    return [k.strip() for k in raw.split(",") if k.strip()]


def get_storage_path() -> Path:
    """Directory for document files."""
    p = Path(_env("STORAGE_PATH", "./data/files"))
    return p.resolve()


def get_database_path() -> Path:
    """SQLite database path for metadata."""
    p = Path(_env("DATABASE_PATH", "./data/local_office.db"))
    return p.resolve()


def get_max_upload_bytes() -> int:
    """Max upload size in bytes."""
    return int(_env("MAX_UPLOAD_BYTES", "20971520"))


def get_rate_limit_per_minute() -> int:
    """Rate limit per API key per minute."""
    return int(_env("RATE_LIMIT_PER_MINUTE", "60"))
1
services/local-office/app/engine/__init__.py
Normal file
@ -0,0 +1 @@
# Command engine for Office documents
services/local-office/app/engine/commands.py (new file, 25 lines)
@@ -0,0 +1,25 @@
"""Command runner: dispatch by mime type to docx/xlsx/pptx engine. No fallback."""
import logging
from typing import Any

from app.engine.docx_editor import apply_commands_docx

logger = logging.getLogger(__name__)

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
PPTX_MIME = "application/vnd.openxmlformats-officedocument.presentationml.presentation"


def apply_commands(content: bytes, mime_type: str, commands: list[dict[str, Any]]) -> bytes:
    """
    Apply commands to document content. Returns new content.
    Raises ValueError for unknown mime or command type.
    """
    if mime_type == DOCX_MIME:
        return apply_commands_docx(content, commands)
    if mime_type == XLSX_MIME:
        raise ValueError("xlsx commands not implemented yet")
    if mime_type == PPTX_MIME:
        raise ValueError("pptx commands not implemented yet")
    raise ValueError(f"Unsupported mime type for commands: {mime_type}")
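The if-chain in `apply_commands` is fine at three mime types; once xlsx and pptx engines land, a registry keyed by mime type keeps dispatch to one lookup. A self-contained sketch of that alternative — the placeholder handler is hypothetical, standing in for `apply_commands_docx`:

```python
from typing import Any, Callable

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

# Hypothetical stand-in handler; the real service would wire apply_commands_docx here.
def _docx_handler(content: bytes, commands: list[dict[str, Any]]) -> bytes:
    return content  # placeholder: the real handler edits the document

# Registry keyed by mime type; adding xlsx/pptx later is one entry, not a new branch.
_HANDLERS: dict[str, Callable[[bytes, list[dict[str, Any]]], bytes]] = {
    DOCX_MIME: _docx_handler,
}

def apply_commands(content: bytes, mime_type: str, commands: list[dict[str, Any]]) -> bytes:
    """Dispatch by mime type; unknown types raise, preserving the no-fallback rule."""
    handler = _HANDLERS.get(mime_type)
    if handler is None:
        raise ValueError(f"Unsupported mime type for commands: {mime_type}")
    return handler(content, commands)
```

Either shape satisfies the "no fallback" contract; the registry just moves the per-format knowledge out of the dispatcher.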
services/local-office/app/engine/docx_editor.py (new file, 90 lines)
@@ -0,0 +1,90 @@
"""Apply document commands to a docx using python-docx. No fallback: raises on error."""
import io
import logging
from typing import Any

from docx import Document

logger = logging.getLogger(__name__)


def apply_replace_text(doc: Document, search: str, replace: str) -> None:
    """Replace the first occurrence of search in body paragraphs, then table cells.

    Run-level replacement preserves formatting but only works when the search
    string lies within a single run; otherwise the whole paragraph text is
    rewritten, which drops per-run formatting in that paragraph.
    """
    if not search:
        raise ValueError("replaceText: search must be non-empty")
    for paragraph in doc.paragraphs:
        if search in paragraph.text:
            for run in paragraph.runs:
                if search in run.text:
                    run.text = run.text.replace(search, replace, 1)
                    return
            paragraph.text = paragraph.text.replace(search, replace, 1)
            return
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    if search in paragraph.text:
                        for run in paragraph.runs:
                            if search in run.text:
                                run.text = run.text.replace(search, replace, 1)
                                return
                        paragraph.text = paragraph.text.replace(search, replace, 1)
                        return
    logger.warning("replaceText: search string not found: %s", repr(search[:50]))


def apply_insert_paragraph(
    doc: Document,
    text: str,
    position: str = "end",
) -> None:
    """Insert a paragraph. position: 'start' or 'end' (default)."""
    # Validate before mutating, so an invalid position leaves the document untouched.
    if position not in ("start", "end"):
        raise ValueError("insertParagraph: position must be 'start' or 'end'")
    new_para = doc.add_paragraph(text)
    if position == "start" and len(doc.paragraphs) > 1:
        # Move the new paragraph to the start (python-docx always adds at end)
        body = doc.element.body
        new_el = new_para._element
        body.remove(new_el)
        body.insert(0, new_el)


def load_docx(content: bytes) -> Document:
    """Load docx from bytes."""
    return Document(io.BytesIO(content))


def save_docx(doc: Document) -> bytes:
    """Save docx to bytes."""
    buf = io.BytesIO()
    doc.save(buf)
    buf.seek(0)
    return buf.read()


def apply_commands_docx(content: bytes, commands: list[dict[str, Any]]) -> bytes:
    """
    Apply a list of commands to docx content. Returns new content.
    Commands: { "type": "replaceText", "search": "...", "replace": "..." }
              { "type": "insertParagraph", "text": "...", "position": "end"|"start" }
    """
    doc = load_docx(content)
    for cmd in commands:
        ctype = cmd.get("type")
        if ctype == "replaceText":
            apply_replace_text(
                doc,
                search=str(cmd.get("search", "")),
                replace=str(cmd.get("replace", "")),
            )
        elif ctype == "insertParagraph":
            apply_insert_paragraph(
                doc,
                text=str(cmd.get("text", "")),
                position=str(cmd.get("position", "end")),
            )
        else:
            raise ValueError(f"Unknown command type: {ctype}")
    return save_docx(doc)
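The paragraph-level fallback in `apply_replace_text` exists because Word splits a paragraph's text into runs at every formatting boundary, and a placeholder can end up straddling two runs. A self-contained illustration with plain strings standing in for python-docx `Run` objects:

```python
# Plain strings stand in for python-docx Run objects (illustration only).
runs = ["Dear ", "{{NAME}}", ", welcome"]  # split at formatting boundaries
search = "{{NAME}}"

# Case 1: the search string fits inside one run -> replace there, formatting kept.
hit = next((i for i, r in enumerate(runs) if search in r), None)
print(hit)  # 1

# Case 2: Word split the placeholder across runs -> no single run contains it.
runs2 = ["Dear {{NA", "ME}}, welcome"]
assert search in "".join(runs2)             # visible in paragraph.text...
assert all(search not in r for r in runs2)  # ...but in no individual run,
# which is exactly when docx_editor falls back to rewriting paragraph.text.
```

That fallback trades formatting fidelity for correctness; callers who need formatting preserved should keep placeholders unformatted so Word stores them in a single run.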
services/local-office/app/limiter.py (new file, 18 lines)
@@ -0,0 +1,18 @@
"""Rate limiter keyed by X-API-Key (or IP if missing). Used by main and routes."""
from app.config import get_rate_limit_per_minute
from slowapi import Limiter
from slowapi.util import get_remote_address


def _key_func(request) -> str:
    api_key = request.headers.get("X-API-Key") or get_remote_address(request)
    return str(api_key)


limiter = Limiter(key_func=_key_func)


def rate_limit_string() -> str:
    """e.g. '60/minute' from config."""
    n = get_rate_limit_per_minute()
    return f"{n}/minute"
services/local-office/app/main.py (new file, 54 lines)
@@ -0,0 +1,54 @@
"""FastAPI app: CORS, rate limit, router. No fallback on auth."""
import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from slowapi import _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

from app.api.routes import documents
from app.auth import require_api_key  # noqa: F401  (auth is applied per-route)
from app.limiter import limiter
from app.storage.metadata import init_db

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
)
logger = logging.getLogger(__name__)


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Init DB on startup."""
    init_db()
    logger.info("Storage initialized")
    yield
    logger.info("Shutdown")


app = FastAPI(
    title="Local Office API",
    description="API for third-party apps to upload and edit Office documents",
    lifespan=lifespan,
)

# Note: browsers ignore Access-Control-Allow-Credentials when the allowed
# origin is the wildcard; tighten allow_origins before relying on cookies.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

app.include_router(
    documents.router,
    prefix="/documents",
    tags=["documents"],
    dependencies=[],  # auth done per-route to allow public docs later if needed
)
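The `lifespan` context manager above runs its pre-`yield` code once at startup and its post-`yield` code at shutdown. A stdlib-only sketch of that ordering, simulating FastAPI's enter/exit with a plain `async with` (no FastAPI import; illustrative only):

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []

@asynccontextmanager
async def lifespan(app: object):
    # Everything before `yield` runs at startup (init_db in main.py)...
    events.append("startup")
    yield
    # ...everything after `yield` runs at shutdown.
    events.append("shutdown")

async def serve() -> None:
    # FastAPI enters the context on boot and exits it on termination;
    # simulated here with a plain `async with`.
    async with lifespan(app=None):
        events.append("handling requests")

asyncio.run(serve())
print(events)  # ['startup', 'handling requests', 'shutdown']
```

This is why `init_db()` is guaranteed to have completed before the first request is handled, without any lazy-init branches in the route handlers.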
services/local-office/app/storage/__init__.py (new file, 1 line)
@@ -0,0 +1 @@
# Storage: file store and metadata
services/local-office/app/storage/file_storage.py (new file, 48 lines)
@@ -0,0 +1,48 @@
"""Local file storage for documents. Path derived from document_id."""
import logging
from pathlib import Path

from app.config import get_storage_path

logger = logging.getLogger(__name__)


def _ensure_storage() -> Path:
    root = get_storage_path()
    root.mkdir(parents=True, exist_ok=True)
    return root


def get_document_path(document_id: str) -> Path:
    """Path to the file for document_id. The storage root is created if missing."""
    root = _ensure_storage()
    return root / document_id


def write_document(document_id: str, content: bytes) -> None:
    """Write document bytes. Overwrites if exists."""
    path = get_document_path(document_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    logger.info("Wrote document %s (%d bytes)", document_id, len(content))


def read_document(document_id: str) -> bytes:
    """Read document bytes. Raises FileNotFoundError if missing."""
    path = get_document_path(document_id)
    if not path.is_file():
        raise FileNotFoundError(f"Document not found: {document_id}")
    return path.read_bytes()


def delete_document_file(document_id: str) -> None:
    """Remove document file. No-op if missing."""
    path = get_document_path(document_id)
    if path.is_file():
        path.unlink()
        logger.info("Deleted document file %s", document_id)


def document_file_exists(document_id: str) -> bool:
    """Return True if file exists."""
    return get_document_path(document_id).is_file()
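`root / document_id` is safe here because document ids are server-generated UUIDs. If ids ever came from clients, joining them into a path would open a traversal hole; a guard like the following would then be needed. This helper is hypothetical, not part of the service:

```python
import uuid
from pathlib import Path

def safe_document_path(root: Path, document_id: str) -> Path:
    """Reject ids that are not UUIDs, so an id like '../../etc/passwd'
    can never escape the storage root. Hypothetical guard, not in the service."""
    try:
        uuid.UUID(document_id)
    except ValueError:
        raise ValueError(f"Invalid document id: {document_id!r}")
    return root / document_id

root = Path("/srv/local-office/data")  # assumed storage root, for illustration
ok = safe_document_path(root, "3f2b8a6e-1c4d-4e9f-9a7b-2d5c8e1f0a3b")
print(ok.name)  # the UUID itself

try:
    safe_document_path(root, "../secrets")
except ValueError:
    print("rejected")
```

Validating the id's shape is cheaper and more robust than trying to normalize or blacklist path components after the join.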
services/local-office/app/storage/metadata.py (new file, 121 lines)
@@ -0,0 +1,121 @@
"""SQLite metadata: document_id, api_key_id, name, mime_type, size, created_at."""
import datetime
import logging
import sqlite3
import uuid
from typing import Any

from app.config import get_database_path

logger = logging.getLogger(__name__)

TABLE = """
CREATE TABLE IF NOT EXISTS documents (
    document_id TEXT PRIMARY KEY,
    api_key_id TEXT NOT NULL,
    name TEXT NOT NULL,
    mime_type TEXT NOT NULL,
    size INTEGER NOT NULL,
    created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_documents_api_key ON documents(api_key_id);
"""


def _connect() -> sqlite3.Connection:
    db_path = get_database_path()
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    conn.row_factory = sqlite3.Row
    return conn


def init_db() -> None:
    """Create table and index if they do not exist."""
    conn = _connect()
    try:
        conn.executescript(TABLE)
        conn.commit()
    finally:
        conn.close()


def _row_to_dict(row: sqlite3.Row) -> dict[str, Any]:
    return dict(row) if row else {}


def insert_document(
    document_id: str,
    api_key_id: str,
    name: str,
    mime_type: str,
    size: int,
) -> None:
    """Insert one document metadata row."""
    # Timezone-aware now; datetime.utcnow() is deprecated since Python 3.12.
    now = datetime.datetime.now(datetime.timezone.utc).isoformat().replace("+00:00", "Z")
    conn = _connect()
    try:
        conn.execute(
            "INSERT INTO documents (document_id, api_key_id, name, mime_type, size, created_at) VALUES (?,?,?,?,?,?)",
            (document_id, api_key_id, name, mime_type, size, now),
        )
        conn.commit()
    finally:
        conn.close()


def get_document(document_id: str) -> dict[str, Any] | None:
    """Get metadata by document_id."""
    conn = _connect()
    try:
        cur = conn.execute(
            "SELECT document_id, api_key_id, name, mime_type, size, created_at FROM documents WHERE document_id = ?",
            (document_id,),
        )
        row = cur.fetchone()
        return _row_to_dict(row) if row else None
    finally:
        conn.close()


def list_documents(api_key_id: str) -> list[dict[str, Any]]:
    """List documents for an API key."""
    conn = _connect()
    try:
        cur = conn.execute(
            "SELECT document_id, api_key_id, name, mime_type, size, created_at FROM documents WHERE api_key_id = ? ORDER BY created_at DESC",
            (api_key_id,),
        )
        return [_row_to_dict(row) for row in cur.fetchall()]
    finally:
        conn.close()


def update_document_size(document_id: str, size: int) -> bool:
    """Update size for document. Returns True if a row was updated."""
    conn = _connect()
    try:
        cur = conn.execute(
            "UPDATE documents SET size = ? WHERE document_id = ?",
            (size, document_id),
        )
        conn.commit()
        return cur.rowcount > 0
    finally:
        conn.close()


def delete_document_metadata(document_id: str) -> bool:
    """Delete metadata row. Returns True if a row was deleted."""
    conn = _connect()
    try:
        cur = conn.execute("DELETE FROM documents WHERE document_id = ?", (document_id,))
        conn.commit()
        return cur.rowcount > 0
    finally:
        conn.close()


def generate_document_id() -> str:
    """New unique document id (UUID4)."""
    return str(uuid.uuid4())
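The schema and query shape above can be exercised end to end against an in-memory database, without the service's config module — same DDL, same `sqlite3.Row` factory:

```python
import sqlite3
import uuid

# Same schema as metadata.py, run against :memory: for a self-contained round trip.
TABLE = """
CREATE TABLE IF NOT EXISTS documents (
    document_id TEXT PRIMARY KEY,
    api_key_id TEXT NOT NULL,
    name TEXT NOT NULL,
    mime_type TEXT NOT NULL,
    size INTEGER NOT NULL,
    created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_documents_api_key ON documents(api_key_id);
"""

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows behave like dicts, as in _row_to_dict
conn.executescript(TABLE)

doc_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO documents VALUES (?,?,?,?,?,?)",
    (doc_id, "key-1", "report.docx",
     "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
     1234, "2025-01-01T00:00:00Z"),
)
rows = conn.execute(
    "SELECT name, size FROM documents WHERE api_key_id = ?", ("key-1",)
).fetchall()
print([dict(r) for r in rows])  # [{'name': 'report.docx', 'size': 1234}]
```

Because `api_key_id` is indexed, the per-key listing in `list_documents` stays an index scan even as the table grows.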
Some files were not shown because too many files have changed in this diff.