Nicolas Cantu 088eab84b7 Platform docs, services, ia_dev submodule, smart_ide project config
- Add ia_dev submodule (projects/smart_ide on forge 4nk)
- Document APIs, orchestrator, gateway, local-office, rollout
- Add systemd/scripts layout; relocate setup scripts
- Remove obsolete nginx/enso-only docs from this repo scope
2026-04-03 16:07:58 +02:00

# agent-regex-search-api
Local HTTP API on **`127.0.0.1`** for **regex search over files** using [ripgrep](https://github.com/BurntSushi/ripgrep) (`rg`). Results are returned as structured JSON.
This is **not** the closed-source “instant grep” index described in Cursor's article ([Recherche regex rapide](https://cursor.com/fr/blog/fast-regex-search)); it is a **local, open** approach (ripgrep) with the same high-level goal: fast agent-oriented code search. For monorepos at extreme scale, consider adding **Zoekt** or another indexed backend later (see feature doc).
## Prerequisites
- `rg` available in `PATH` (e.g. `sudo apt install ripgrep` on Debian/Ubuntu).
## Environment
| Variable | Required | Description |
|----------|----------|-------------|
| `REGEX_SEARCH_TOKEN` | yes | Shared secret; clients must send `Authorization: Bearer <token>` on every request except `GET /health`. |
| `REGEX_SEARCH_ROOT` | no | Absolute base directory that searches are confined to (default `/home/ncantu/code`). |
| `REGEX_SEARCH_HOST` | no | Bind address (default `127.0.0.1`). |
| `REGEX_SEARCH_PORT` | no | Port (default `37143`). |
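
The table above can be read as a small configuration contract. The following is a minimal sketch of how such variables might be resolved, not the service's actual code: `loadConfig`, `RegexSearchConfig`, and `Env` are hypothetical names, and the validation rules (absolute root, port range) are assumptions beyond what the table states.

```typescript
// Hypothetical config loader; names and validation rules are illustrative only.
interface RegexSearchConfig {
  token: string;
  root: string;
  host: string;
  port: number;
}

// The environment is passed in explicitly so the function stays pure and testable.
type Env = Record<string, string | undefined>;

function loadConfig(env: Env): RegexSearchConfig {
  // REGEX_SEARCH_TOKEN is the only required variable in the table.
  const token = env["REGEX_SEARCH_TOKEN"];
  if (!token) {
    throw new Error("REGEX_SEARCH_TOKEN is required");
  }

  // Defaults below are the ones documented in the table.
  const root = env["REGEX_SEARCH_ROOT"] ?? "/home/ncantu/code";
  if (!root.startsWith("/")) {
    // Assumption: a relative root is treated as a configuration error.
    throw new Error("REGEX_SEARCH_ROOT must be an absolute path");
  }

  const portText = env["REGEX_SEARCH_PORT"] ?? "37143";
  const port = Number(portText);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid REGEX_SEARCH_PORT: ${portText}`);
  }

  return {
    token,
    root,
    host: env["REGEX_SEARCH_HOST"] ?? "127.0.0.1",
    port,
  };
}
```

In Node, `process.env` could be passed as the `env` argument. Failing fast on a missing token keeps the service from starting without authentication.
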
## Endpoints
- `GET /health` — liveness; includes configured `root` path.
- `POST /search` — JSON body:
  - `pattern` (string, required): Rust regex passed to ripgrep.
  - `subpath` (string, optional): path **relative** to `REGEX_SEARCH_ROOT` (no `..`, no absolute paths).
  - `maxMatches` (number, optional): cap on matches (default `500`, max `50000`).
  - `timeoutMs` (number, optional): kill `rg` after this many ms (default `60000`, max `300000`).

  Response: `{ root, target, matches: [{ path, lineNumber, line }], truncated, exitCode }`.
Ripgrep exit code `1` means “no matches” and is still returned as **200** with an empty `matches` array when no other error occurred.
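
The constraints on the `POST /search` body can be sketched as a normalization step. This is an illustration under stated assumptions, not the service's real handler: `normalizeSearchBody`, `SearchParams`, and `clampInt` are hypothetical names, out-of-range numbers are assumed to be clamped rather than rejected, and paths are assumed to use `/` as separator.

```typescript
// Hypothetical request normalizer; defaults and maxima come from the field list above.
interface SearchParams {
  pattern: string;
  subpath: string; // "" means "search from REGEX_SEARCH_ROOT itself"
  maxMatches: number;
  timeoutMs: number;
}

// Assumption: non-numeric values fall back to the default, and numeric values
// are floored and clamped into [1, max] instead of causing an error.
function clampInt(value: unknown, fallback: number, max: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(Math.max(Math.floor(value), 1), max);
}

function normalizeSearchBody(body: Record<string, unknown>): SearchParams {
  const pattern = body["pattern"];
  if (typeof pattern !== "string" || pattern.length === 0) {
    throw new Error("pattern must be a non-empty string");
  }

  let subpath = "";
  const rawSubpath = body["subpath"];
  if (rawSubpath !== undefined) {
    if (typeof rawSubpath !== "string") {
      throw new Error("subpath must be a string");
    }
    subpath = rawSubpath;
  }
  // No absolute paths and no ".." segments, as the field description requires.
  if (subpath.startsWith("/")) {
    throw new Error("subpath must be relative");
  }
  if (subpath.split("/").some((part) => part === "..")) {
    throw new Error("subpath must not contain '..'");
  }

  return {
    pattern,
    subpath,
    maxMatches: clampInt(body["maxMatches"], 500, 50000),
    timeoutMs: clampInt(body["timeoutMs"], 60000, 300000),
  };
}
```

A purely textual check like this does not account for symlinks under the root; whether ripgrep follows them depends on how `rg` is invoked.
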
## Run
```bash
npm install
npm run build
export REGEX_SEARCH_TOKEN='…'
npm start
```
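
Once the server is running, a client call could look roughly like the sketch below. Only the `/search` path, the `Authorization: Bearer` header, the body fields, and the response field names come from this README; `buildSearchRequest`, `summarize`, and the interfaces are hypothetical, the `Content-Type` header is an assumption, and the response field types are inferred from their names.

```typescript
// Hypothetical client helpers; not part of the package itself.
interface SearchBody {
  pattern: string;
  subpath?: string;
  maxMatches?: number;
  timeoutMs?: number;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Field types are assumptions inferred from the documented response shape.
interface SearchMatch {
  path: string;
  lineNumber: number;
  line: string;
}

interface SearchResponse {
  root: string;
  target: string;
  matches: SearchMatch[];
  truncated: boolean;
  exitCode: number;
}

// Builds the request description; host and port default to the documented values.
function buildSearchRequest(
  token: string,
  search: SearchBody,
  host = "127.0.0.1",
  port = 37143,
): HttpRequest {
  return {
    url: `http://${host}:${port}/search`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      // Assumption: the server expects a JSON content type for the body.
      "Content-Type": "application/json",
    },
    body: JSON.stringify(search),
  };
}

// Renders matches as "path:lineNumber: line", flagging truncated results.
function summarize(res: SearchResponse): string[] {
  const lines = res.matches.map((m) => `${m.path}:${m.lineNumber}: ${m.line}`);
  if (res.truncated) {
    lines.push("(results truncated)");
  }
  return lines;
}
```

With Node 18 or later, the request could be sent through the global `fetch`, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`, and the parsed JSON passed to `summarize`.
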
## Risks
- **ReDoS**: pathological regexes can burn CPU until `timeoutMs`. Keep timeouts conservative for shared hosts.
- **Scope**: all readable files under `target` that ripgrep traverses may be searched; align `REGEX_SEARCH_ROOT` with policy.
## License
MIT.