# agent-regex-search-api

Local HTTP API on **`127.0.0.1`** for **regex search over files** using [ripgrep](https://github.com/BurntSushi/ripgrep) (`rg`). Results are returned as structured JSON.

This is **not** the closed-source “instant grep” index described in Cursor’s article ([Recherche regex rapide](https://cursor.com/fr/blog/fast-regex-search)); it is a **local, open** approach (ripgrep) with the same high-level goal: fast agent-oriented code search. For monorepos at extreme scale, consider adding **Zoekt** or another indexed backend later (see feature doc).

## Prerequisites

- `rg` available in `PATH` (e.g. `sudo apt install ripgrep` on Debian/Ubuntu).

## Environment

| Variable | Required | Description |
|----------|----------|-------------|
| `REGEX_SEARCH_TOKEN` | yes | Token expected as `Authorization: Bearer <token>` on every request except `GET /health`. |
| `REGEX_SEARCH_ROOT` | no | Absolute base directory that searches are confined to (default `/home/ncantu/code`). |
| `REGEX_SEARCH_HOST` | no | Bind address (default `127.0.0.1`). |
| `REGEX_SEARCH_PORT` | no | Port (default `37143`). |

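To make the defaults in the table concrete, here is a minimal sketch of how the variables could be read. The `loadConfig` function, the `SearchApiConfig` shape, and the validation rules are hypothetical and are not taken from this service's source; only the variable names and default values come from the table.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface SearchApiConfig {
  token: string;
  root: string;
  host: string;
  port: number;
}

type Env = Record<string, string | undefined>;

// Read the documented variables, applying the documented defaults.
// Assumption: invalid values are rejected at startup rather than ignored.
function loadConfig(env: Env): SearchApiConfig {
  const token = env.REGEX_SEARCH_TOKEN;
  if (!token) {
    throw new Error("REGEX_SEARCH_TOKEN is required");
  }
  const root = env.REGEX_SEARCH_ROOT ?? "/home/ncantu/code";
  if (!root.startsWith("/")) {
    throw new Error("REGEX_SEARCH_ROOT must be an absolute path");
  }
  const port = Number(env.REGEX_SEARCH_PORT ?? "37143");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("REGEX_SEARCH_PORT must be an integer between 1 and 65535");
  }
  return { token, root, host: env.REGEX_SEARCH_HOST ?? "127.0.0.1", port };
}
```

In a Node process this would typically be called as `loadConfig(process.env)`.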
## Endpoints

- `GET /health`: liveness check; the response includes the configured `root` path.
- `POST /search`: takes a JSON body with:
  - `pattern` (string, required): Rust regex passed to ripgrep.
  - `subpath` (string, optional): path **relative** to `REGEX_SEARCH_ROOT` (no `..`, no absolute paths).
  - `maxMatches` (number, optional): cap on matches (default `500`, max `50000`).
  - `timeoutMs` (number, optional): kill `rg` after this many ms (default `60000`, max `300000`).

Response: `{ root, target, matches: [{ path, lineNumber, line }], truncated, exitCode }`.

Ripgrep exit code `1` means “no matches”; such a search is still returned as HTTP **200** with an empty `matches` array, provided no other error occurred.

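The constraints on the `POST /search` body can be sketched as a small normalization step. This is an illustration under assumptions, not the service's actual code: `normalizeSearch` and `clamp` are hypothetical names, out-of-range numbers are clamped here (the real service may reject them instead), and only POSIX-style absolute paths are checked.

```typescript
interface SearchRequest {
  pattern: string;
  subpath?: string;
  maxMatches?: number;
  timeoutMs?: number;
}

interface NormalizedSearch {
  pattern: string;
  subpath: string;
  maxMatches: number;
  timeoutMs: number;
}

// Clamp an optional numeric field into [1, max], using the default when absent.
function clamp(value: number | undefined, fallback: number, max: number): number {
  if (value === undefined || !Number.isFinite(value)) return fallback;
  return Math.min(Math.max(Math.floor(value), 1), max);
}

// Hypothetical helper: applies the documented defaults and limits.
function normalizeSearch(body: SearchRequest): NormalizedSearch {
  if (typeof body.pattern !== "string" || body.pattern.length === 0) {
    throw new Error("pattern is required");
  }
  const subpath = body.subpath ?? "";
  // Reject absolute paths and any ".." segment so the search stays under the root.
  const segments = subpath.split(/[\\/]+/);
  if (subpath.startsWith("/") || subpath.startsWith("\\") || segments.includes("..")) {
    throw new Error("subpath must be relative and must not contain '..'");
  }
  return {
    pattern: body.pattern,
    subpath,
    maxMatches: clamp(body.maxMatches, 500, 50000),
    timeoutMs: clamp(body.timeoutMs, 60000, 300000),
  };
}
```

A string check like this does not follow symlinks, so a server would likely also resolve the final path and confirm it is still inside `REGEX_SEARCH_ROOT`.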
## Run

```bash
npm install
npm run build
export REGEX_SEARCH_TOKEN='…'
npm start
```

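Once the server is running, a client call might look like the sketch below. `buildSearchCall` and `describeResult` are hypothetical helpers, the field types in `SearchResponse` (for example `truncated` as a boolean) are inferred from the documented response shape rather than confirmed, and the `Content-Type` header is an assumption about what the server expects.

```typescript
interface SearchMatch {
  path: string;
  lineNumber: number;
  line: string;
}

// Assumed types for the documented response fields.
interface SearchResponse {
  root: string;
  target: string;
  matches: SearchMatch[];
  truncated: boolean;
  exitCode: number;
}

// Build the URL and request options for POST /search.
function buildSearchCall(baseUrl: string, token: string, pattern: string, subpath?: string) {
  return {
    url: `${baseUrl}/search`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(subpath === undefined ? { pattern } : { pattern, subpath }),
    },
  };
}

// Summarize a response; exit code 1 with no matches is a normal "nothing found".
function describeResult(res: SearchResponse): string {
  if (res.exitCode === 1 && res.matches.length === 0) return "no matches";
  const suffix = res.truncated ? " (truncated)" : "";
  return `${res.matches.length} match(es) under ${res.target}${suffix}`;
}

// Usage with the built-in fetch of recent Node versions:
//   const { url, init } = buildSearchCall("http://127.0.0.1:37143", token, "TODO");
//   const data = (await (await fetch(url, init)).json()) as SearchResponse;
//   console.log(describeResult(data));
```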
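## Risks

- **ReDoS / expensive patterns**: ripgrep's default Rust regex engine avoids catastrophic backtracking, but broad or complex patterns over large trees can still burn CPU until `timeoutMs`. Keep timeouts conservative on shared hosts.
- **Scope**: every readable file under `target` that ripgrep traverses may be searched and returned in results; set `REGEX_SEARCH_ROOT` in line with your access policy.
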
## License

MIT.