# langextract-api

Local HTTP API on **`127.0.0.1`** wrapping [google/langextract](https://github.com/google/langextract): structured extractions from unstructured text with optional character grounding.
## Environment

| Variable | Required | Description |
|----------|----------|-------------|
| `LANGEXTRACT_SERVICE_TOKEN` | no | If set, every request must send `Authorization: Bearer <token>`. |
| `LANGEXTRACT_API_HOST` | no | Bind address (default `127.0.0.1`). |
| `LANGEXTRACT_API_PORT` | no | Port (default `37141`). |
| `LANGEXTRACT_API_KEY` | no | Used by LangExtract for cloud models (e.g. Gemini) when the client does not pass `api_key` in the JSON body. See upstream docs. |
## Endpoints

- `GET /health`: liveness check.
- `POST /extract`: run an extraction. The JSON body matches the [LangExtract](https://github.com/google/langextract) `extract()` parameters where applicable: `text`, `prompt_description`, `examples`, `model_id`, optional `model_url` (Ollama), `extraction_passes`, `max_workers`, `max_char_buffer`, `api_key`, `fence_output`, `use_schema_constraints`.

Example `examples` item:
```json
{
  "text": "ROMEO. But soft!",
  "extractions": [
    {
      "extraction_class": "character",
      "extraction_text": "ROMEO",
      "attributes": {}
    }
  ]
}
```
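As an illustration of the request shape, the following minimal Python sketch builds a `POST /extract` call with the standard library only. It assumes the service listens on the default `127.0.0.1:37141`. The helper name `build_extract_request`, the sample prompt, and the model id are illustrative placeholders, not part of this wrapper. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request

# Assumed default bind address and port from the Environment table.
BASE_URL = "http://127.0.0.1:37141"


def build_extract_request(text, prompt_description, examples,
                          model_id, token=None, base_url=BASE_URL):
    """Build (but do not send) a POST /extract request.

    Hypothetical helper: the field names follow the body parameters
    listed under Endpoints; everything else is illustrative.
    """
    body = {
        "text": text,
        "prompt_description": prompt_description,
        "examples": examples,
        "model_id": model_id,
    }
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when LANGEXTRACT_SERVICE_TOKEN is set on the server.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url}/extract",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# One few-shot example, shaped like the `examples` item shown above.
example = {
    "text": "ROMEO. But soft!",
    "extractions": [
        {"extraction_class": "character",
         "extraction_text": "ROMEO",
         "attributes": {}}
    ],
}

request = build_extract_request(
    text="JULIET. O Romeo, Romeo!",
    prompt_description="Extract character names.",  # illustrative prompt
    examples=[example],
    model_id="gemini-2.5-flash",  # placeholder model id
)
# To send it against a running service:
#     with urllib.request.urlopen(request) as resp:
#         result = json.load(resp)
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing it at a live model.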
## Run

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
export LANGEXTRACT_SERVICE_TOKEN='…'
uvicorn app.main:app --host "${LANGEXTRACT_API_HOST:-127.0.0.1}" --port "${LANGEXTRACT_API_PORT:-37141}"
```
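A client can derive the service address from the same variables and defaults that the `uvicorn` command uses. The sketch below is an illustration under assumptions: `service_base_url` and `check_health` are hypothetical helper names, and any HTTP 200 from `GET /health` is treated as healthy because the response body is not specified here.

```python
import os
import urllib.request


def service_base_url(env=None):
    """Resolve the base URL with the same defaults as the run command."""
    env = os.environ if env is None else env
    # Like ${VAR:-default} in the shell, fall back when unset or empty.
    host = env.get("LANGEXTRACT_API_HOST") or "127.0.0.1"
    port = env.get("LANGEXTRACT_API_PORT") or "37141"
    return f"http://{host}:{port}"


def check_health(base_url, token=None, timeout=2.0):
    """Return True if GET /health answers with HTTP 200 (assumed contract)."""
    request = urllib.request.Request(f"{base_url}/health")
    if token:
        # When a service token is set, every request must carry it.
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Connection refused, timeout, or HTTP error: report as not healthy.
        return False
```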
For Ollama-backed models, set `model_id` to your model tag (e.g. `gemma2:2b`) and `model_url` to `http://127.0.0.1:11434`. Per the upstream README, you will typically also want `fence_output: false` and `use_schema_constraints: false`.
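The Ollama settings described here translate into a request body along the following lines. This is a sketch only: `ollama_extract_body` is a hypothetical helper, the prompt is illustrative, and the defaults simply mirror the values named in this section.

```python
import json


def ollama_extract_body(text, prompt_description, examples,
                        model_id="gemma2:2b",
                        model_url="http://127.0.0.1:11434"):
    """Assemble a POST /extract JSON body for an Ollama-backed model."""
    return {
        "text": text,
        "prompt_description": prompt_description,
        "examples": examples,
        "model_id": model_id,
        "model_url": model_url,
        # Typical settings for Ollama, following the upstream README.
        "fence_output": False,
        "use_schema_constraints": False,
    }


body = ollama_extract_body(
    text="JULIET. O Romeo, Romeo!",
    prompt_description="Extract character names.",  # illustrative prompt
    examples=[{
        "text": "ROMEO. But soft!",
        "extractions": [
            {"extraction_class": "character",
             "extraction_text": "ROMEO",
             "attributes": {}}
        ],
    }],
)
print(json.dumps(body, indent=2))
```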
## License

This wrapper is MIT. LangExtract is Apache-2.0 (see upstream repository).