# langextract-api

Local HTTP API on **`127.0.0.1`** wrapping [google/langextract](https://github.com/google/langextract): structured extractions from unstructured text with optional character grounding.

## Environment

| Variable | Required | Description |
|----------|----------|-------------|
| `LANGEXTRACT_SERVICE_TOKEN` | no | If set, every request must send `Authorization: Bearer <token>`. |
| `LANGEXTRACT_API_HOST` | no | Bind address (default `127.0.0.1`). |
| `LANGEXTRACT_API_PORT` | no | Port (default `37141`). |
| `LANGEXTRACT_API_KEY` | no | Used by LangExtract for cloud models (e.g. Gemini) when the client does not pass `api_key` in the JSON body. See the upstream docs. |

## Endpoints

- `GET /health` — liveness check.
- `POST /extract` — run an extraction. The JSON body matches the [LangExtract](https://github.com/google/langextract) `extract()` parameters where applicable: `text`, `prompt_description`, `examples`, `model_id`, optional `model_url` (Ollama), `extraction_passes`, `max_workers`, `max_char_buffer`, `api_key`, `fence_output`, `use_schema_constraints`.

Example `examples` item:

```json
{
  "text": "ROMEO. But soft!",
  "extractions": [
    {
      "extraction_class": "character",
      "extraction_text": "ROMEO",
      "attributes": {}
    }
  ]
}
```

## Run

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
export LANGEXTRACT_SERVICE_TOKEN='…'
uvicorn app.main:app --host "${LANGEXTRACT_API_HOST:-127.0.0.1}" --port "${LANGEXTRACT_API_PORT:-37141}"
```

For Ollama-backed models, set `model_id` to your local tag (e.g. `gemma2:2b`), set `model_url` to `http://127.0.0.1:11434`, and typically set `fence_output: false` and `use_schema_constraints: false`, per the upstream README.

## License

This wrapper is MIT-licensed. LangExtract itself is Apache-2.0 (see the upstream repository).
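## Example client

A minimal client sketch for `POST /extract`, assembling a request body from the parameter names listed under Endpoints. The `model_id` value and the sample `text`/`prompt_description` strings are illustrative, not prescribed by this API; the commented-out send step assumes the server from the Run section is listening on the default host and port.

```python
import json


def build_extract_payload(text, prompt_description, examples,
                          model_id="gemini-2.5-flash"):
    """Assemble a JSON body for POST /extract.

    Field names follow the Endpoints section; model_id here is an
    illustrative default, not one mandated by the service.
    """
    return {
        "text": text,
        "prompt_description": prompt_description,
        "examples": examples,
        "model_id": model_id,
    }


payload = build_extract_payload(
    text="JULIET. O Romeo, Romeo!",
    prompt_description="Extract characters in order of appearance.",
    examples=[
        {
            "text": "ROMEO. But soft!",
            "extractions": [
                {
                    "extraction_class": "character",
                    "extraction_text": "ROMEO",
                    "attributes": {},
                }
            ],
        }
    ],
)
body = json.dumps(payload)
print(body)

# To actually send the request (server must be running; include the
# Authorization header only if LANGEXTRACT_SERVICE_TOKEN is set):
#
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:37141/extract",
#     data=body.encode(),
#     headers={
#         "Content-Type": "application/json",
#         "Authorization": "Bearer <token>",
#     },
# )
# print(urllib.request.urlopen(req).read().decode())
```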