The Process
Choosing a Model
Different models have different strengths:

| Provider | Best for | Notes |
|---|---|---|
| OpenAI | General extraction, high throughput | Reliable, fast, good default |
| Anthropic | Nuanced analysis, long documents | Better at complex reasoning |
| | Large context windows | Good for lengthy documents |
| Ollama | Privacy, local processing | No data leaves your machine |
When analysing large collections, scale up in stages:
- Start small - Test on 2-3 documents first
- Check outputs - Do they match expectations?
- Refine instructions - Tighten schema if needed
- Scale up - Run on full collection
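
The staged approach above can be sketched in a few lines of Python. This is a minimal illustration, not the application's own API: `staged_run`, `analyse`, and `is_acceptable` are hypothetical names, where `analyse` stands in for whatever runs your schema against one document and `is_acceptable` for your own check on the output.

```python
from typing import Callable, List, Sequence, TypeVar

Doc = TypeVar("Doc")
Result = TypeVar("Result")


def staged_run(
    documents: Sequence[Doc],
    analyse: Callable[[Doc], Result],          # hypothetical: runs the schema on one document
    is_acceptable: Callable[[Result], bool],   # hypothetical: your own check on one output
    pilot_size: int = 3,
) -> List[Result]:
    """Analyse a small pilot first; continue to the rest only if the pilot passes."""
    docs = list(documents)
    pilot = docs[:pilot_size]

    # Start small: run on the first few documents only.
    pilot_results = [analyse(doc) for doc in pilot]

    # Check outputs: stop before spending time or API calls on the full collection.
    failed = [doc for doc, res in zip(pilot, pilot_results) if not is_acceptable(res)]
    if failed:
        raise ValueError(f"Pilot failed for {len(failed)} document(s); refine the schema first")

    # Scale up: the pilot passed, so process the remaining documents.
    return pilot_results + [analyse(doc) for doc in docs[pilot_size:]]
```

If the pilot raises, tighten the schema or instructions and run it again; only the pilot documents have been processed at that point.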
Results
After analysis completes, results appear as annotations on your assets. Each annotation contains the structured data your schema extracted.

View results:
- On the asset - See all annotations for a specific document
- Fragments - If curated from a table (see curation), you can see the persistent fragments on the asset detail view
- In dashboards - Aggregate and visualise across the entire run
- Via export - Download as CSV or JSON for external analysis
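
As an example of external analysis, the sketch below tallies the values of one extracted field across a JSON export. The export layout is an assumption made for illustration: a list of annotation objects, each with hypothetical `asset_id` and `data` keys. Check your actual export and adjust the key names to match.

```python
import json
from collections import Counter


def count_field_values(export_json: str, field: str) -> Counter:
    """Count how often each value of `field` appears across exported annotations.

    Assumes (hypothetically) that the export is a JSON list of objects
    shaped like {"asset_id": ..., "data": {...}}.
    """
    annotations = json.loads(export_json)
    return Counter(
        annotation["data"][field]
        for annotation in annotations
        if field in annotation.get("data", {})  # skip annotations without this field
    )


# Stand-in for the contents of a downloaded export file.
sample_export = json.dumps([
    {"asset_id": "doc-1", "data": {"sentiment": "positive"}},
    {"asset_id": "doc-2", "data": {"sentiment": "negative"}},
    {"asset_id": "doc-3", "data": {"sentiment": "positive"}},
])

counts = count_field_values(sample_export, "sentiment")
print(counts.most_common())
```

For a real export, read the downloaded file into a string and pass it in place of `sample_export`.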
Related
Schemas
Define what to extract from documents
Curating Fragments
Promote good extractions to persistent metadata
Dashboards
Visualise results with charts and tables
App Setup
Configure API keys and providers