
Schemas

Define analysis tasks in natural language. Apply them at scale.

What Are Schemas?

Schemas are reusable analysis templates written in plain English. You define what to extract, and the AI applies those instructions consistently across documents. Example: a media bias analysis schema:

Source Type

Type: Text
Extract: Primary source - government, activist, expert, or anonymous

Emotional Intensity

Type: Number (1-10 scale)
Extract: How emotionally charged is the language?

Framing

Type: Text
Extract: Issue framing - economic, security, moral, or procedural

Geographic Scope

Type: Text
Extract: Geographic scope - local, national, or international
Result: Comparison of how different outlets frame the same stories.
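Concretely, each field in the schema above becomes a key in the structured output for every document analysed. A hypothetical record for a single article might look like this (the field names and values are invented for illustration):

```json
{
  "source_type": "government",
  "emotional_intensity": 7,
  "framing": "security",
  "geographic_scope": "national"
}
```

Because every article produces a record with the same keys, outputs from different outlets can be compared side by side.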

Creating Your First Schema

Step 1: Define Your Questions

Start with the analytical questions you want to answer:
  • What policy positions are mentioned?
  • How do different sources frame the same issue?
  • What stakeholders are involved?
  • What’s the timeline for implementation?

Step 2: Write Clear Instructions

Good instructions:

Department

Type: Text
Extract: Department or agency name

Amount (Millions)

Type: Number
Extract: Amount in millions of dollars

Change Type

Type: Text
Extract: Change from previous year - increase, decrease, or same
Avoid vague instructions:

❌ Financial Info

Type: Text
Extract: Extract important financial information
Too vague - AI won’t know what specific information to extract

Step 3: Choose Data Types

Text (string): Categories, descriptions, names, locations
Numbers: Ratings, amounts, counts, scales
Lists (array): Multiple items, themes, categories
Dates: Timestamps, deadlines, events
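As a rough mental model, the four data types map naturally onto JSON Schema types, which is the format commonly passed to AI models for structured output. This is an illustrative sketch, not the tool's actual internal format; the `TYPE_MAP` and `build_json_schema` names are assumptions.

```python
# Illustrative only: how the four schema field types might map to a
# JSON Schema object handed to an AI model. Names are hypothetical.
TYPE_MAP = {
    "text": {"type": "string"},
    "number": {"type": "number"},
    "list": {"type": "array", "items": {"type": "string"}},
    "date": {"type": "string", "format": "date"},
}

def build_json_schema(fields: dict) -> dict:
    """Turn {field_name: field_type} into a JSON Schema object."""
    return {
        "type": "object",
        "properties": {name: TYPE_MAP[ftype] for name, ftype in fields.items()},
        "required": list(fields),
    }

schema = build_json_schema({
    "department": "text",
    "amount_millions": "number",
    "change_type": "text",
})
```

Choosing the right type up front keeps downstream analysis simple: numbers can be averaged, lists can be counted, and dates can be sorted.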

Data Type Examples

Text Example

Source Type
Type: Text
Extract: News source - mainstream, alternative, government, or independent

List Example

Framing Categories
Type: List
Extract: Issue framing - can include multiple: economic, security, moral, procedural

Number Example

Security Stance
Type: Number (1-10 scale)
Extract: Immigration position (1=pro-immigration, 10=security-focused)

Step 4: Test and Refine

Always test on 2-3 sample documents first:
  1. Check output quality matches expectations
  2. Identify gaps in extracted data
  3. Refine instructions based on results
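A quick way to check output quality on your sample documents is to verify that each extracted field is present and has the expected type. The helper below is a hypothetical sketch of that check, not a feature of the platform:

```python
# Hypothetical sketch: flag fields that are missing or have the wrong type
# in a single extracted record. `check_types` is not a platform API.
def check_types(record: dict, expected: dict) -> list:
    """Return names of fields whose value is missing or has the wrong type."""
    return [name for name, t in expected.items()
            if not isinstance(record.get(name), t)]

problems = check_types(
    {"department": "Education", "amount_millions": "12"},  # amount came back as text
    {"department": str, "amount_millions": (int, float)},
)
# problems == ["amount_millions"] -> instruction needs to specify a numeric value
```

If a field repeatedly comes back as the wrong type, that usually means the instruction needs a more explicit unit or format.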

How It Works

Pattern: Domain expertise → Natural language instructions → Structured JSON output

Benefits:
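The pattern can be sketched as a single prompt-building step: your plain-English instructions are combined with the document text into one request that asks the model for JSON. This is a simplified illustration; the function and field names are assumptions, not the platform's internals.

```python
# Simplified sketch of the pattern: schema instructions + document -> prompt.
# `build_prompt` is illustrative, not a platform function.
def build_prompt(fields: dict, document: str) -> str:
    """Render schema fields as extraction instructions plus the document."""
    lines = [f"- {name}: {instruction}" for name, instruction in fields.items()]
    return (
        "Extract the following fields and respond with JSON only:\n"
        + "\n".join(lines)
        + "\n\nDocument:\n"
        + document
    )

prompt = build_prompt(
    {"source_type": "Primary source - government, activist, expert, or anonymous",
     "emotional_intensity": "How emotionally charged is the language? (1-10)"},
    "Officials announced the new policy today...",
)
```

Because the instructions travel with every request, the same schema produces the same kind of output for every document, which is what makes the method reproducible and transparent.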
  • No coding required
  • Reproducible results
  • Transparent methodology
  • Continuously improvable

Running Analysis

1

Select Content

Choose individual assets or entire bundles to analyse
2

Pick Schema

Select your analysis template from available schemas
3

Configure Settings

Choose AI model and processing options
4

Run Analysis

Process all selected content with your schema
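Conceptually, the run step is a loop: every selected asset is analysed with the same schema, producing one structured record per document. The sketch below uses a stand-in `analyse` function in place of the real model call:

```python
# Conceptual sketch of the run loop. `analyse` is a stand-in for the
# AI model call the platform makes; its output shape is invented.
def analyse(document: str) -> dict:
    """Stand-in for the model call; returns a fixed-shape record."""
    return {"length": len(document), "excerpt": document[:40]}

def run_analysis(assets: list) -> list:
    """Apply the same analysis to every asset, one record per document."""
    return [analyse(doc) for doc in assets]

results = run_analysis(["First article text...", "Second article text..."])
```

Because each record has the same shape, results from a whole bundle can be loaded straight into a table or dashboard.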

AI Models

Cloud Models: OpenAI GPT, Google Gemini, Anthropic Claude
Local Models: Ollama (Llama, Mistral, Code Llama)

Best Practices

Write Specific Instructions

Good - Clear and specific:

✓ Department

Type: Text
Extract: Department name (e.g., Education, Defense, Health)
Clear examples help AI understand what you want

✓ Budget Amount

Type: Number
Extract: Budget amount in millions of dollars
Specific unit makes extraction consistent
Avoid - Too vague:

❌ Financial Info

Type: Text
Extract: Extract important financial information
AI won’t know which financial details matter to you

Start Simple, Then Expand

  1. Extract basic entities (people, organisations, locations)
  2. Add analysis layers (sentiment, categorisation)
  3. Include complex reasoning (relationships, advanced analysis)

Test Before Scaling

Always test on 2-3 sample documents before processing large batches.

Sharing Schemas

Schema Library: Upload successful schemas and browse others’ work
Transparency: Others can see, critique, and replicate your methodology
Community: Builds cumulative knowledge in the field

Next Steps

1

Create Schema

Start with simple entity extraction and test on sample documents
2

Upload Content

Use Assets & Bundles to organise your documents
3

Run Analysis

Apply your schema and choose appropriate AI models
4

Explore Results

Visualise findings with Analysis Dashboards
5

Scale Up

Create complex schemas and explore Chat & MCP